Re: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08

2017-07-13 Thread Jeff Law
On 07/13/2017 06:37 PM, David Malcolm wrote:
> On Tue, 2017-07-11 at 15:19 -0600, Jeff Law wrote:
> 
> [...]
> 
>> diff --git a/gcc/opts.c b/gcc/opts.c
>> index 7460c2b..61f5bb0 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -2243,6 +2243,19 @@ common_handle_option (struct gcc_options *opts,
>> opts->x_flag_stack_check = STACK_CHECK_BUILTIN
>>                            ? FULL_BUILTIN_STACK_CHECK
>>                            : GENERIC_STACK_CHECK;
>> +  else if (!strcmp (arg, "clash"))
>> +   {
>> + /* This is the stack checking method, designed to prevent
>> +stack-clash attacks.  */
>> + if (!STACK_GROWS_DOWNWARD)
>> +   sorry ("-fstack-check=clash not implemented on this
>> target");
> 
> Minor nitpick: shouldn't options be quoted in diagnostics?
> 
> So this should be:
> 
>   sorry ("%<-fstack-check=clash%> not implemented on this target");
> 
> (or whatever it ends up being called)
Thanks.  Fixed for the next version.

jeff


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08

2017-07-13 Thread Jeff Law
On 07/13/2017 05:37 PM, Segher Boessenkool wrote:
> On Thu, Jul 13, 2017 at 04:38:00PM -0600, Jeff Law wrote:
>> On 07/13/2017 03:32 PM, Segher Boessenkool wrote:
>>>>> -fstack-check=clash is itself not such a super name either.  It's not
>>>>> checking stack, and it isn't clashing: it just does a store to every
>>>>> page the stack will touch (minus the first and last or something).
>>>> Yea.  I don't particularly like it either.  Suggestions?  I considered
>>>> "probe" as well, but "specific" also does probing.  In the end I used
>>>> the part of the marketing name of the exploits.
>>>
>>> I don't think it should be inside -fstack-check at all.  Sure, the
>>> mechanisms implementing it overlap a bit (more on some targets, less
>>> on others), but how will a user ask for clash protection _and_ for
>>> stack checking?
>> The biggest problem with separating them is we would end up with a fair
>> amount of code that ultimately looks like
>>
>> if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
>> || flag_whatever_the_new_thing_is)
>>   {
>> probe the stack
>>   }
> 
> But only in backends, right?  And some backends will be actually
> simpler, afaics.
I suspect we might end up with one in explow.c as well.  But yes, it'd
primarily be buried in the targets.


>> Yup.  It gets baked in even faster than normal in this case I suspect.
>> Red Hat will almost certainly pick up the bits and start backporting
>> them to RHEL 7, RHEL 6 and RHEL 5.  So the flag will be out in the wild
>> before gcc-8 hits the streets.
> 
> Are you planning to backport it for mainline GCC as well?
The hope is we reach a consensus for mainline GCC, then we backport
whatever goes into mainline to the relevant RHEL releases.  The
backports into RHEL would likely happen before gcc-8 hits the streets.

I've got a little time before I have to invert that process and go into
RHEL first.   I *really* want to avoid that for a multitude of reasons.


> 
>> There's -fstack-check and -fstack-clash-protection.  I think with the
>> direction we're going they are fundamentally incompatible because
>> neither the compiler nor kernel do anything to guarantee enough stack is
>> available after hitting the guard for -fstack-clash-protection.
> 
> Hrm, I have to think about that.
Even if the kernel implements the reserved page stuff mentioned earlier
in the thread, I'm not sure when we'd be able to reliably depend on that
capability (or even check for it).

ISTM that -fstack-check continues to be Ada-centric, we have a new
option to deal with stack-clash protection, and the two simply
aren't allowed to be enabled together.
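
For concreteness, the kind of mutual-exclusion check being discussed
might look something like this in common_handle_option -- a hypothetical
sketch only, since neither the -fstack-clash-protection flag name nor
the diagnostic wording is settled at this point in the thread:

  /* Sketch: reject combining any -fstack-check variant with the
     (not yet final) stack-clash protection flag.  */
  if (opts->x_flag_stack_check != NO_STACK_CHECK
      && opts->x_flag_stack_clash_protection)
    error_at (loc, "%<-fstack-check%> and %<-fstack-clash-protection%> "
		   "are mutually exclusive");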


> 
>> And there's shrink wrapping.  This was originally raised by the aarch64
>> guys and deserves some thought.  I'm particularly worried about aarch64
>> if it was to shrink-wrap some particular register saves that we wanted
>> to use as an implicit probe.
> 
> For normal shrink-wrapping, the volatile reg stores will always happen
> in the same prologue that sets up the stack frame as well, so there
> won't be any problem (except the usual stack ties you might need in the
> prologue, etc.) (*)  For separate shrink-wrapping, yeah you'll need any
> bb that could potentially store to the stack require the component for
> the register you use as implicit probe.  This is pretty nasty.
I'm going to try and look at this tomorrow to get a feel for the aarch64
implementation of separate shrink wrapping.

For ppc I'm not too worried -- the implicit backchain probes are so
effective at eliminating explicit probes that I didn't bother writing
any code to track the register saves as implicit probes.   I'm pretty
sure it just works on ppc with separate shrink wrapping.
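
(As a rough mental model -- illustrative C, not the actual rs6000
prologue code -- the backchain requirement means every stack allocation
is itself a store to the new stack bottom, which is what a single
"stdu r1,-SIZE(r1)" achieves:

  char *
  allocate_frame (char *sp, unsigned long size)
  {
    char *new_sp = sp - size;
    *(char **) new_sp = sp;   /* mandated backchain store == implicit probe */
    return new_sp;
  }

So as long as each individual allocation stays within the guard size,
no explicit probe is needed.)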

> 
>>>> We created that alloca'd object at the wrong lexical scope which mucked
>>>> up its expected persistence.  I'm sure I'd spot it trivially once I set
>>>> up the test again.
>>>
>>> That sounds like it might be a bug even.
>> That was my thought as well.  It's c-torture/execute/20071029-1.c.
> 
> Ah cool, thanks for digging it up.
NP.  As expected, it was trivial to slam in the initializer and rerun
the testsuite.  As soon as I saw the failure and looked at the source I
knew I'd found it again.

The other thing to remember about -fstack-check=generic is that once
your frame gets big, it just gives up.  That's why it takes large
automatic objects and turns them into alloca'd objects -- to reduce the
size of the prologue allocated frame.
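
In other words (hand-written illustration, not actual compiler output),
something like

  extern void use (char *);
  void f (void) { char buf[1024 * 1024]; use (buf); }

effectively has its large buffer pulled out of the prologue-allocated
frame and rewritten as

  void f (void) { char *buf = __builtin_alloca (1024 * 1024); use (buf); }

which is exactly the kind of lifetime change that bit the testcase above.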

I looked at -fstack-check=generic just long enough to realize it wasn't
really an option to cover s390.  Then I promptly moved onto more useful
pursuits.




> (*) Related, I know you don't want to scan generated code to see if
> all probes are in place, but it seems you pretty much have to if you
> want to make sure no later pass deletes the probes.  Well, something
> for targets to worry about :-)
Right.   The dumping that's currently done just gives us a view into
what the prologue code wanted to emit.  Passes after prologue 

Re: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08

2017-07-13 Thread David Malcolm
On Tue, 2017-07-11 at 15:19 -0600, Jeff Law wrote:

[...]

> diff --git a/gcc/opts.c b/gcc/opts.c
> index 7460c2b..61f5bb0 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -2243,6 +2243,19 @@ common_handle_option (struct gcc_options *opts,
> opts->x_flag_stack_check = STACK_CHECK_BUILTIN
>                            ? FULL_BUILTIN_STACK_CHECK
>                            : GENERIC_STACK_CHECK;
> +  else if (!strcmp (arg, "clash"))
> +   {
> + /* This is the stack checking method, designed to prevent
> +stack-clash attacks.  */
> + if (!STACK_GROWS_DOWNWARD)
> +   sorry ("-fstack-check=clash not implemented on this
> target");

Minor nitpick: shouldn't options be quoted in diagnostics?

So this should be:

  sorry ("%<-fstack-check=clash%> not implemented on this target");

(or whatever it ends up being called)

[...]


[PATCH/AARCH64] Decimal floating point support for AARCH64

2017-07-13 Thread Andrew Pinski
Hi,
  This patch adds Decimal floating point support to aarch64.  It is
just the base support: since there is no hardware support for DFP, it
only defines the ABI.  The ABI I chose treats _Decimal32 like float,
_Decimal64 like double and _Decimal128 like long double; that is, they
are passed via the floating-point registers (sN, dN, qN), as sketched
below.
Is this ok as an ABI?
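
For illustration (my restatement of the mapping above, not code from
the patch), each decimal type is passed and returned exactly like the
binary floating type of the same size:

  _Decimal32  f32  (_Decimal32 x)  { return x; }   /* like float: sN */
  _Decimal64  f64  (_Decimal64 x)  { return x; }   /* like double: dN */
  _Decimal128 f128 (_Decimal128 x) { return x; }   /* like long double: qN */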

Is the patch ok?  Bootstrapped and tested on aarch64-linux-gnu with
--enable-decimal-float with no regressions and all of the dfp
testcases pass.

Thanks,
Andrew Pinski

gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_split_128bit_move): Handle TDmode.
(aarch64_classify_address): Likewise.
(aarch64_legitimize_address_displacement): Likewise.
(aarch64_legitimize_address): Likewise.
(aarch64_constant_pool_reload_icode): Handle SD, DD, and TD modes.
(aarch64_secondary_reload): Handle TDmode.
(aarch64_valid_floating_const): For decimal floating point return false.
(aarch64_gimplify_va_arg_expr): Handle SD, DD, and TD modes.
(aapcs_vfp_sub_candidate): Likewise.
(aarch64_vfp_is_call_or_return_candidate): Handle MODE_DECIMAL_FLOAT.
(aarch64_scalar_mode_supported_p): For DECIMAL_FLOAT_MODE_P, return
default_decimal_float_supported_p.
* config/aarch64/iterators.md (GPF_TF_F16): Add SD, DD, and TD modes.
(SFD): New iterator.
(DFD): New iterator.
(TFD): New iterator.
(GPF_TF): Add SD, DD, and TD modes.
(TX): Add TD mode.
* config/aarch64/aarch64.md (*movsf_aarch64): Use SFD iterator.
(*movdf_aarch64): Use DFD iterator.
(*movtf_aarch64): Use TFD iterator.
(define_split for TF): Use TFD iterator.


gcc/testsuite/ChangeLog:
* c-c++-common/dfp/pr39986.c: Allow for word instead of just long.

libgcc/ChangeLog:
* config.host (aarch64*-*-elf): Add t-dfprules to tmake_file.
(aarch64*-*-freebsd*): Likewise.
(aarch64*-*-linux*): Likewise.
Index: config/aarch64/aarch64.c
===
--- config/aarch64/aarch64.c	(revision 250186)
+++ config/aarch64/aarch64.c	(working copy)
@@ -1653,7 +1653,7 @@ aarch64_split_128bit_move (rtx dst, rtx
 
   machine_mode mode = GET_MODE (dst);
 
-  gcc_assert (mode == TImode || mode == TFmode);
+  gcc_assert (mode == TImode || mode == TFmode || mode == TDmode);
   gcc_assert (!(side_effects_p (src) || side_effects_p (dst)));
   gcc_assert (mode == GET_MODE (src) || GET_MODE (src) == VOIDmode);
 
@@ -1673,11 +1673,16 @@ aarch64_split_128bit_move (rtx dst, rtx
  emit_insn (gen_aarch64_movtilow_di (dst, src_lo));
  emit_insn (gen_aarch64_movtihigh_di (dst, src_hi));
}
- else
+ else if (mode == TFmode)
{
  emit_insn (gen_aarch64_movtflow_di (dst, src_lo));
  emit_insn (gen_aarch64_movtfhigh_di (dst, src_hi));
}
+ else
+   {
+ emit_insn (gen_aarch64_movtdlow_di (dst, src_lo));
+ emit_insn (gen_aarch64_movtdhigh_di (dst, src_hi));
+   }
  return;
}
   else if (GP_REGNUM_P (dst_regno) && FP_REGNUM_P (src_regno))
@@ -1690,11 +1695,16 @@ aarch64_split_128bit_move (rtx dst, rtx
  emit_insn (gen_aarch64_movdi_tilow (dst_lo, src));
  emit_insn (gen_aarch64_movdi_tihigh (dst_hi, src));
}
- else
+ else if (mode == TFmode)
{
  emit_insn (gen_aarch64_movdi_tflow (dst_lo, src));
  emit_insn (gen_aarch64_movdi_tfhigh (dst_hi, src));
}
+ else if (mode == TDmode)
+   {
+ emit_insn (gen_aarch64_movdi_tdlow (dst_lo, src));
+ emit_insn (gen_aarch64_movdi_tdhigh (dst_hi, src));
+   }
  return;
}
 }
@@ -4420,10 +4430,11 @@ aarch64_classify_address (struct aarch64
   rtx op0, op1;
 
   /* On BE, we use load/store pair for all large int mode load/stores.
- TI/TFmode may also use a load/store pair.  */
+ TI/TF/TDmode may also use a load/store pair.  */
   bool load_store_pair_p = (outer_code == PARALLEL
|| mode == TImode
|| mode == TFmode
+   || mode == TDmode
|| (BYTES_BIG_ENDIAN
&& aarch64_vect_struct_mode_p (mode)));
 
@@ -4473,7 +4484,7 @@ aarch64_classify_address (struct aarch64
  info->base = op0;
  info->offset = op1;
 
- /* TImode and TFmode values are allowed in both pairs of X
+ /* TImode and TFmode and TDmode values are allowed in both pairs of X
 registers and individual Q registers.  The available
 address modes are:
 X,X: 7-bit signed scaled offset
@@ -4482,7 +4493,7 @@ aarch64_classify_address (struct aarch64
 When performing the check for pairs of X registers i.e.  LDP/STP
 pass down DImode since that is the natural size of the LDP/STP
 instruction memory accesses.  */
- if 

Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-13 Thread Segher Boessenkool
On Thu, Jul 13, 2017 at 05:10:33PM -0600, Jeff Law wrote:
>    2. ABI mandates that *sp always contain a backchain pointer (ppc)
> >>>
> >>> In the ELFv2 ABI a backchain is not required.  GCC still always has
> >>> one afaik.  I'll find out more.
> >> Please do.  I was under the impression it was mandated by the earlier
> >> ABIs as well.  If it isn't, then I don't think we can depend on it for
> >> the older ABIs.
> > 
> > I checked most ABIs, and all but ELFv2 require it.  You can assume we
> > require it everywhere (we do assume it currently, and there is no
> > intention to change this).  The statement in the ABI surprised me
> > yesterday, sorry for panicking.
> Y'all are the experts here.  It would be advisable to get the ABI
> documents tweaked if indeed we are going to rely on the existence of the
> backchain as an implicit probe.

Yes, we'll deal with whatever is needed here, don't worry :-)


Segher


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08

2017-07-13 Thread Segher Boessenkool
On Thu, Jul 13, 2017 at 04:38:00PM -0600, Jeff Law wrote:
> On 07/13/2017 03:32 PM, Segher Boessenkool wrote:
> >>> -fstack-check=clash is itself not such a super name either.  It's not
> >>> checking stack, and it isn't clashing: it just does a store to every
> >>> page the stack will touch (minus the first and last or something).
> >> Yea.  I don't particularly like it either.  Suggestions?  I considered
> >> "probe" as well, but "specific" also does probing.  In the end I used
> >> the part of the marketing name of the exploits.
> > 
> > I don't think it should be inside -fstack-check at all.  Sure, the
> > mechanisms implementing it overlap a bit (more on some targets, less
> > on others), but how will a user ask for clash protection _and_ for
> > stack checking?
> The biggest problem with separating them is we would end up with a fair
> amount of code that ultimately looks like
> 
> if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
> || flag_whatever_the_new_thing_is)
>   {
> probe the stack
>   }

But only in backends, right?  And some backends will be actually
simpler, afaics.

> We can certainly do that.  It'll touch a few more places (particularly
> in backends that have checks for Ada stack overflow, but for which I
> haven't written stack clash protection), but the changes would be simple
> and repetitive.

Sounds good :-)

> >> Certainly open to ideas on the interface aspects.
> > 
> > The interface is much harder to change than any aspect of the GCC
> > implementation.  Have to get it right at once, almost.
> Yup.  It gets baked in even faster than normal in this case I suspect.
> Red Hat will almost certainly pick up the bits and start backporting
> them to RHEL 7, RHEL 6 and RHEL 5.  So the flag will be out in the wild
> before gcc-8 hits the streets.

Are you planning to backport it for mainline GCC as well?

> >> It's independent of stack size overflow checking.  They could (in
> >> theory) even co-exist on ports that support stack size overflow
> >> checking, but I didn't test that.
> > 
> > Okay, that was my impression as well.  But that interface won't allow
> > it (unless every -fstack-check=X gets an -fstack-check=X+clash twin).
> THere may have been some mis-communication here, but it's a good place
> to spend some time thinking about interplay between options.
> 
> First, there's -fstack-limit-*.  Essentially there's an RTX which
> defines the limit for the stack pointer.  If you cross that limit a trap
> gets executed.  PPC, s390 and likely other ports have support for this.
> It should interoperate with -fstack-clash-protection and/or -fstack-check.

Yup.

> There's -fstack-check and -fstack-clash-protection.  I think with the
> direction we're going they are fundamentally incompatible because
> neither the compiler nor kernel do anything to guarantee enough stack is
> available after hitting the guard for -fstack-clash-protection.

Hrm, I have to think about that.

> And there's shrink wrapping.  This was originally raised by the aarch64
> guys and deserves some thought.  I'm particularly worried about aarch64
> if it was to shrink-wrap some particular register saves that we wanted
> to use as an implicit probe.

For normal shrink-wrapping, the volatile reg stores will always happen
in the same prologue that sets up the stack frame as well, so there
won't be any problem (except the usual stack ties you might need in the
prologue, etc.) (*)  For separate shrink-wrapping, yeah you'll need any
bb that could potentially store to the stack require the component for
the register you use as implicit probe.  This is pretty nasty.

> >> We created that alloca'd object at the wrong lexical scope which mucked
> >> up its expected persistence.  I'm sure I'd spot it trivially once I set
> >> up the test again.
> > 
> > That sounds like it might be a bug even.
> That was my thought as well.  It's c-torture/execute/20071029-1.c.

Ah cool, thanks for digging it up.


Segher


(*) Related, I know you don't want to scan generated code to see if
all probes are in place, but it seems you pretty much have to if you
want to make sure no later pass deletes the probes.  Well, something
for targets to worry about :-)


Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-07-13 Thread Jeff Law
On 06/28/2017 12:45 AM, Florian Weimer wrote:
> * Richard Earnshaw:
> 
>> I can't help but feel there's a bit of a goode olde mediaeval witch hunt
>> going on here.  As Wilco points out, we can never defend against a
>> function that is built without probe operations but skips the entire
>> guard zone.  The only defence there is a larger guard zone, but how big
>> do you make it?
> 
> Right.  And in the exploitable cases we have seen, there is a
> dynamically sized allocation which the attacker can influence, so it
> seems fairly likely that in a partially hardened binary, there could
> be another call stack which is exploitable, with a non-hardened
> function at the top.
> 
> I think a probing scheme which assumes that if the caller moves the
> stack pointer into more than half of the guard area, that's the
> caller's fault would be totally appropriate in practice.  If possible,
> callee-only probing for its own stack usage is preferable, but not if
> it means instrumenting all functions which use the stack.
That position is a surprise Florian :-)  I would have expected a full
protection position, particularly after the discussions we've had about
noreturn functions.

I guess the difference in your position is driven by the relatively high
frequency of probing that worst-case assumptions are going to impose on
aarch64, versus a relatively small vulnerability surface?   Which is a
fairly stark contrast to the noreturn situation, where it rarely, if ever,
comes up in practice and never on a hot path?

Jeff


Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-13 Thread Jeff Law
On 07/13/2017 04:48 PM, Segher Boessenkool wrote:
> On Thu, Jul 13, 2017 at 11:28:17AM -0600, Jeff Law wrote:
>> On 07/12/2017 04:44 PM, Segher Boessenkool wrote:
>>> On Tue, Jul 11, 2017 at 03:19:36PM -0600, Jeff Law wrote:
 Examples of implicit probes include
>>>
   2. ABI mandates that *sp always contain a backchain pointer (ppc)
>>>
>>> In the ELFv2 ABI a backchain is not required.  GCC still always has
>>> one afaik.  I'll find out more.
>> Please do.  I was under the impression it was mandated by the earlier
>> ABIs as well.  If it isn't, then I don't think we can depend on it for
>> the older ABIs.
> 
> I checked most ABIs, and all but ELFv2 require it.  You can assume we
> require it everywhere (we do assume it currently, and there is no
> intention to change this).  The statement in the ABI surprised me
> yesterday, sorry for panicking.
Y'all are the experts here.  It would be advisable to get the ABI
documents tweaked if indeed we are going to rely on the existence of the
backchain as an implicit probe.  Otherwise we end up in the same
scenario as aarch64 where we have to make some unpleasant assumptions.


> 
>> The code we generate for alloca was so awful it's hard to see how
>> hitting each page once would matter either.  *However* I was looking at
>> x86 in this case and due to potential stack realignments x86's alloca
>> code might be notably worse than others for constant sizes.
> 
> There is generic code that aligns too often, too.  You might be seeing
> that same thing.
Exactly.  It's the generic code that's driven by various macros in the
x86 backend.


> 
>> There's further improvements that could be made as well.   It ought to
>> be possible to write an optimizer pass that uses some of the ideas from
>> DSE and SLSR to identify explicit probes that are made redundant by
>> nearby implicit probes -- this would seem most useful for the dynamic space.
>>
>> The problem is we'd want to do that in gimple, but probing of the
>> dynamic space happens at the gimple/rtl border.  So we'd probably want
>> to make probing happen earlier to expose stuff at the gimple level.
> 
> This would just get rid of one probe per dynamic allocation, correct?
> Doesn't seem worth complicating anything for.
There's enough implicit probes lying around in the IL that I suspect we
could likely prove the first and last are unnecessary on a reasonably
consistent basis.  It didn't seem critical to address at this stage, but
something we could look at later if we feel the need.

The other thing I've pondered lightly would be to attach frame & probe
info to decl nodes, perhaps doing some IPA propagation.

The idea here is that if we have a function that is static to the CU, but
it's not a good inline candidate, we can use information about the callers to
build a less pessimistic state at function entry.  This would likely
only help aarch64 and s390.  It would also fall into something we could
explore in the future if the need arises.


Thanks for all the feedback,
Jeff



Re: [PATCH], PR target/81193, Add warning for using __builtin_cpu_* on old PowerPC GLIBC's

2017-07-13 Thread Segher Boessenkool
On Thu, Jul 13, 2017 at 05:56:07PM -0400, Michael Meissner wrote:
> Given we have the macro __BUILTIN_CPU_SUPPORTS__, I could change the test to
> use that instead of a dg-requires and drop ppc_cpu_supports_hw_available.

Ah that would be nice, good idea!


Segher


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08

2017-07-13 Thread Jeff Law
On 07/12/2017 07:44 PM, Segher Boessenkool wrote:
> On Tue, Jul 11, 2017 at 03:20:12PM -0600, Jeff Law wrote:
>>  * conifg/mips/mips.c (mips_expand_prologue): Likewise.
> 
> Typo ("conifg").
Will fix.

> 
>> --- a/gcc/defaults.h
>> +++ b/gcc/defaults.h
>> @@ -1408,8 +1408,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>  #endif
>>  
>>  /* The default is not to move the stack pointer.  */
>> +/* The default is not to move the stack pointer, unless we are using
>> +   stack clash prevention stack checking.  */
>>  #ifndef STACK_CHECK_MOVING_SP
>> -#define STACK_CHECK_MOVING_SP 0
>> +#define STACK_CHECK_MOVING_SP\
>> +  (flag_stack_check == STACK_CLASH_BUILTIN_STACK_CHECK)
>>  #endif
> 
> Missing space before that backslash.
Similarly.

> 
> The documentation for STACK_CHECK_MOVING_SP needs updating (its default
> is no longer zero, for one).
Yea.  Missed that.  I actually need to go back and look at this again.
I'm not entirely sure it's necessary -- it may be a relic from when I
thought more -fstack-check infrastructure was going to be reusable.


> 
> I don't really see why this is so complicated, and why the rs6000
> target changes (a later patch) are so big.  Why isn't it just simple
> patches to allocate_stack (and the prologue thing), that check the
> flag and if it is set do some probes?
Yea.  I wasn't happy with the size of the rs6000 patches either, which I
mentioned at some point :-)  Some of the complexity is making sure we
keep the backchain pointer correct and trying to do so as efficiently as
possible.  But there's too much conceptual code duplication.

Essentially the code shows up 3 times in slightly different forms.

Once when allocating space in the prologue.  Then again with the
probe_stack_range insn support, then again with the expander to allocate
dynamic stack space.

The prologue always has a known size and we should take advantage of
that, particularly since the prologue code is what needs to be the most
efficient.

Then there's the dynamic space allocation expander.  I think we should
look at trying to do less in the expander and rely more on refactored
code from explow.

The probe_stack_range output routine seems like it ought to be redundant
with something :-)

I'll look at this again and see if there's any good way to refactor and
simplify.

jeff


Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-13 Thread Segher Boessenkool
On Thu, Jul 13, 2017 at 11:28:17AM -0600, Jeff Law wrote:
> On 07/12/2017 04:44 PM, Segher Boessenkool wrote:
> > On Tue, Jul 11, 2017 at 03:19:36PM -0600, Jeff Law wrote:
> >> Examples of implicit probes include
> > 
> >>   2. ABI mandates that *sp always contain a backchain pointer (ppc)
> > 
> > In the ELFv2 ABI a backchain is not required.  GCC still always has
> > one afaik.  I'll find out more.
> Please do.  I was under the impression it was mandated by the earlier
> ABIs as well.  If it isn't, then I don't think we can depend on it for
> the older ABIs.

I checked most ABIs, and all but ELFv2 require it.  You can assume we
require it everywhere (we do assume it currently, and there is no
intention to change this).  The statement in the ABI surprised me
yesterday, sorry for panicking.

> THe code we generate for alloca was so awful it's hard to see how
> hitting each page once would matter either.  *However* I was looking at
> x86 in this case and due to potential stack realignments x86's alloca
> code might be notably worse than others for constant sizes.

There is generic code that aligns too often, too.  You might be seeing
that same thing.

> There's further improvements that could be made as well.   It ought to
> be possible to write an optimizer pass that uses some of the ideas from
> DSE and SLSR to identify explicit probes that are made redundant by
> nearby implicit probes -- this would seem most useful for the dynamic space.
> 
> The problem is we'd want to do that in gimple, but probing of the
> dynamic space happens at the gimple/rtl border.  So we'd probably want
> to make probing happen earlier to expose stuff at the gimple level.

This would just get rid of one probe per dynamic allocation, correct?
Doesn't seem worth complicating anything for.


Segher


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08

2017-07-13 Thread Jeff Law
On 07/13/2017 03:32 PM, Segher Boessenkool wrote:
> Hi again,
> 
> On Wed, Jul 12, 2017 at 09:56:09PM -0600, Jeff Law wrote:
>>>> FWIW -fstack-check=specific is dreadfully named.  I haven't tried to
>>>> address that.
>>>
>>> -fstack-check=clash is itself not such a super name either.  It's not
>>> checking stack, and it isn't clashing: it just does a store to every
>>> page the stack will touch (minus the first and last or something).
>> Yea.  I don't particularly like it either.  Suggestions?  I considered
>> "probe" as well, but "specific" also does probing.  In the end I used
>> the part of the marketing name of the exploits.
> 
> I don't think it should be inside -fstack-check at all.  Sure, the
> mechanisms implementing it overlap a bit (more on some targets, less
> on others), but how will a user ask for clash protection _and_ for
> stack checking?
The biggest problem with separating them is we would end up with a fair
amount of code that ultimately looks like

if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
|| flag_whatever_the_new_thing_is)
  {
probe the stack
  }


We can certainly do that.  It'll touch a few more places (particularly
in backends that have checks for Ada stack overflow, but for which I
haven't written stack clash protection), but the changes would be simple
and repetitive.




> So something like -fstack-clash-protection, -fstack-touch-all-pages,
> together with whatever -fstack-check option is wanted (and yes the
> existing ones have very non-specific, non-descriptive, non-obvious names).
Given the heavy use of "stack clash" in the press I'd lean towards
-fstack-clash-protection.  That makes it easier for folks to find the
right option to protect themselves.

>> Certainly open to ideas on the interface aspects.
> 
> The interface is much harder to change than any aspect of the GCC
> implementation.  Have to get it right at once, almost.
Yup.  It gets baked in even faster than normal in this case I suspect.
Red Hat will almost certainly pick up the bits and start backporting
them to RHEL 7, RHEL 6 and RHEL 5.  So the flag will be out in the wild
before gcc-8 hits the streets.

If you go back to my original message from several weeks ago, getting
the UI right and the basic concepts were my primary goals.  Details like
what probe instruction to use are, IMHO, much less important.


> 
>>> How does this work for targets that want to enable this by default?  How
>>> does that interact with checking for stack size overflow?
>> I don't currently have a way to enable it by default -- for my tests I
>> just slam the value I want into the initializer in common.opt :-)
>>
>> It's independent of stack size overflow checking.  They could (in
>> theory) even co-exist on ports that support stack size overflow
>> checking, but I didn't test that.
> 
> Okay, that was my impression as well.  But that interface won't allow
> it (unless every -fstack-check=X gets an -fstack-check=X+clash twin).
There may have been some miscommunication here, but it's a good place
to spend some time thinking about interplay between options.

First, there's -fstack-limit-*.  Essentially there's an RTX which
defines the limit for the stack pointer.  If you cross that limit a trap
gets executed.  PPC, s390 and likely other ports have support for this.
It should interoperate with -fstack-clash-protection and/or -fstack-check.

Some ports have their own probing option.  x86 comes to mind.  The x86
uses that for stack probing on windows where extending the stack has to
happen explicitly (which is why they're not subject to stack-clash style
attacks).  The options probably shouldn't be mixed.


There's -fstack-check and -fstack-clash-protection.  I think with the
direction we're going they are fundamentally incompatible because
neither the compiler nor kernel do anything to guarantee enough stack is
available after hitting the guard for -fstack-clash-protection.

And there's shrink wrapping.  This was originally raised by the aarch64
guys and deserves some thought.  I'm particularly worried about aarch64
if it was to shrink-wrap some particular register saves that we wanted
to use as an implicit probe.

There may be others that I'm missing.



> 
>> We created that alloca'd object at the wrong lexical scope which mucked
>> up its expected persistence.  I'm sure I'd spot it trivially once I set
>> up the test again.
> 
> That sounds like it might be a bug even.
That was my thought as well.  It's c-torture/execute/20071029-1.c.


> 
>>>>   /* Check the stack and entirely rely on the target configuration
>>>> - files, i.e. do not use the generic mechanism at all.  */
>>>> + files, i.e. do not use the generic mechanism at all.  This
>>>> + does not prevent stack guard jumping and stack clash style
>>>> + attacks.  */
>>>>   FULL_BUILTIN_STACK_CHECK
>>>>   };
>>>
>>>> +  else if (!strcmp (arg, "clash"))
>>>> +  {
>>>> +/* This is the stack checking method, designed to prevent
>>>> +   

[PATCH] scheduler bug fix for AArch64 insn fusing SCHED_GROUP usage

2017-07-13 Thread Jim Wilson
The AArch64 port uses SCHED_GROUP to mark instructions that get fused
at issue time, to ensure that they will be issued together.  However,
in the scheduler, use of a SCHED_GROUP forces all other instructions
to issue in the next cycle.  This is wrong for AArch64 ports using
insn fusing which can issue multiple insns per cycle, as aarch64
SCHED_GROUP insns can all issue in the same cycle, and other insns can
issue in the same cycle also.
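
For example (hypothetical 2-wide machine, not taken from the PR): with
a fused pair A+B marked SCHED_GROUP_P and an independent insn C ready,

  today:      cycle 0: A B    cycle 1: C    (C needlessly delayed)
  with patch: cycle 0: A B C               (the pair still issues
                                            together, C co-issues)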

I put a testcase and some info in bug 81434.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81434

The attached patch fixes the problem.  The behavior in pass == 0 is the
same as now.  All non sched group insns are ignored, and all sched
group insns are checked to see if they need to be queued for a later
cycle.  The difference is in the second pass, where non sched group
insns are queued for a later cycle only if there is a sched group
insn that got queued.  Since sched group insns always sort to the top
of the list of insns to schedule, all sched group insns still get
scheduled together as before.

This has been tested with an AArch64 bootstrap and make check.

OK?

Jim
2017-07-13  Jim Wilson  

	PR rtl-optimization/81434
	* haifa-sched.c (prune_ready_list): Init min_cost_group to 0.  Update
	comment for main loop.  In sched_group_found if, also add checks for
	pass and min_cost_group.

diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 1b13e32..f6369d9 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -6300,7 +6300,7 @@ prune_ready_list (state_t temp_state, bool first_cycle_insn_p,
 {
   int i, pass;
   bool sched_group_found = false;
-  int min_cost_group = 1;
+  int min_cost_group = 0;
 
   if (sched_fusion)
 return;
@@ -6316,8 +6316,8 @@ prune_ready_list (state_t temp_state, bool first_cycle_insn_p,
 }
 
   /* Make two passes if there's a SCHED_GROUP_P insn; make sure to handle
- such an insn first and note its cost, then schedule all other insns
- for one cycle later.  */
+ such an insn first and note its cost.  If at least one SCHED_GROUP_P insn
+ gets queued, then all other insns get queued for one cycle later.  */
   for (pass = sched_group_found ? 0 : 1; pass < 2; )
 {
   int n = ready.n_ready;
@@ -6330,7 +6330,8 @@ prune_ready_list (state_t temp_state, bool first_cycle_insn_p,
 	  if (DEBUG_INSN_P (insn))
 	continue;
 
-	  if (sched_group_found && !SCHED_GROUP_P (insn))
+	  if (sched_group_found && !SCHED_GROUP_P (insn)
+	  && ((pass == 0) || (min_cost_group >= 1)))
 	{
 	  if (pass == 0)
 		continue;


Re: [PATCH], PR target/81193, Add warning for using __builtin_cpu_* on old PowerPC GLIBC's

2017-07-13 Thread Michael Meissner
On Thu, Jul 13, 2017 at 03:57:08PM -0500, Segher Boessenkool wrote:
> On Wed, Jul 12, 2017 at 07:19:27PM -0400, Michael Meissner wrote:
> > Hmmm, I didn't realize that gcc 6.x also supported __builtin_cpu_*.  I 
> > imagine
> > we will need backports there as well.
> 
> Okay for 6 too if needed there (do testcases warn us for that?)

I have patches for both 6 and 7.  I had to back port the change in
testsuite/lib/target-supports.exp for ppc_cpu_supports_hw_available in order to
add the dg-requires for the 1 test that is in the testsuite.

The part of the change that changed the target_clones warning to error is not
appropriate for either branch, and that patch has been dropped.  The GCC 6
patch for defining __BUILTIN_CPU_SUPPORTS__ had to be moved since the
preceding lines aren't in GCC 6.

Given we have the macro __BUILTIN_CPU_SUPPORTS__, I could change the test to
use that instead of a dg-requires and drop ppc_cpu_supports_hw_available.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c	(revision 250169)
+++ gcc/config/rs6000/rs6000-c.c	(working copy)
@@ -648,6 +648,9 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
 builtin_define ("__FLOAT128_HARDWARE__");
   if (TARGET_LONG_DOUBLE_128 && FLOAT128_IBM_P (TFmode))
 builtin_define ("__ibm128=long double");
+#ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
+  builtin_define ("__BUILTIN_CPU_SUPPORTS__");
+#endif
 
   /* We needed to create a keyword if -mfloat128-type was used but not -mfloat,
  so we used __ieee128.  If -mfloat128 was used, create a #define back to
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 250169)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -15584,6 +15584,8 @@ cpu_expand_builtin (enum rs6000_builtins
   emit_insn (gen_eqsi3 (scratch2, scratch1, const0_rtx));
   emit_insn (gen_rtx_SET (target, gen_rtx_XOR (SImode, scratch2,
					       const1_rtx)));
 }
+  else
+gcc_unreachable ();
 
   /* Record that we have expanded a CPU builtin, so that we can later
  emit a reference to the special symbol exported by LIBC to ensure we
@@ -15591,6 +15593,9 @@ cpu_expand_builtin (enum rs6000_builtins
   cpu_builtin_p = true;
 
 #else
+  warning (0, "%s needs GLIBC (2.23 and newer) that exports hardware "
+  "capability bits", rs6000_builtin_info[(size_t) fcode].name);
+  
   /* For old LIBCs, always return FALSE.  */
   emit_move_insn (target, GEN_INT (0));
 #endif /* TARGET_LIBC_PROVIDES_HWCAP_IN_TCB */
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi (revision 250169)
+++ gcc/doc/extend.texi (working copy)
@@ -14894,10 +14894,25 @@ This function is a @code{nop} on the Pow
 to maintain API compatibility with the x86 builtins.
 @end deftypefn
 
+@deftypefn {Built-in Function} void __builtin_cpu_init (void)
+This function is a @code{nop} on the PowerPC platform and is included solely
+to maintain API compatibility with the x86 builtins.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_cpu_is (const char *@var{cpuname})
 This function returns a value of @code{1} if the run-time CPU is of type
-@var{cpuname} and returns @code{0} otherwise. The following CPU names can be
-detected:
+@var{cpuname} and returns @code{0} otherwise
+
+The @code{__builtin_cpu_is} function requires GLIBC 2.23 or newer
+which exports the hardware capability bits.  GCC defines the macro
+@code{__BUILTIN_CPU_SUPPORTS__} if the @code{__builtin_cpu_supports}
+built-in function is fully supported.
+
+If GCC was configured to use a GLIBC before 2.23, the built-in
+function @code{__builtin_cpu_is} always returns a 0 and the compiler
+issues a warning.
+
+The following CPU names can be detected:
 
 @table @samp
 @item power9
@@ -14934,20 +14949,33 @@ IBM PowerPC Cell Broadband Engine Archit
 
 Here is an example:
 @smallexample
-if (__builtin_cpu_is ("power8"))
-  @{
- do_power8 (); // POWER8 specific implementation.
-  @}
-else
-  @{
- do_generic (); // Generic implementation.
-  @}
+#ifdef __BUILTIN_CPU_SUPPORTS__
+  if (__builtin_cpu_is ("power8"))
+@{
+   do_power8 (); // POWER8 specific implementation.
+@}
+  else
+#endif
+@{
+   do_generic (); // Generic implementation.
+@}
 @end smallexample
 @end deftypefn
 
 @deftypefn {Built-in Function} int __builtin_cpu_supports (const char *@var{feature})
 This function returns a value of @code{1} if the run-time CPU supports the HWCAP
-feature @var{feature} and returns @code{0} otherwise. The following features can be
+feature @var{feature} and returns @code{0} otherwise.
+
+The @code{__builtin_cpu_supports} function requires GLIBC 2.23 or
+newer which 

Re: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08

2017-07-13 Thread Segher Boessenkool
Hi again,

On Wed, Jul 12, 2017 at 09:56:09PM -0600, Jeff Law wrote:
> >> FWIW -fstack-check=specific is dreadfully named.  I haven't tried to
> >> address that.
> > 
> > -fstack-check=clash is itself not such a super name either.  It's not
> > checking stack, and it isn't clashing: it just does a store to every
> > page the stack will touch (minus the first and last or something).
> Yea.  I don't particularly like it either.  Suggestions?  I considered
> "probe" as well, but "specific" also does probing.  In the end I used
> the part of the marketing name of the exploits.

I don't think it should be inside -fstack-check at all.  Sure, the
mechanisms implementing it overlap a bit (more on some targets, less
on others), but how will a user ask for clash protection _and_ for
stack checking?

So something like -fstack-clash-protection, -fstack-touch-all-pages,
together with whatever -fstack-check option is wanted (and yes the
existing ones have very non-specific, non-descriptive, non-obvious names).

> 1. We never probe ahead of need.  ie, if a function requests N bytes of
> stack space, then we'll never probe beyond N bytes.  In contrast
> -fstack-check=specific will tend to probe 2-3 pages beyond the N byte
> request.

s/need/immediate need/.  Nod.

> 2. We probe as each page is allocated.  in contrast most
> -fstack-check=specific implementations allocate all the space, then
> probe into the space.

Right, and that leaves some dangerous openings for exploitation.
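
(To spell that out -- pseudo-code only, the real sequences are emitted
per-target as RTL.  The old allocate-then-probe style

  sp -= size;			/* sp can be beyond the guard here... */
  for (off = 0; off < size; off += PAGE_SIZE)
    probe (sp + off);		/* ...before any probe has landed */

leaves a window where the stack pointer sits past unprobed pages, while
the moving-sp style keeps allocation and probing in lockstep:

  while (size >= PAGE_SIZE)
    {
      sp -= PAGE_SIZE;
      *sp = 0;			/* probe the page just allocated */
      size -= PAGE_SIZE;
    }
  sp -= size;

so there is never a large unprobed region below the live stack pointer.)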

> Certainly open to ideas on the interface aspects.

The interface is much harder to change than any aspect of the GCC
implementation.  Have to get it right at once, almost.

> > How does this work for targets that want to enable this by default?  How
> > does that interact with checking for stack size overflow?
> I don't currently have a way to enable it by default -- for my tests I
> just slam the value I want into the initializer in common.opt :-)
> 
> It's independent of stack size overflow checking.  They could (in
> theory) even co-exist on ports that support stack size overflow
> checking, but I didn't test that.

Okay, that was my impression as well.  But that interface won't allow
it (unless every -fstack-check=X gets an -fstack-check=X+clash twin).

> We created that alloca'd object at the wrong lexical scope which mucked
> up its expected persistence.  I'm sure I'd spot it trivially once I set
> up the test again.

That sounds like it might be a bug even.

> >>/* Check the stack and entirely rely on the target configuration
> >> - files, i.e. do not use the generic mechanism at all.  */
> >> + files, i.e. do not use the generic mechanism at all.  This
> >> + does not prevent stack guard jumping and stack clash style
> >> + attacks.  */
> >>FULL_BUILTIN_STACK_CHECK
> >>  };
> > 
> >> +  else if (!strcmp (arg, "clash"))
> >> +  {
> >> +/* This is the stack checking method, designed to prevent
> >> +   stack-clash attacks.  */
> >> +if (!STACK_GROWS_DOWNWARD)
> >> +  sorry ("-fstack-check=clash not implemented on this target");
> >> +else
> >> +  opts->x_flag_stack_check = (STACK_CHECK_BUILTIN
> >> +  ? FULL_BUILTIN_STACK_CHECK
> >> +  : (STACK_CHECK_STATIC_BUILTIN
> >> + ? STACK_CLASH_BUILTIN_STACK_CHECK
> >> + : GENERIC_STACK_CHECK));
> >> +  }
> > 
> > So targets that define STACK_CHECK_BUILTIN (spu and alpha) do not get
> > stack clash protection if you asked for it specifically, without warning,
> > if I read that correctly?
> That's an unknown.  I'd have to dig into the guts of the alpha and spu
> to understand precisely how their STACK_CHECK_BUILTIN works.

Both just define it to 1.  So this code then specialises to

  else if (!strcmp (arg, "clash"))
{
  opts->x_flag_stack_check = FULL_BUILTIN_STACK_CHECK;
}

which it says above does *not* protect against stack clash attacks.
Which seems backward.

> >> +proc check_effective_target_stack_clash_protected { } {
> > 
> > The name is maybe not so great: nothing is protected until you actually
> > use the option.  "supported", maybe?
> I hate all the names...  "supports_stack_clash_protection" perhaps?

Works for me!

> >> +# Return 1 if the target's calling sequence or its ABI
> >> +# create implicit stack probes at *sp at function
> >> +# entry.
> >> +proc check_effective_target_caller_implicit_probes { } {
> > 
> > "At function entry" isn't really true for Power ("when setting up a
> > stack frame", instead -- and you are required to set one up before
> > calling anything).
> I think it's close enough -- I'll ponder a better name.  s390x doesn't
> really have caller implicit probes either, but stack saves in the callee
> act like them (because the caller allocates the space for the callee's
> save area).

Oh the name is fine, but maybe expand the comment a bit.  Or even 

Re: [BUILDROBOT] RISC-V: ‘profile_probability’ has not been declared

2017-07-13 Thread Jan-Benedict Glaw
Hi Jeff!

On Thu, 2017-07-13 14:43:52 -0600, Jeff Law  wrote:
> On 07/13/2017 02:39 PM, Jan-Benedict Glaw wrote:
> > On Thu, 2017-06-29 14:27:41 +0200, Jan Hubicka  wrote:
> >> this is second step of the profile maintenance revamp.  It implements
> >> profile_probability type which is pretty much symmetric to profile_count
> >> except that it implements fixed point arithmetics for values 0...1.
> >> It is used to maintain probabilities of edges out of basic blocks.
> >> In addition it tracks information about quality of the estimate which can
> >> later be used for optimization.
> > 
> > RISC-V (--enable-languages=c,c++ --target=riscv32-unknown-linux-gnu
> > --without-headers --disable-threads) fails to build right now,
> > probably only missing header or header order:
> [ ... ]
> I fixed that yesterday :-)

Took me a bit to look through the logs.  Great to see you're fixing
faster than I can search for these small break-ups. :)

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Alles wird gut! ...und heute wirds schon ein bißchen
besser.  ("Everything will be fine! ...and today it's already getting
a little better.")




Re: c-family PATCH to improve -Wsign-compare (PR c/81417)

2017-07-13 Thread David Malcolm
On Thu, 2017-07-13 at 16:39 -0400, Eric Gallager wrote:
> On 7/13/17, David Malcolm  wrote:
> > On Thu, 2017-07-13 at 18:33 +0200, Marek Polacek wrote:
> > > A tiny patch for -Wsign-compare so that it also prints the types
> > > when
> > > reporting a warning.
> > > 
> > > Bootstrapped/regtested on x86_64-linux and ppc64le-redhat-linux,
> > > ok
> > > for trunk?
> > 
> > Looks like it always displays the types in the order signed then
> > unsigned, which matches the text of the diagnostic, but not
> > necessarily
> > the ordering within the expression, which might be confusing if
> > someone's comparing e.g.
> > 
> >   unsigned_a < signed_b
> > 
> 
> Good catch, I forgot about that case when opening the original bug
> that Marek posted this patch for...
> 
> > But we already hardcode the ordering within the text of the
> > diagnostic,
> > so that feels excessively nit-picky.
> 
> I don't think it's being excessively nit-picky; I think it'd make
> more
> sense to match the ordering of the expression. That's what clang
> does:
> 
> $ cat Wsign_compare.c
> /* { dg-do compile } */
> 
> int foo(signed int a, unsigned int b)
> {
>   return (a < b);
> }
> 
> int bar(unsigned int c, signed int d)
> {
>   return (c < d);
> }
> 
> $ /sw/opt/llvm-3.1/bin/clang -c -Wsign-compare Wsign_compare.c
> Wsign_compare.c:5:12: warning: comparison of integers of different
> signs: 'int' and 'unsigned int' [-Wsign-compare]
> return (a < b);
> ~ ^ ~
> Wsign_compare.c:10:12: warning: comparison of integers of different
> signs: 'unsigned int' and 'int' [-Wsign-compare]
> return (c < d);
> ~ ^ ~
> 2 warnings generated.

That's much nicer.

> 
> > 
> > OK for trunk (with my "diagnostic messages" maintainer hat on).

Marek: I take it back; can you update the patch accordingly, please?

(Note to self: always doublecheck the linked PR for context).



Re: [PATCH], PR target/81193, Add warning for using __builtin_cpu_* on old PowerPC GLIBC's

2017-07-13 Thread Segher Boessenkool
On Wed, Jul 12, 2017 at 07:19:27PM -0400, Michael Meissner wrote:
> Hmmm, I didn't realize that gcc 6.x also supported __builtin_cpu_*.  I imagine
> we will need backports there as well.

Okay for 6 too if needed there (do testcases warn us for that?)

Thanks,


Segher


Re: [PATCH, rs6000] Add support for vec_extract_fp_from_shorth() and vec_extract_fp_from_short

2017-07-13 Thread Segher Boessenkool
Hi Carl,

On Wed, Jul 12, 2017 at 04:08:20PM -0700, Carl Love wrote:
>   * config/rs6000/vsx.md(vsx_xvcvhpsp): Add define_insn.

Space before (.

> +;; Generate xvcvhpsp instruction
> +(define_insn "vsx_xvcvhpsp"
> +  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa")
> + (unspec:V4SF [(match_operand: V8HI 1 "vsx_register_operand" "f")]
> +  UNSPEC_VSX_CVHPSP))]
> +  "VECTOR_UNIT_VSX_P (V4SFmode)"
> +  "xvcvhpsp %x0,%x1"
> +  [(set_attr "type" "fp")])

Is there anything here that restricts this to ISA 3.0 and later?
Directly, I mean; nothing will generate this otherwise, but that is
kind of fragile.

Why "f" as contraint?  Won't this work on any VSX register?

Is type "fp" a good type for this (for p9 scheduling)?

> +;; Generate vector extract four float 32 values from left four elements
> +;; of eight element vector of float 16 values.
> +(define_expand "vextract_fp_from_shorth"
> +  [(set (match_operand:V4SF 0 "register_operand" "=v")
> + (unspec:V4SF [(match_operand:V8HI 1 "register_operand" "v")]
> + UNSPEC_VSX_VEXTRACT_FP_FROM_SHORTH))]
> +  "TARGET_P9_VECTOR"
> +{
> +  int vals[16] = {0, 1, 0 ,0, 2, 3, 0, 0, 4, 5, 0, 0, 6, 7, 0, 8};

s/ ,/, /

Is that 8 correct, shouldn't it be 0?

Is "v" best (won't "wa" work)?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run { target { powerpc64*-*-* && { lp64 && p9vector_hw } } } } */
> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
> "-mcpu=power9" } } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */

Do you need this although you have p9vector_hw already?


Segher


[PATCH] Improve bswap on nop non-base_addr reshuffles (PR tree-optimization/81396)

2017-07-13 Thread Jakub Jelinek
Hi!

As mentioned in the PR, the following testcase recently started using
BIT_FIELD_REFs instead of MEM_REFs and thus the bswap pass, while it
properly determines the very long sequence of stmts is a nop transformation,
throws that away and doesn't optimize it, and no other optimizations
are able to optimize it away.

The patch attempts to not do anything if there is a simple identity
copy, but if the nop reshuffle needs more than one operation, it will
try to replace the final SSA_NAME BIT_IOR_EXPR assignment with assignment
from the src value (typically SSA_NAME).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-07-13  Jakub Jelinek  

PR tree-optimization/81396
* tree-ssa-math-opts.c (struct symbolic_number): Add n_ops field.
(init_symbolic_number): Initialize it to 1.
(perform_symbolic_merge): Add n_ops from both operands into the new
n_ops.
(find_bswap_or_nop): Don't consider n->n == cmpnop computations
without base_addr as useless if they need more than one operation.
(bswap_replace): Handle !bswap case for NULL base_addr.

* gcc.dg/tree-ssa/pr81396.c: New test.

--- gcc/tree-ssa-math-opts.c.jj 2017-07-06 20:31:43.0 +0200
+++ gcc/tree-ssa-math-opts.c	2017-07-13 19:27:02.985354778 +0200
@@ -1968,6 +1968,7 @@ struct symbolic_number {
   tree alias_set;
   tree vuse;
   unsigned HOST_WIDE_INT range;
+  int n_ops;
 };
 
 #define BITS_PER_MARKER 8
@@ -2083,6 +2084,7 @@ init_symbolic_number (struct symbolic_nu
 return false;
   n->range = size;
   n->n = CMPNOP;
+  n->n_ops = 1;
 
   if (size < 64 / BITS_PER_MARKER)
 n->n &= ((uint64_t) 1 << (size * BITS_PER_MARKER)) - 1;
@@ -2293,6 +2295,7 @@ perform_symbolic_merge (gimple *source_s
return NULL;
 }
   n->n = n1->n | n2->n;
+  n->n_ops = n1->n_ops + n2->n_ops;
 
   return source_stmt;
 }
@@ -2588,7 +2591,7 @@ find_bswap_or_nop (gimple *stmt, struct
 return NULL;
 
   /* Useless bit manipulation performed by code.  */
-  if (!n->base_addr && n->n == cmpnop)
+  if (!n->base_addr && n->n == cmpnop && n->n_ops == 1)
 return NULL;
 
   n->range *= BITS_PER_UNIT;
@@ -2747,6 +2750,36 @@ bswap_replace (gimple *cur_stmt, gimple
}
   src = val_tmp;
 }
+  else if (!bswap)
+{
+  gimple *g;
+  if (!useless_type_conversion_p (TREE_TYPE (tgt), TREE_TYPE (src)))
+   {
+ if (!is_gimple_val (src))
+   return false;
+ g = gimple_build_assign (tgt, NOP_EXPR, src);
+   }
+  else
+   g = gimple_build_assign (tgt, src);
+  if (n->range == 16)
+   nop_stats.found_16bit++;
+  else if (n->range == 32)
+   nop_stats.found_32bit++;
+  else
+   {
+ gcc_assert (n->range == 64);
+ nop_stats.found_64bit++;
+   }
+  if (dump_file)
+   {
+ fprintf (dump_file,
+  "%d bit reshuffle in target endianness found at: ",
+  (int) n->range);
+ print_gimple_stmt (dump_file, cur_stmt, 0);
+   }
+  gsi_replace (&gsi, g, true);
+  return true;
+}
   else if (TREE_CODE (src) == BIT_FIELD_REF)
 src = TREE_OPERAND (src, 0);
 
--- gcc/testsuite/gcc.dg/tree-ssa/pr81396.c.jj	2017-07-13 19:22:10.191954620 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr81396.c	2017-07-13 19:24:16.638399984 +0200
@@ -0,0 +1,25 @@
+/* PR tree-optimization/81396 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef unsigned long long uint64_t;
+
+uint64_t
+foo (uint64_t word)
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ && __SIZEOF_LONG_LONG__ == 8
+  const unsigned char *const ptr = (const unsigned char *) &word;
+  return ((uint64_t) ptr[0]
+ | ((uint64_t) ptr[1] << 8)
+ | ((uint64_t) ptr[2] << (8 * 2))
+ | ((uint64_t) ptr[3] << (8 * 3))
+ | ((uint64_t) ptr[4] << (8 * 4))
+ | ((uint64_t) ptr[5] << (8 * 5))
+ | ((uint64_t) ptr[6] << (8 * 6))
+ | ((uint64_t) ptr[7] << (8 * 7)));
+#else
+  return word;
+#endif
+}
+
+/* { dg-final { scan-tree-dump "return word_\[0-9]*\\(D\\);" "optimized" } } */

Jakub


[PATCH] Misc libquadmath backports from upstream glibc (PR libquadmath/65757)

2017-07-13 Thread Jakub Jelinek
Hi!

This patch is a manual backport of the 2012-2017 sysdeps/ieee754/ldbl-128/
glibc changes into libquadmath.  As mentioned in the PR, which has
detailed git diff commands, I've left out the *jnl.c, *lgamma* and
x2y2m1l.c changes so far (those were too large), as well as the
long double -> _Float128 and 123.456L -> L(123.456) changes.

Bootstrapped/regtested on x86_64-linux and i686-linux, no further testing
done though.  I'll commit this next week unless I hear objections.
Of course, further testing e.g. using glibc testsuite would be greatly
appreciated.
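
The heart of the port is the pair of barrier macros added to
quadmath-imp.h below: math_opt_barrier (x) makes the compiler treat x as
live in memory so folding can't remove an exception-raising operation,
and math_force_eval (x) forces evaluation of an otherwise unused result.
For instance (illustrative use only; the real call sites are in the
math/*.c files listed below):

  __float128 tmp = res * res;	/* tiny * tiny underflows */
  math_force_eval (tmp);	/* keep the dead multiply so the
				   underflow exception is raised */

which is exactly the pattern math_check_force_underflow expands to.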

2017-07-13  Jakub Jelinek  

PR libquadmath/65757
* quadmath-imp.h (math_opt_barrier, math_force_eval,
math_narrow_eval, math_check_force_underflow,
math_check_force_underflow_nonneg): Define.
* math/ceilq.c: Backport changes from upstream glibc
between 2012-11-01 and 2017-07-13.
* math/remquoq.c: Likewise.
* math/expq.c: Likewise.
* math/llroundq.c: Likewise.
* math/logq.c: Likewise.
* math/atanq.c: Likewise.
* math/nearbyintq.c: Likewise.
* math/scalblnq.c: Likewise.
* math/finiteq.c: Likewise.
* math/atanhq.c: Likewise.
* math/expm1q.c: Likewise.
* math/sinhq.c: Likewise.
* math/log10q.c: Likewise.
* math/rintq.c: Likewise.
* math/roundq.c: Likewise.
* math/fmaq.c: Likewise.
* math/erfq.c: Likewise.
* math/log2q.c: Likewise.
* math/lroundq.c: Likewise.
* math/j1q.c: Likewise.
* math/scalbnq.c: Likewise.
* math/truncq.c: Likewise.
* math/frexpq.c: Likewise.
* math/sincosq.c: Likewise.
* math/tanhq.c: Likewise.
* math/asinq.c: Likewise.
* math/coshq.c: Likewise.
* math/j0q.c: Likewise.
* math/asinhq.c: Likewise.
* math/floorq.c: Likewise.
* math/sinq_kernel.c: Likewise.
* math/powq.c: Likewise.
* math/hypotq.c: Likewise.
* math/sincos_table.c: Likewise.
* math/rem_pio2q.c: Likewise.
* math/nextafterq.c: Likewise.
* math/log1pq.c: Likewise.
* math/sincosq_kernel.c: Likewise.
* math/tanq.c: Likewise.
* math/acosq.c: Likewise.
* math/lrintq.c: Likewise.
* math/llrintq.c: Likewise.

--- libquadmath/quadmath-imp.h.jj   2013-02-07 08:59:55.0 +0100
+++ libquadmath/quadmath-imp.h  2017-07-13 13:04:21.928641461 +0200
@@ -186,4 +186,45 @@ do {   \
   __builtin_fpclassify (QUADFP_NAN, QUADFP_INFINITE, QUADFP_NORMAL, \
QUADFP_SUBNORMAL, QUADFP_ZERO, x)
 
+#ifndef math_opt_barrier
+# define math_opt_barrier(x) \
+({ __typeof (x) __x = (x); __asm ("" : "+m" (__x)); __x; })
+# define math_force_eval(x) \
+({ __typeof (x) __x = (x); __asm __volatile__ ("" : : "m" (__x)); })
+#endif
+
+/* math_narrow_eval reduces its floating-point argument to the range
+   and precision of its semantic type.  (The original evaluation may
+   still occur with excess range and precision, so the result may be
+   affected by double rounding.)  */
+#define math_narrow_eval(x) (x)
+
+/* If X (which is not a NaN) is subnormal, force an underflow
+   exception.  */
+#define math_check_force_underflow(x)  \
+  do   \
+{  \
+  __float128 force_underflow_tmp = (x);\
+  if (fabsq (force_underflow_tmp) < FLT128_MIN)\
+   {   \
+ __float128 force_underflow_tmp2   \
+   = force_underflow_tmp * force_underflow_tmp;\
+ math_force_eval (force_underflow_tmp2);   \
+   }   \
+}  \
+  while (0)
+/* Likewise, but X is also known to be nonnegative.  */
+#define math_check_force_underflow_nonneg(x)   \
+  do   \
+{  \
+  __float128 force_underflow_tmp = (x);\
+  if (force_underflow_tmp < FLT128_MIN)\
+   {   \
+ __float128 force_underflow_tmp2   \
+   = force_underflow_tmp * force_underflow_tmp;\
+ math_force_eval (force_underflow_tmp2);   \
+   }   \
+}  \
+  while (0)
+
 #endif
--- libquadmath/math/ceilq.c.jj 2012-11-02 09:01:48.0 +0100
+++ libquadmath/math/ceilq.c	2017-07-13 13:04:21.932641408 +0200
@@ -15,8 +15,6 @@
 
 #include 

Re: [BUILDROBOT] RISC-V: ‘profile_probability’ has not been declared

2017-07-13 Thread Palmer Dabbelt
On Thu, 13 Jul 2017 13:43:52 PDT (-0700), l...@redhat.com wrote:
> On 07/13/2017 02:39 PM, Jan-Benedict Glaw wrote:
>> Hi Jan,
>> hi Kito, Palmer and Andrew!
>>
>> On Thu, 2017-06-29 14:27:41 +0200, Jan Hubicka  wrote:
>>> this is the second step of the profile maintenance revamp.  It implements
>>> the profile_probability type, which is pretty much symmetric to
>>> profile_count except that it implements fixed-point arithmetic for values
>>> 0...1.  It is used to maintain probabilities of edges out of basic blocks.
>>> In addition it tracks information about the quality of the estimate, which
>>> can later be used for optimization.
>>
>> RISC-V (--enable-languages=c,c++ --target=riscv32-unknown-linux-gnu
>> --without-headers --disable-threads) fails to build right now,
>> probably only a missing header or header ordering issue:
> [ ... ]
> I fixed that yesterday :-)

Thanks!  I was just cloning a fresh version to try and reproduce it.


Re: [BUILDROBOT] RISC-V: ‘profile_probability’ has not been declared

2017-07-13 Thread Jeff Law
On 07/13/2017 02:39 PM, Jan-Benedict Glaw wrote:
> Hi Jan,
> hi Kito, Palmer and Andrew!
> 
> On Thu, 2017-06-29 14:27:41 +0200, Jan Hubicka  wrote:
>> this is the second step of the profile maintenance revamp.  It implements
>> the profile_probability type, which is pretty much symmetric to
>> profile_count except that it implements fixed-point arithmetic for values
>> 0...1.  It is used to maintain probabilities of edges out of basic blocks.
>> In addition it tracks information about the quality of the estimate, which
>> can later be used for optimization.
> 
> RISC-V (--enable-languages=c,c++ --target=riscv32-unknown-linux-gnu
> --without-headers --disable-threads) fails to build right now,
> probably only a missing header or header ordering issue:
[ ... ]
I fixed that yesterday :-)

jeff





[PATCH] Fix ICE on _Fract division (PR tree-optimization/81428)

2017-07-13 Thread Jakub Jelinek
Hi!

_Fract types can't express 1; other spots that call build_one_cst already
make sure that the type is integral or use the
!ALL_FRACT_MODE_P (TYPE_MODE (type)) check I've added in this patch.
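(A minimal illustration, assuming a target with fixed-point support: signed
_Fract types cover [-1.0, 1.0), so there is no representable 1 for X / X to
fold to.)

#include <stdfix.h>

/* Sketch: the largest long _Fract is LFRACT_MAX = 1 - 2^-LFRACT_FBIT,
   strictly less than 1, which is why build_one_cst cannot be used for
   these types.  */
long _Fract
largest_long_fract (void)
{
  return LFRACT_MAX;
}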

Bootstrapped/regtested on x86_64-linux and i686-linux and tested with a
cross to arm on the testcase.  Ok for trunk?

2017-07-13  Jakub Jelinek  

PR tree-optimization/81428
* match.pd (X / X -> one): Don't optimize _Fract divisions, as 1
can't be built for those types.

* gcc.dg/fixed-point/pr81428.c: New test.

--- gcc/match.pd.jj 2017-07-13 15:37:34.0 +0200
+++ gcc/match.pd  2017-07-13 15:46:11.194593051 +0200
@@ -243,8 +243,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  /* X / X is one.  */
  (simplify
   (div @0 @0)
-  /* But not for 0 / 0 so that we can get the proper warnings and errors.  */
-  (if (!integer_zerop (@0))
+  /* But not for 0 / 0 so that we can get the proper warnings and errors.
+ And not for _Fract types where we can't build 1.  */
+  (if (!integer_zerop (@0) && !ALL_FRACT_MODE_P (TYPE_MODE (type)))
{ build_one_cst (type); }))
  /* X / abs (X) is X < 0 ? -1 : 1.  */ 
  (simplify
--- gcc/testsuite/gcc.dg/fixed-point/pr81428.c.jj   2017-07-13 15:49:52.980806440 +0200
+++ gcc/testsuite/gcc.dg/fixed-point/pr81428.c  2017-07-13 15:49:29.0 +0200
@@ -0,0 +1,9 @@
+/* PR tree-optimization/81428 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+foo (long _Fract *a, long _Fract *b)
+{
+  *b = *a / *a;
+}

Jakub


Re: c-family PATCH to improve -Wsign-compare (PR c/81417)

2017-07-13 Thread Eric Gallager
On 7/13/17, David Malcolm  wrote:
> On Thu, 2017-07-13 at 18:33 +0200, Marek Polacek wrote:
>> A tiny patch for -Wsign-compare so that it also prints the types when
>> reporting a warning.
>>
>> Bootstrapped/regtested on x86_64-linux and ppc64le-redhat-linux, ok
>> for trunk?
>
> Looks like it always displays the types in the order signed then
> unsigned, which matches the text of the diagnostic, but not necessarily
> the ordering within the expression, which might be confusing if
> someone's comparing e.g.
>
>   unsigned_a < signed_b
>

Good catch, I forgot about that case when opening the original bug
that Marek posted this patch for...

> But we already hardcode the ordering within the text of the diagnostic,
> so that feels excessively nit-picky.

I don't think it's being excessively nit-picky; I think it'd make more
sense to match the ordering of the expression. That's what clang does:

$ cat Wsign_compare.c
/* { dg-do compile } */

int foo(signed int a, unsigned int b)
{
return (a < b);
}

int bar(unsigned int c, signed int d)
{
return (c < d);
}

$ /sw/opt/llvm-3.1/bin/clang -c -Wsign-compare Wsign_compare.c
Wsign_compare.c:5:12: warning: comparison of integers of different
signs: 'int' and 'unsigned int' [-Wsign-compare]
return (a < b);
~ ^ ~
Wsign_compare.c:10:12: warning: comparison of integers of different
signs: 'unsigned int' and 'int' [-Wsign-compare]
return (c < d);
~ ^ ~
2 warnings generated.


>
> OK for trunk (with my "diagnostic messages" maintainer hat on).
>
> Thanks
> Dave
>
>> 2017-07-13  Marek Polacek  
>>
>>  PR c/81417
>>  * c-warn.c (warn_for_sign_compare): Print the types.
>>
>>  * c-c++-common/Wsign-compare-1.c: New test.
>>
>> diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
>> index b9378c2dbe2..c903c080a33 100644
>> --- gcc/c-family/c-warn.c
>> +++ gcc/c-family/c-warn.c
>> @@ -1891,9 +1891,10 @@ warn_for_sign_compare (location_t location,
>> c_common_signed_type
>> (base_type)))
>>  /* OK */;
>>else
>> -warning_at (location,
>> -OPT_Wsign_compare,
>> -"comparison between signed and unsigned integer
>> expressions");
>> +warning_at (location, OPT_Wsign_compare,
>> +"comparison between signed and unsigned integer
>> "
>> +"expressions: %qT and %qT", TREE_TYPE (sop),
>> +TREE_TYPE (uop));
>>  }
>>
>>/* Warn if two unsigned values are being compared in a size larger
>> diff --git gcc/testsuite/c-c++-common/Wsign-compare-1.c
>> gcc/testsuite/c-c++-common/Wsign-compare-1.c
>> index e69de29bb2d..e53f87aa9a3 100644
>> --- gcc/testsuite/c-c++-common/Wsign-compare-1.c
>> +++ gcc/testsuite/c-c++-common/Wsign-compare-1.c
>> @@ -0,0 +1,27 @@
>> +/* PR c/81417 */
>> +/* { dg-do compile } */
>> +/* { dg-options "-Wsign-compare" } */
>> +
>> +int
>> +fn1 (signed int a, unsigned int b)
>> +{
>> +  return a < b; /* { dg-warning "comparison between signed and
>> unsigned integer expressions: 'int' and 'unsigned int'" } */
>> +}
>> +
>> +int
>> +fn2 (signed int a, unsigned int b)
>> +{
>> +  return b < a; /* { dg-warning "comparison between signed and
>> unsigned integer expressions: 'int' and 'unsigned int'" } */
>> +}
>> +
>> +int
>> +fn3 (signed long int a, unsigned long int b)
>> +{
>> +  return b < a; /* { dg-warning "comparison between signed and
>> unsigned integer expressions: 'long int' and 'long unsigned int'" }
>> */
>> +}
>> +
>> +int
>> +fn4 (signed short int a, unsigned int b)
>> +{
>> +  return b < a; /* { dg-warning "comparison between signed and
>> unsigned integer expressions: 'short int' and 'unsigned int'" } */
>> +}
>>
>>  Marek
>


[BUILDROBOT] RISC-V: ‘profile_probability’ has not been declared (was: Convert profile probabilities to new type)

2017-07-13 Thread Jan-Benedict Glaw
Hi Jan,
hi Kito, Palmer and Andrew!

On Thu, 2017-06-29 14:27:41 +0200, Jan Hubicka  wrote:
> this is the second step of the profile maintenance revamp.  It implements
> the profile_probability type, which is pretty much symmetric to
> profile_count except that it implements fixed-point arithmetic for values
> 0...1.  It is used to maintain probabilities of edges out of basic blocks.
> In addition it tracks information about the quality of the estimate, which
> can later be used for optimization.

RISC-V (--enable-languages=c,c++ --target=riscv32-unknown-linux-gnu
--without-headers --disable-threads) fails to build right now,
probably only a missing header or header ordering issue:

[...]
g++ -fno-PIE -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
-Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual 
-pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings 
-fno-common  -DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc 
-I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include 
-I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libbacktrace   -o riscv.o -MT riscv.o -MMD -MP 
-MF ./.deps/riscv.TPo /home/jbglaw/repos/gcc/gcc/config/riscv/riscv.c
In file included from /home/jbglaw/repos/gcc/gcc/config/riscv/riscv.c:56:0:
/home/jbglaw/repos/gcc/gcc/dojump.h:61:10: error: ‘profile_probability’ has not 
been declared
  profile_probability prob);
  ^
/home/jbglaw/repos/gcc/gcc/dojump.h:63:5: error: ‘profile_probability’ has not 
been declared
 profile_probability);
 ^
/home/jbglaw/repos/gcc/gcc/dojump.h:66:54: error: ‘profile_probability’ has not 
been declared
 extern void jumpif (tree exp, rtx_code_label *label, profile_probability prob);
  ^
/home/jbglaw/repos/gcc/gcc/dojump.h:68:9: error: ‘profile_probability’ has not 
been declared
 profile_probability);
 ^
/home/jbglaw/repos/gcc/gcc/dojump.h:73:39: error: ‘profile_probability’ has not 
been declared
rtx_code_label *if_true_label, profile_probability prob);
   ^
/home/jbglaw/repos/gcc/gcc/dojump.h:75:28: error: ‘profile_probability’ has not 
been declared
  rtx_code_label *, profile_probability);
^
/home/jbglaw/repos/gcc/gcc/dojump.h:79:28: error: ‘profile_probability’ has not 
been declared
  rtx_code_label *, profile_probability);
^
In file included from /home/jbglaw/repos/gcc/gcc/config/riscv/riscv.c:61:0:
/home/jbglaw/repos/gcc/gcc/expr.h:291:63: error: ‘profile_probability’ has not 
been declared
 extern int try_casesi (tree, tree, tree, tree, rtx, rtx, rtx, 
profile_probability);
   ^
/home/jbglaw/repos/gcc/gcc/expr.h:292:61: error: ‘profile_probability’ has not 
been declared
 extern int try_tablejump (tree, tree, tree, tree, rtx, rtx, 
profile_probability);
 ^
In file included from /home/jbglaw/repos/gcc/gcc/config/riscv/riscv.c:63:0:
/home/jbglaw/repos/gcc/gcc/optabs.h:251:10: error: ‘profile_probability’ has 
not been declared
  profile_probability prob
  ^
/home/jbglaw/repos/gcc/gcc/optabs.h:252:8: error: ‘profile_probability’ has not 
been declared
  = profile_probability::uninitialized ());
^
Makefile:2259: recipe for target 'riscv.o' failed
make[1]: *** [riscv.o] Error 1
make[1]: Leaving directory 
'/home/jbglaw/build/riscv32-unknown-linux-gnu/build-gcc/gcc'
Makefile:4286: recipe for target 'all-gcc' failed
make: *** [all-gcc] Error 2

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: They that give up essential liberty to obtain temporary safety,
the second  : deserve neither liberty nor safety.  (Ben Franklin)




[PATCH] Fix wrong-code aggregate propagate_with_phi bug (PR tree-optimization/81365)

2017-07-13 Thread Jakub Jelinek
Hi!

As mentioned in the PR, for aggregate copies we fail to verify that there
are no loads in between the PHI and the aggregate copy that could alias with
the lhs of the copy; that verification is needed because we want to hoist
the aggregate copy onto the predecessor edges of the bb with the PHI.

The following patch implements that, bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk and after a while 7.x?

2017-07-13  Jakub Jelinek  

PR tree-optimization/81365
* tree-ssa-phiprop.c (propagate_with_phi): When considering hoisting
aggregate moves onto bb predecessor edges, make sure there are no
loads that could alias the lhs in between the start of bb and the
loads from *phi.  If there are any debug stmts that could alias,
reset them.

* g++.dg/torture/pr81365.C: New test.

--- gcc/tree-ssa-phiprop.c.jj   2017-05-22 10:50:11.0 +0200
+++ gcc/tree-ssa-phiprop.c  2017-07-11 16:52:41.012340615 +0200
@@ -327,7 +327,7 @@ propagate_with_phi (basic_block bb, gphi
   if (!dominated_by_p (CDI_POST_DOMINATORS,
   bb, gimple_bb (use_stmt)))
continue;
- 
+
   /* Check whether this is a load of *ptr.  */
   if (!(is_gimple_assign (use_stmt)
&& gimple_assign_rhs_code (use_stmt) == MEM_REF
@@ -356,6 +356,9 @@ propagate_with_phi (basic_block bb, gphi
  insert aggregate copies on the edges instead.  */
   if (!is_gimple_reg_type (TREE_TYPE (TREE_TYPE (ptr
{
+ if (!gimple_vdef (use_stmt))
+   goto next;
+
  /* As we replicate the lhs on each incoming edge all
 used SSA names have to be available there.  */
  if (! for_each_index (gimple_assign_lhs_ptr (use_stmt),
@@ -363,6 +366,51 @@ propagate_with_phi (basic_block bb, gphi
get_immediate_dominator (CDI_DOMINATORS,
 gimple_bb (phi
goto next;
+
+ gimple *vuse_stmt;
+ imm_use_iterator vui;
+ use_operand_p vuse_p;
+ bool debug_use_seen = false;
+ /* In order to move the aggregate copies earlier, make sure
+there are no statements that could read from memory
+aliasing the lhs in between the start of bb and use_stmt.
+As we require use_stmt to have a VDEF above, loads after
+use_stmt will use a different virtual SSA_NAME.  */
+ FOR_EACH_IMM_USE_FAST (vuse_p, vui, vuse)
+   {
+ vuse_stmt = USE_STMT (vuse_p);
+ if (vuse_stmt == use_stmt)
+   continue;
+ if (!dominated_by_p (CDI_DOMINATORS,
+  gimple_bb (vuse_stmt), bb))
+   continue;
+ if (ref_maybe_used_by_stmt_p (vuse_stmt,
+   gimple_assign_lhs (use_stmt)))
+   {
+ if (is_gimple_debug (vuse_stmt))
+   debug_use_seen = true;
+ else
+   goto next;
+   }
+   }
+ /* Debug stmt uses should not prevent the transformation, but
+if we saw any, reset those debug stmts.  */
+ if (debug_use_seen)
+   FOR_EACH_IMM_USE_STMT (vuse_stmt, vui, vuse)
+ {
+   if (!is_gimple_debug (vuse_stmt))
+ continue;
+   if (!dominated_by_p (CDI_DOMINATORS,
+gimple_bb (vuse_stmt), bb))
+ continue;
+   if (ref_maybe_used_by_stmt_p (vuse_stmt,
+ gimple_assign_lhs (use_stmt)))
+ {
+   gimple_debug_bind_reset_value (vuse_stmt);
+   update_stmt (vuse_stmt);
+ }
+ }
+
  phiprop_insert_phi (bb, phi, use_stmt, phivn, n);
 
  /* Remove old stmt.  The phi is taken care of by DCE.  */
--- gcc/testsuite/g++.dg/torture/pr81365.C.jj   2017-07-11 17:07:11.107130111 +0200
+++ gcc/testsuite/g++.dg/torture/pr81365.C  2017-07-11 17:06:52.0 +0200
@@ -0,0 +1,39 @@
+// PR tree-optimization/81365
+// { dg-do run }
+
+struct A { unsigned a; };
+
+struct B {
+  B (const A *x)
+  {
+__builtin_memcpy (b, x, 3 * sizeof (A));
+__builtin_memcpy (c, x + 3, sizeof (A));
+__builtin_memset (c + 1, 0, sizeof (A));
+  }
+  bool
+  foo (unsigned x)
+  {
+A *it = c;
+if (it->a == x || (++it)->a == x)
+  {
+   A t(b[0]);
+   b[0] = *it;
+   *it = t;
+   return true;
+  }
+return false;
+  }
+  A b[3];
+  A c[2];
+};
+
+int
+main ()
+{
+  A x[] = { 4, 8, 12, 18 };
+  B y(x);
+  if (!y.foo (18))
+__builtin_abort ();
+  if (!y.foo (4))
+__builtin_abort ();
+}

Jakub


Re: [PATCH] gcc/doc: list what version each attribute was introduced in

2017-07-13 Thread Eric Gallager
On 7/7/17, Jeff Law  wrote:
> On 07/06/2017 07:25 AM, Daniel P. Berrange wrote:
>> There are several hundred named attribute keys that have been
>> introduced over many GCC releases. Applications typically need
>> to be compilable with multiple GCC versions, so it is important
>> for developers to know when GCC introduced support for each
>> attribute.
>>
>> This augments the texi docs that list attribute keys with
>> a note of what version introduced the feature. The version
>> information was obtained through archaeology of the GCC source
>> repository release tags, back to gcc-4_0_0-release. For
>> attributes added in 4.0.0 or later, an explicit version will
>> be noted. Any attribute that predates 4.0.0 will simply note
>> that it has existed prior to 4.0.0. It is thought there is
>> little need to go further back in time than 4.0.0 since few,
>> if any, apps will still be using such old compiler versions.
>>
>> Where a named attribute can be used in many contexts (ie the
>> 'visibility' attribute can be used for both functions or
>> variables), it was assumed that the attribute was supported
>> in all use contexts at the same time.
>>
>> Future patches that add new attributes to GCC should be
>> required to follow this new practice, by documenting the
>> version.
> Keying on version #s is generally a terrible way to make your code
> portable.  It's easy to get wrong and due to backporting there's not
> always a strong tie between a version number and the existence of a
> particular feature.
>
> It's far better to actually *test* what your particular compiler
> compiler supports.  I suspect autoconf, for example, probably has some
> infrastructure for testing if specific attributes are supported by the
> compiler.
>
> Jeff
>

gcc sources themselves have tests for attribute availability based on
gcc version number; see include/ansidecl.h:
https://gcc.gnu.org/viewcvs/gcc/trunk/include/ansidecl.h?revision=248205&view=markup
(this is instead of doing things the autoconf way)
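For example, a minimal sketch of the feature-test approach (the macro name
MY_WUR is illustrative):

/* Prefer testing for the attribute over comparing version numbers.
   __has_attribute is itself a GCC >= 5 / clang feature, hence the
   outer guard.  */
#ifdef __has_attribute
# if __has_attribute (warn_unused_result)
#  define MY_WUR __attribute__ ((warn_unused_result))
# endif
#endif
#ifndef MY_WUR
# define MY_WUR /* nothing */
#endif

MY_WUR int my_parse (const char *s);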


Re: c-family PATCH to improve -Wsign-compare (PR c/81417)

2017-07-13 Thread David Malcolm
On Thu, 2017-07-13 at 18:33 +0200, Marek Polacek wrote:
> A tiny patch for -Wsign-compare so that it also prints the types when
> reporting a warning.
> 
> Bootstrapped/regtested on x86_64-linux and ppc64le-redhat-linux, ok
> for trunk?

Looks like it always displays the types in the order signed then
unsigned, which matches the text of the diagnostic, but not necessarily
the ordering within the expression, which might be confusing if
someone's comparing e.g.

  unsigned_a < signed_b

But we already hardcode the ordering within the text of the diagnostic,
so that feels excessively nit-picky.

OK for trunk (with my "diagnostic messages" maintainer hat on).

Thanks
Dave

> 2017-07-13  Marek Polacek  
> 
>   PR c/81417
>   * c-warn.c (warn_for_sign_compare): Print the types.
> 
>   * c-c++-common/Wsign-compare-1.c: New test.
> 
> diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
> index b9378c2dbe2..c903c080a33 100644
> --- gcc/c-family/c-warn.c
> +++ gcc/c-family/c-warn.c
> @@ -1891,9 +1891,10 @@ warn_for_sign_compare (location_t location,
>  c_common_signed_type
> (base_type)))
>   /* OK */;
>else
> - warning_at (location,
> - OPT_Wsign_compare,
> - "comparison between signed and unsigned integer
> expressions");
> + warning_at (location, OPT_Wsign_compare,
> + "comparison between signed and unsigned integer
> "
> + "expressions: %qT and %qT", TREE_TYPE (sop),
> + TREE_TYPE (uop));
>  }
>  
>/* Warn if two unsigned values are being compared in a size larger
> diff --git gcc/testsuite/c-c++-common/Wsign-compare-1.c
> gcc/testsuite/c-c++-common/Wsign-compare-1.c
> index e69de29bb2d..e53f87aa9a3 100644
> --- gcc/testsuite/c-c++-common/Wsign-compare-1.c
> +++ gcc/testsuite/c-c++-common/Wsign-compare-1.c
> @@ -0,0 +1,27 @@
> +/* PR c/81417 */
> +/* { dg-do compile } */
> +/* { dg-options "-Wsign-compare" } */
> +
> +int
> +fn1 (signed int a, unsigned int b)
> +{
> +  return a < b; /* { dg-warning "comparison between signed and
> unsigned integer expressions: 'int' and 'unsigned int'" } */
> +}
> +
> +int
> +fn2 (signed int a, unsigned int b)
> +{
> +  return b < a; /* { dg-warning "comparison between signed and
> unsigned integer expressions: 'int' and 'unsigned int'" } */
> +}
> +
> +int
> +fn3 (signed long int a, unsigned long int b)
> +{
> +  return b < a; /* { dg-warning "comparison between signed and
> unsigned integer expressions: 'long int' and 'long unsigned int'" }
> */
> +}
> +
> +int
> +fn4 (signed short int a, unsigned int b)
> +{
> +  return b < a; /* { dg-warning "comparison between signed and
> unsigned integer expressions: 'short int' and 'unsigned int'" } */
> +}
> 
>   Marek


Re: [DOC PATCH, i386] Fix PR 81294, _subborrow_u64 argument order inconsistent with intrinsic reference

2017-07-13 Thread Jakub Jelinek
On Thu, Jul 13, 2017 at 09:26:25PM +0200, Uros Bizjak wrote:
> IMO, we should change only gcc-7 release branch, so developers can
> test for gcc-7.2+ to determine which release swapped arguments of
> mentioned intrinsics.

Agreed.

> The attached doc patch adds commented-out generic gcc-7.2 entry plus
> the above target-specific change.
> 
> OK for branch?

Ok for wwwdocs, with a nit:

> Index: htdocs/gcc-7/changes.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
> retrieving revision 1.85
> diff -u -r1.85 changes.html
> --- htdocs/gcc-7/changes.html 10 May 2017 11:39:41 -  1.85
> +++ htdocs/gcc-7/changes.html 13 Jul 2017 19:19:02 -
> @@ -1230,5 +1230,25 @@
>  complete (that is, it is possible that some PRs that have been fixed
>  are not listed here).
>  
> +
> +
> +
>  
>  


Jakub


Re: gotools patch committed: Test runtime, misc/cgo/{test,testcarchive}

2017-07-13 Thread Ian Lance Taylor
On Thu, Jun 29, 2017 at 11:40 PM, Uros Bizjak  wrote:
>
>> This patch to the gotools Makefile adds tests to `make check`.  We now
>> test the runtime package using the newly built go tool, and test that
>> cgo works by running the misc/cgo/test and misc/cgo/testcarchive
>> tests.  Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
>> Committed to mainline.
>
> There are several failures on non-split targets, e.g.:
>
> FAIL: TestCgoHandlesWlORIGIN
> go_test.go:267: running testgo [build origin]
> go_test.go:286: standard error:
> go_test.go:287: # origin
> cc1: error: '-fsplit-stack' requires assembler support for CFI
> directives
> cc1: error: '-fsplit-stack' is not supported by this compiler
> configuration
>
> and:
>
> FAIL: TestCgoCrashHandler
> crash_test.go:70: building testprogcgo []: exit status 2
> # _/home/uros/git/gcc/libgo/go/runtime/testdata/testprogcgo
> cc1: error: '-fsplit-stack' requires assembler support for CFI
> directives
> cc1: error: '-fsplit-stack' is not supported by this compiler
> configuration
>
> As evident from TestBuildDryRunWithCgo dump, -fsplit-stack argument is
> added unconditionally to the compile flags.

Would you mind checking whether this patch fixes the problem on your
system?  Thanks.

Ian
diff --git a/libgo/go/cmd/go/build.go b/libgo/go/cmd/go/build.go
index 72265efa..9623b9c3 100644
--- a/libgo/go/cmd/go/build.go
+++ b/libgo/go/cmd/go/build.go
@@ -3092,8 +3092,7 @@ func (tools gccgoToolchain) cc(b *builder, p *Package, 
objdir, ofile, cfile stri
if pkgpath := gccgoCleanPkgpath(p); pkgpath != "" {
defs = append(defs, `-D`, `GOPKGPATH="`+pkgpath+`"`)
}
-   switch goarch {
-   case "386", "amd64":
+   if b.gccSupportsFlag("-fsplit-stack") {
defs = append(defs, "-fsplit-stack")
}
defs = tools.maybePIC(defs)
@@ -3428,8 +3427,7 @@ func (b *builder) cgo(a *action, cgoExe, obj string, 
pcCFLAGS, pcLDFLAGS, cgofil
}
 
if _, ok := buildToolchain.(gccgoToolchain); ok {
-   switch goarch {
-   case "386", "amd64":
+   if b.gccSupportsFlag("-fsplit-stack") {
cgoCFLAGS = append(cgoCFLAGS, "-fsplit-stack")
}
cgoflags = append(cgoflags, "-gccgo")


Re: [PATCH] match.pd: reassociate multiplications with constants

2017-07-13 Thread Alexander Monakov
On Thu, 13 Jul 2017, Marc Glisse wrote:
> I notice that we do not turn (X*10)*10 into X*100 in GIMPLE.

Sorry, could you clarify what you mean here?  I think we certainly do that,
just not via match.pd, but in 'associate:' case of fold_binary_loc.

> Relying on inner expressions being folded can be slightly dangerous,
> especially for generic IIRC. It seems easy enough to check that @1 is neither
> 0 nor -1 for safety.

I wanted to add a gcc_checking_assert to that effect, but it's not used in
match.pd anywhere.  Is there a nice way to do that?

Thanks!
Alexander


[committed] diagnostics: fix crash when consolidating out-of-order fix-it hints (PR c/81405)

2017-07-13 Thread David Malcolm
PR c/81405 identifies a crash when printing fix-it hints from
-Wmissing-braces when there are excess elements.

The fix-it hints are bogus (which I've filed separately as PR c/81432),
but they lead to a crash within the fix-it consolidation logic I added
in r247548, in line_corrections::add_hint.

The root cause is that some of the fix-it hints are out-of-order
with respect to the column numbers they affect, which can lead to negative
values when computing the gap between the fix-it hints, leading to bogus
memcpy calls that generate out-of-bounds buffer accesses.

The fix is to sort the fix-it hints after filtering them, ensuring that
the gap >= 0.  The patch also adds numerous assertions to the code, both
directly, and by moving the memcpy calls and their args behind
interfaces (themselves containing gcc_assert).
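A small standalone sketch of the failure mode (illustrative values, not GCC
code): with hints out of column order the computed gap goes negative, and a
negative gap converted to size_t becomes a huge memcpy length.

#include <stdio.h>
#include <stdlib.h>

struct hint { int start, finish; };

/* Analogous to fixit_cmp below: order hints by start column.  */
static int
hint_cmp (const void *pa, const void *pb)
{
  const struct hint *a = (const struct hint *) pa;
  const struct hint *b = (const struct hint *) pb;
  return a->start - b->start;
}

int
main (void)
{
  /* Out of order: the gap would be 3 - 12 - 1 = -10.  */
  struct hint hints[2] = { { 10, 12 }, { 3, 5 } };
  qsort (hints, 2, sizeof hints[0], hint_cmp);
  /* After sorting: 10 - 5 - 1 = 4, so gap >= 0 holds.  */
  printf ("gap = %d\n", hints[1].start - hints[0].finish - 1);
  return 0;
}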

This fixes the crash; it doesn't fix the bug in -Wmissing-braces that
leads to the bogus hints.

Successfully bootstrapped on x86_64-pc-linux-gnu.

Committed to trunk as r250187.

gcc/ChangeLog:
PR c/81405
* diagnostic-show-locus.c (fixit_cmp): New function.
(layout::layout): Sort m_fixit_hints.
(column_range::column_range): Assert that the values are valid.
(struct char_span): New struct.
(correction::overwrite): New method.
(struct source_line): New struct.
(line_corrections::add_hint): Add assertions.  Reimplement memcpy
calls in terms of classes source_line and char_span, and
correction::overwrite.
(selftest::test_overlapped_fixit_printing_2): New function.
(selftest::diagnostic_show_locus_c_tests): Call it.

gcc/testsuite/ChangeLog:
PR c/81405
* gcc.dg/Wmissing-braces-fixits.c: Add coverage for PR c/81405.  */
---
 gcc/diagnostic-show-locus.c   | 193 --
 gcc/testsuite/gcc.dg/Wmissing-braces-fixits.c |  25 
 2 files changed, 205 insertions(+), 13 deletions(-)

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 5227400..b0e72e7 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -756,6 +756,16 @@ compatible_locations_p (location_t loc_a, location_t loc_b)
 }
 }
 
+/* Comparator for sorting fix-it hints.  */
+
+static int
+fixit_cmp (const void *p_a, const void *p_b)
+{
+  const fixit_hint * hint_a = *static_cast <const fixit_hint * const *> (p_a);
+  const fixit_hint * hint_b = *static_cast <const fixit_hint * const *> (p_b);
+  return hint_a->get_start_loc () - hint_b->get_start_loc ();
+}
+
 /* Implementation of class layout.  */
 
 /* Constructor for class layout.
@@ -799,6 +809,9 @@ layout::layout (diagnostic_context * context,
m_fixit_hints.safe_push (hint);
 }
 
+  /* Sort m_fixit_hints.  */
+  m_fixit_hints.qsort (fixit_cmp);
+
   /* Populate m_line_spans.  */
   calculate_line_spans ();
 
@@ -1385,7 +1398,11 @@ layout::annotation_line_showed_range_p (int line, int start_column,
 
 struct column_range
 {
-  column_range (int start_, int finish_) : start (start_), finish (finish_) {}
+  column_range (int start_, int finish_) : start (start_), finish (finish_)
+  {
+/* We must have either a range, or an insertion.  */
+gcc_assert (start <= finish || finish == start - 1);
+  }
 
   bool operator== (const column_range &other) const
   {
@@ -1427,6 +1444,26 @@ get_printed_columns (const fixit_hint *hint)
 }
 }
 
+/* A struct capturing the bounds of a buffer, to allow for run-time
+   bounds-checking in a checked build.  */
+
+struct char_span
+{
+  char_span (const char *ptr, size_t n_elts) : m_ptr (ptr), m_n_elts (n_elts) {}
+
+  char_span subspan (int offset, int n_elts)
+  {
+gcc_assert (offset >= 0);
+gcc_assert (offset < (int)m_n_elts);
+gcc_assert (n_elts >= 0);
+gcc_assert (offset + n_elts <= (int)m_n_elts);
+return char_span (m_ptr + offset, n_elts);
+  }
+
+  const char *m_ptr;
+  size_t m_n_elts;
+};
+
 /* A correction on a particular line.
This describes a plan for how to print one or more fixit_hint
instances that affected the line, potentially consolidating hints
@@ -1455,6 +1492,14 @@ struct correction
   void ensure_capacity (size_t len);
   void ensure_terminated ();
 
+  void overwrite (int dst_offset, const char_span &src_span)
+  {
+gcc_assert (dst_offset >= 0);
+gcc_assert (dst_offset + src_span.m_n_elts < m_alloc_sz);
+memcpy (m_text + dst_offset, src_span.m_ptr,
+   src_span.m_n_elts);
+  }
+
   /* If insert, then start: the column before which the text
  is to be inserted, and finish is offset by the length of
  the replacement.
@@ -1526,6 +1571,26 @@ line_corrections::~line_corrections ()
 delete c;
 }
 
+/* A struct wrapping a particular source line, allowing
+   run-time bounds-checking of accesses in a checked build.  */
+
+struct source_line
+{
+  source_line (const char *filename, int line);
+
+  char_span as_span () { return char_span (chars, width); }
+
+  const char *chars;
+  int width;
+};
+
+/* source_line's ctor.  */
+

Re: [DOC PATCH, i386] Fix PR 81294, _subborrow_u64 argument order inconsistent with intrinsic reference

2017-07-13 Thread Uros Bizjak
On Tue, Jul 4, 2017 at 10:51 PM, Jakub Jelinek  wrote:
> On Tue, Jul 04, 2017 at 10:41:26PM +0200, Uros Bizjak wrote:
>> Hello!
>>
>> Apparently, Intel changed operand order with the new intrinsic
>> reference release version. Attached patch updates gcc intrinsic
>> headers accordingly.
>>
>> 2017-07-04  Uros Bizjak  
>>
>> PR target/81294
>> * config/i386/adxintrin.h (_subborrow_u32): Swap _X and _Y
>> arguments in the call to __builtin_ia32_sbb_u32.
>> (_subborrow_u64): Swap _X and _Y arguments in the call to
>> __builtin_ia32_sbb_u64.
>>
>> testsuite/ChangeLog:
>>
>> 2017-07-04  Uros Bizjak  
>>
>> PR target/81249
>> * gcc.target/i386/adx_addcarryx32-2.c (adx_test): Swap
>> x and y arguments in the call to _subborrow_u32.
>> * gcc.target/i386/adx_addcarryx64-2.c (adx_test): Swap
>> x and y arguments in the call to _subborrow_u64.
>> * gcc.target/i386/pr81294-1.c: New test.
>> * gcc.target/i386/pr81294-2.c: Ditto.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>>
>> Committed to mainline SVN, willbe backported to release branches.
>
> When it goes into release branches, it needs changes.html changes
> for each of those to make users aware of that.
> 7.2 will have rc after mid July, so if you want it in 7.2, it should be
> committed before that.  5.5 + 5.x branch closing will be around that time
> too.

IMO, we should change only gcc-7 release branch, so developers can
test for gcc-7.2+ to determine which release swapped arguments of
mentioned intrinsics.

The attached doc patch adds commented-out generic gcc-7.2 entry plus
the above target-specific change.

OK for branch?

Uros.
Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.85
diff -u -r1.85 changes.html
--- htdocs/gcc-7/changes.html   10 May 2017 11:39:41 -  1.85
+++ htdocs/gcc-7/changes.html   13 Jul 2017 19:19:02 -
@@ -1230,5 +1230,25 @@
 complete (that is, it is possible that some PRs that have been fixed
 are not listed here).
 
+
+
+
 
 


Re: [PATCH] match.pd: reassociate multiplications with constants

2017-07-13 Thread Marc Glisse

On Thu, 13 Jul 2017, Alexander Monakov wrote:


This is a followup to https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01545.html

Recently due to a fix for PR 80800 GCC has lost the ability to reassociate
signed multiplications chains to go from 'X * CST1 * Y * CST2'
to 'X * Y * (CST1 * CST2)'.  The fix to that PR prevents extract_muldiv from
introducing '(X * (CST1 * CST2)) * Y', which was wrong because it may cause
intermediate signed overflow (unexpected if Y == 0).

As mentioned in that thread, we can reassociate constants to outermost operands
instead: this is safe because CST1 cannot be 0 or -1, since those are handled
by other match.pd rules.

(in fact it's possible to reassociate negates too, and go from '(-X) * Y * CST'
to '(X * Y) * (-CST)' if (-CST) doesn't overflow (again, we know that CST != 1),
but I'm not sure how valuable that is in practice, so opted not to do that yet)

The following patch reinstates folding by adding a new match.pd rule that moves
constants to outermost operands, where they can be merged by fold_binary
machinery if their product doesn't overflow.  Bootstrapped and regtested on
amd64, OK for trunk?

* match.pd ((X * CST) * Y): Reassociate to (X * Y) * CST.

diff --git a/gcc/match.pd b/gcc/match.pd
index 4c64b21..e49f879 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2139,6 +2139,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (mult @0 integer_minus_onep)
 (negate @0))

+/* Reassociate (X * CST) * Y to (X * Y) * CST.  This does not introduce
+   signed overflow: previous rules handle CST being -1 or 0, and for
+   the rest we know that if X * Y overflows, so does (X * CST) * Y.  */
+(if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_SATURATING (type))
+ (simplify
+  (mult:c (mult @0 INTEGER_CST@1) @2)
+  (if (TREE_CODE (@2) != INTEGER_CST)
+   (mult (mult @0 @2) @1))))
+
/* True if we can easily extract the real and imaginary parts of a complex
   number.  */
(match compositional_complex


Thanks. I guess it makes sense as a canonicalization (there are always 
cases where those make it harder to optimize, but hopefully fewer than 
those where they help).


I notice that we do not turn (X*10)*10 into X*100 in GIMPLE. X*big*big 
where abs(big*big)>abs(INT_MIN) can be optimized to 0, the only hard case 
is when the product of the constants is -INT_MIN, which we could turn into 
X<<31 for instance (sadly loses range info), or (-X)*INT_MIN or whatever. 
That would make a nice follow-up, if you are interested.


Relying on inner expressions being folded can be slightly dangerous, 
especially for generic IIRC. It seems easy enough to check that @1 is 
neither 0 nor -1 for safety.


Probably needs :s on the inner multiplication.

Unless the test on TYPE_OVERFLOW_SANITIZED etc is shared with adjacent 
transformations, I'd rather put it inside, with the other if, but that's a 
matter of taste.


One small testcase please? Or is there already one that is currently 
failing?


(I cannot ok patches, only comment)

--
Marc Glisse


Re: C PATCH to display types when printing a conversion warning (PR c/81233)

2017-07-13 Thread Martin Sebor

On 07/13/2017 08:18 AM, Marek Polacek wrote:

This patch improves diagnostics in the C FE by printing the types when reporting
a problem with a conversion.  E.g., instead of

   warning: assignment from incompatible pointer type

you'll now get

  warning: assignment to 'int *' from incompatible pointer type 'char *'

or instead of

  warning: initialization makes integer from pointer without a cast

this

   warning: initialization of 'int *' from 'int' makes pointer from integer 
without a cast

I've been wanting this for a long time and here it is.  Two snags: I had to
make pedwarn_init take '...', for which I had to introduce
emit_diagnostic_valist; you can't pass varargs from one vararg function to
another vararg function (and a macro with __VA_ARGS__ didn't work here).  Also,
PEDWARN_FOR_ASSIGNMENT didn't work with the addition of printing TYPE and
RHSTYPE so I just decided to unroll the macro instead of making it even more
ugly.  This patch is long, but that's mainly because of the testsuite fallout.

If you have better ideas about the wording, let me know.


It looks pretty good as is.  My only wording suggestion is to
consider simply mentioning conversion in the text of the warnings:

  warning: conversion to T* from an incompatible type U*

I'm not sure that being explicit about the context where the bad
conversion takes place (initialization vs assignment vs returning
a value) is terribly helpful.  That would not only simplify the
code and make all the messages consistent, but it would also make
it possible to get rid of the note when passing arguments.



There are still more warnings to improve but I think better to do this
incrementally rather than a single humongous patch.


That makes sense.  I was going to mention that it would be nice
to also improve:

  warning: comparison of distinct pointer types lacks a cast

If you take the conversion suggestion I think this warning would
need to be phrased in terms of "conversion between T* and U*"
rather than "conversion from T* to U*".  (A similar change could
be made to the error message printed when incompatible pointers
are subtracted from one another.)

Martin



Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-13 Thread Jakub Jelinek
On Thu, Jul 13, 2017 at 11:28:17AM -0600, Jeff Law wrote:
> On 07/12/2017 04:44 PM, Segher Boessenkool wrote:
> > On Tue, Jul 11, 2017 at 03:19:36PM -0600, Jeff Law wrote:
> >> Examples of implicit probes include
> > 
> >>   2. ABI mandates that *sp always contain a backchain pointer (ppc)
> > 
> > In the ELFv2 ABI a backchain is not required.  GCC still always has
> > one afaik.  I'll find out more.
> Please do.  I was under the impression it was mandated by the earlier
> ABIs as well.  If it isn't, then I don't think we can depend on it for
> the older ABIs.
> 
> That wouldn't be the end of the world -- it's pretty clear that ppc64le
> is the future and we'd get good code there.  I wouldn't lose much sleep
> if ppc32 and ppc64 big endian had a less efficient probing scheme.

??  Segher said in ELFv2 ABI it is not required, so that would mean
it does affect ppc64le and does not affect ppc32 or ppc64.
So, we wouldn't get good code for ppc64le and would get one for ppc32 and
ppc64.

Jakub


Re: [rs6000] Avoid rotates of floating-point modes

2017-07-13 Thread Segher Boessenkool
Hi Richard,

On Wed, Jul 12, 2017 at 05:33:42PM +0100, Richard Sandiford wrote:
> The little-endian VSX code uses rotates to swap the two 64-bit halves of
> 128-bit scalar modes.  This is fine for TImode and V1TImode, but it
> isn't really valid to use RTL rotates on floating-point modes like
> KFmode and TFmode, and doing that triggered an assert added by the
> SVE series.  This patch uses bit-casts to V1TImode instead.
> 
> Tested on powerpc64le-linux-gnu.  OK to install?


> +void
> +rs6000_emit_le_vsx_permute (rtx dest, rtx source, machine_mode mode)
>  {
>/* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point, and
>   128-bit integers if they are allowed in VSX registers.  */
> -  if (FLOAT128_VECTOR_P (mode) || mode == TImode || mode == V1TImode)
> -return gen_rtx_ROTATE (mode, source, GEN_INT (64));
> +  if (FLOAT128_VECTOR_P (mode))
> +{
> +  dest = gen_lowpart (V1TImode, dest);
> +  source = gen_lowpart (V1TImode, source);
> +  mode = V1TImode;
> +}

Add an empty line here?  And maybe a comment.

> +  if (mode == TImode || mode == V1TImode)
> +emit_insn (gen_rtx_SET (dest, gen_rtx_ROTATE (mode, source,
> +   GEN_INT (64))));
>else
>  {
>rtx par = gen_rtx_PARALLEL (VOIDmode, rs6000_const_vec (mode));
> -  return gen_rtx_VEC_SELECT (mode, source, par);
> +  emit_insn (gen_rtx_SET (dest, gen_rtx_VEC_SELECT (mode, source, par)));
>  }
>  }

> --- gcc/config/rs6000/vsx.md  2017-06-30 12:50:38.889632907 +0100
> +++ gcc/config/rs6000/vsx.md  2017-07-12 16:30:38.734631598 +0100
> @@ -37,6 +37,10 @@ (define_mode_iterator VSX_LE_128 [(KF
> (TI   "TARGET_VSX_TIMODE")
> V1TI])
>  
> +;; Same, but with just the integer modes.
> +(define_mode_iterator VSX_LE_128I [(TI   "TARGET_VSX_TIMODE")
> +V1TI])

I don't like that name much.  The difference between VSX_LE_128 and
VSX_LE_128I is easy to overlook (and what _is_ the difference?  "I"
means "integer" I guess?).  The "LE" in the name has no real meaning
(it is used for LE, sure, but that doesn't matter for the iterator).
Maybe just VSX_TI?  Or is that too short.

Other than that, looks fine.  Thank you for the patch!

Does this need backports?


Segher


Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-13 Thread Jeff Law
On 07/13/2017 11:32 AM, Jakub Jelinek wrote:
> On Thu, Jul 13, 2017 at 11:28:17AM -0600, Jeff Law wrote:
>> On 07/12/2017 04:44 PM, Segher Boessenkool wrote:
>>> On Tue, Jul 11, 2017 at 03:19:36PM -0600, Jeff Law wrote:
 Examples of implicit probes include
>>>
   2. ABI mandates that *sp always contain a backchain pointer (ppc)
>>>
>>> In the ELFv2 ABI a backchain is not required.  GCC still always has
>>> one afaik.  I'll find out more.
>> Please do.  I was under the impression it was mandated by the earlier
>> ABIs as well.  If it isn't, then I don't think we can depend on it for
>> the older ABIs.
>>
>> That wouldn't be the end of the world -- it's pretty clear that ppc64le
>> is the future and we'd get good code there.  I wouldn't lose much sleep
>> if ppc32 and ppc64 big endian had a less efficient probing scheme.
> 
> ??  Segher said in ELFv2 ABI it is not required, so that would mean
> it does affect ppc64le and does not affect ppc32 or ppc64.
> So, we wouldn't get good code for ppc64le and would get one for ppc32 and
> ppc64.
Oops.  Mis-read.  Got it totally backwards.

Not good.  Waiting on Segher for clarification, but will start thinking
about better options than punting :-)



jeff


Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-13 Thread Jeff Law
On 07/12/2017 04:44 PM, Segher Boessenkool wrote:
> On Tue, Jul 11, 2017 at 03:19:36PM -0600, Jeff Law wrote:
>> Examples of implicit probes include
> 
>>   2. ABI mandates that *sp always contain a backchain pointer (ppc)
> 
> In the ELFv2 ABI a backchain is not required.  GCC still always has
> one afaik.  I'll find out more.
Please do.  I was under the impression it was mandated by the earlier
ABIs as well.  If it isn't, then I don't think we can depend on it for
the older ABIs.

That wouldn't be the end of the world -- it's pretty clear that ppc64le
is the future and we'd get good code there.  I wouldn't lose much sleep
if ppc32 and ppc64 big endian had a less efficient probing scheme.

We'd set up a last_probe_offset tracker like we do for aarch & s390.
For ppc64le its initial state would be zero.  For ppc32 and ppc64 big
endian the initial state would be PROBE_OFFSET - STACK_BOUNDARY /
UNITS_PER_WORD.  Depending on cost/benefit analysis we could try to
optimize those ports, but given overall directions it just might not be
worth the effort.

> 
>> To get a sense of overhead, just 1.5% of routines in glibc need probing
>> in their prologues (x86) in the testing I performed.  IIRC each and
>> every one of those routines needed just 1-4 inlined probes.
>>
>> Significantly more functions need alloca space probed (IIRC ~5%), but
>> given the amazingly inefficient alloca code, I can't believe anyone will
>> ever notice the probing overhead.
> 
> That is quite a lot of functions IMO, but it's just one store per page
> (or per alloca), and supposedly you'll store to that stack anyway (or
> it is stupid slow code in the first place).  Did you measure any real
> timings?
Haven't measured any real timings.  We hit so few functions with the
prologue probes it's hard to see how they could end up being measurable.

The code we generate for alloca was so awful it's hard to see how
hitting each page once would matter either.  *However* I was looking at
x86 in this case and due to potential stack realignments x86's alloca
code might be notably worse than others for constant sizes.

There are further improvements that could be made as well.  It ought to
be possible to write an optimizer pass that uses some of the ideas from
DSE and SLSR to identify explicit probes that are made redundant by
nearby implicit probes -- this would seem most useful for the dynamic space.

The problem is we'd want to do that in gimple, but probing of the
dynamic space happens at the gimple/rtl border.  So we'd probably want
to make probing happen earlier to expose stuff at the gimple level.


Jeff


[PATCH] update edge profile info in nvptx.c

2017-07-13 Thread Cesar Philippidis
The recent basic block profiling changes broke a couple of libgomp
OpenACC execution tests involving reductions with nvptx offloading. For
gang and worker reductions, the nvptx BE updates the original reduction
variable using a lock-free atomic algorithm. This lock-free algorithm
utilizes a polling loop to check the state of the variable being
updated. This loop introduced a new basic block edge, but it wasn't
assigned a branch probability. Because of the highly threaded nature of
CUDA accelerators, I set the branch probability for that edge as even.

Similarly, for nvptx vector reductions, when it comes time to initialize
the reduction variable, the nvptx BE constructs a branch so that only
vector lanes 1 to vector_length-1 are initialized to the default value
for a given reduction type, while vector lane 0 retains the original
value of the reduction variable. For similar reasons to the gang and
worker reductions, I set the probability of the new edge introduced for
the vector reduction to even.

Is this OK for trunk?

Cesar
2017-07-13  Cesar Philippidis  

	gcc
	* config/nvptx/nvptx.c (nvptx_lockless_update): Update edge
	profiling information.
	(nvptx_lockfull_update): Likewise.
	(nvptx_goacc_reduction_init): Likewise.


diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index c8847a5dbba..3a24bd375ca 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4985,6 +4985,7 @@ nvptx_lockless_update (location_t loc, gimple_stmt_iterator *gsi,
 
   post_edge->flags ^= EDGE_TRUE_VALUE | EDGE_FALLTHRU;
   edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_FALSE_VALUE);
+  loop_edge->probability = profile_probability::even ();
   set_immediate_dominator (CDI_DOMINATORS, loop_bb, pre_bb);
   set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb);
 
@@ -5057,7 +5058,8 @@ nvptx_lockfull_update (location_t loc, gimple_stmt_iterator *gsi,
   
   /* Create the lock loop ... */
   locked_edge->flags ^= EDGE_TRUE_VALUE | EDGE_FALLTHRU;
-  make_edge (lock_bb, lock_bb, EDGE_FALSE_VALUE);
+  edge e = make_edge (lock_bb, lock_bb, EDGE_FALSE_VALUE);
+  e->probability = profile_probability::even ();
   set_immediate_dominator (CDI_DOMINATORS, lock_bb, entry_bb);
   set_immediate_dominator (CDI_DOMINATORS, update_bb, lock_bb);
 
@@ -5211,6 +5213,7 @@ nvptx_goacc_reduction_init (gcall *call)
   
   /* Create false edge from call_bb to dst_bb.  */
   edge nop_edge = make_edge (call_bb, dst_bb, EDGE_FALSE_VALUE);
+  nop_edge->probability = profile_probability::even ();
 
   /* Create phi node in dst block.  */
   gphi *phi = create_phi_node (lhs, dst_bb);


[PATCH] match.pd: reassociate multiplications with constants

2017-07-13 Thread Alexander Monakov
Hi,

This is a followup to https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01545.html

Recently due to a fix for PR 80800 GCC has lost the ability to reassociate
signed multiplications chains to go from 'X * CST1 * Y * CST2'
to 'X * Y * (CST1 * CST2)'.  The fix to that PR prevents extract_muldiv from
introducing '(X * (CST1 * CST2)) * Y', which was wrong because it may cause
intermediate signed overflow (unexpected if Y == 0).

As mentioned in that thread, we can reassociate constants to outermost operands
instead: this is safe because CST1 cannot be 0 or -1, since those are handled
by other match.pd rules.

(in fact it's possible to reassociate negates too, and go from '(-X) * Y * CST'
to '(X * Y) * (-CST)' if (-CST) doesn't overflow (again, we know that CST != 1),
but I'm not sure how valuable that is in practice, so opted not to do that yet)

The following patch reinstates folding by adding a new match.pd rule that moves
constants to outermost operands, where they can be merged by fold_binary
machinery if their product doesn't overflow.  Bootstrapped and regtested on
amd64, OK for trunk?

* match.pd ((X * CST) * Y): Reassociate to (X * Y) * CST.

diff --git a/gcc/match.pd b/gcc/match.pd
index 4c64b21..e49f879 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2139,6 +2139,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (mult @0 integer_minus_onep)
  (negate @0))

+/* Reassociate (X * CST) * Y to (X * Y) * CST.  This does not introduce
+   signed overflow: previous rules handle CST being -1 or 0, and for
+   the rest we know that if X * Y overflows, so does (X * CST) * Y.  */
+(if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_SATURATING (type))
+ (simplify
+  (mult:c (mult @0 INTEGER_CST@1) @2)
+  (if (TREE_CODE (@2) != INTEGER_CST)
+   (mult (mult @0 @2) @1))))
+
 /* True if we can easily extract the real and imaginary parts of a complex
number.  */
 (match compositional_complex
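A minimal example of what the rule enables (a sketch, not a testcase from
the patch):

int
f (int x, int y)
{
  /* Parsed as ((x * 10) * y) * 10; the new rule rewrites the inner
     (x * 10) * y to (x * y) * 10, after which fold merges the two
     constants, giving (x * y) * 100.  */
  return x * 10 * y * 10;
}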



c-family PATCH to improve -Wsign-compare (PR c/81417)

2017-07-13 Thread Marek Polacek
A tiny patch for -Wsign-compare so that it also prints the types when
reporting a warning.

Bootstrapped/regtested on x86_64-linux and ppc64le-redhat-linux, ok for trunk?

2017-07-13  Marek Polacek  

PR c/81417
* c-warn.c (warn_for_sign_compare): Print the types.

* c-c++-common/Wsign-compare-1.c: New test.

diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
index b9378c2dbe2..c903c080a33 100644
--- gcc/c-family/c-warn.c
+++ gcc/c-family/c-warn.c
@@ -1891,9 +1891,10 @@ warn_for_sign_compare (location_t location,
   c_common_signed_type (base_type)))
/* OK */;
   else
-   warning_at (location,
-   OPT_Wsign_compare,
-   "comparison between signed and unsigned integer 
expressions");
+   warning_at (location, OPT_Wsign_compare,
+   "comparison between signed and unsigned integer "
+   "expressions: %qT and %qT", TREE_TYPE (sop),
+   TREE_TYPE (uop));
 }
 
   /* Warn if two unsigned values are being compared in a size larger
diff --git gcc/testsuite/c-c++-common/Wsign-compare-1.c gcc/testsuite/c-c++-common/Wsign-compare-1.c
index e69de29bb2d..e53f87aa9a3 100644
--- gcc/testsuite/c-c++-common/Wsign-compare-1.c
+++ gcc/testsuite/c-c++-common/Wsign-compare-1.c
@@ -0,0 +1,27 @@
+/* PR c/81417 */
+/* { dg-do compile } */
+/* { dg-options "-Wsign-compare" } */
+
+int
+fn1 (signed int a, unsigned int b)
+{
+  return a < b; /* { dg-warning "comparison between signed and unsigned integer expressions: 'int' and 'unsigned int'" } */
+}
+
+int
+fn2 (signed int a, unsigned int b)
+{
+  return b < a; /* { dg-warning "comparison between signed and unsigned integer expressions: 'int' and 'unsigned int'" } */
+}
+
+int
+fn3 (signed long int a, unsigned long int b)
+{
+  return b < a; /* { dg-warning "comparison between signed and unsigned integer expressions: 'long int' and 'long unsigned int'" } */
+}
+
+int
+fn4 (signed short int a, unsigned int b)
+{
+  return b < a; /* { dg-warning "comparison between signed and unsigned integer expressions: 'short int' and 'unsigned int'" } */
+}

Marek


Re: [Patch][Aarch64] Refactor comments in aarch64_print_operand

2017-07-13 Thread James Greenhalgh
On Thu, Jul 13, 2017 at 04:35:55PM +0100, Jackson Woodruff wrote:
> Hi James,
> 
> I've addressed the issues discussed below.
> 
> OK for trunk?

I have one last comment; otherwise, this looks good:

> +/* Print operand X to file F in a target specific manner according to CODE.
> +   The acceptable formatting commands given by CODE are:
> + 'c':  An integer or symbol address without a preceding # sign.
> + 'e':  Print the sign/zero-extend size as a character 8->b,
> +   16->h, 32->w.
> + 'p':  Prints N such that 2^N == X (X must be power of 2 and
> +   const int).
> + 'P':  Print the number of non-zero bits in X (a const_int).
> + 'H':  Print the higher numbered register of a pair (TImode)
> +   of regs.
> + 'm':  Print a condition (eq, ne, etc).
> + 'M':  Same as 'm', but invert condition.
> + 'b/h/s/d/q':  Print a scalar FP/SIMD register name.
> + 'S/T/U/V':  Print the first FP/SIMD register name in a list
> +   (No difference between any of these options).

There is a slight difference between these options - You'd use them in a
in a pattern with a large integer mode like LD3 on a CImode value to print
the register list you want to load. For example:

  LD3 {v0.4s - v2.4s}, [x0]

The register number you'll get by inspecting REGNO (x) will give you the
start of the register list - but we need to get the right number for the
end of the register list too. To find that offset, we take
(CODE - 'S'). It should be clear why for S/T/U/V this gives 0/1/2/3.

So this comment should read:

  Print a FP/SIMD register name for a register list.  The register
  printed is the FP/SIMD register name of X + 0/1/2/3 for S/T/U/V.

Or similar.
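
In code, the offset computation described above amounts to (a sketch with a
hypothetical helper name):

/* For operand codes 'S'..'V', the printed register is the list's base
   register plus the code's distance from 'S':
   'S' -> +0, 'T' -> +1, 'U' -> +2, 'V' -> +3.  */
static unsigned
list_regno (unsigned base_regno, char code)
{
  return base_regno + (code - 'S');
}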

Thanks,
James




Re: [PATCH] Remove Pascal language in source code.

2017-07-13 Thread Pedro Alves
On 07/13/2017 03:59 PM, Jeff Law wrote:

> The only concern I'd have here is the bits in dbxout.[ch] might
> effectively be the best documentation of the dbxout format that exists.
> Thus, dropping something like N_SO_PASCAL loses that historical
> documentation.

FYI, there's a texinfo document in the GDB repo describing the
stabs format, and it documents N_SO_PASCAL:

  
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/doc/stabs.texinfo;h=a7ea808a41b290b7dfc4f44801a540a834ee04db;hb=HEAD#l440

A pre-generated html version is live here: 

  https://www.sourceware.org/gdb/onlinedocs/stabs.html
  https://www.sourceware.org/gdb/onlinedocs/stabs.html#Source-Files

Thanks,
Pedro Alves



Re: [Patch][Aarch64] Refactor comments in aarch64_print_operand

2017-07-13 Thread Jackson Woodruff

Hi James,

I've addressed the issues discussed below.

OK for trunk?

Jackson

On 07/13/2017 10:03 AM, James Greenhalgh wrote:

On Tue, Jul 11, 2017 at 05:29:11PM +0100, Jackson Woodruff wrote:

Hi all,

This patch refactors comments in config/aarch64/aarch64.c
aarch64_print_operand
to provide a table of aarch64 specific formating options.

I've tested the patch with a bootstrap and testsuite run on aarch64.

OK for trunk?

Hi Jackson,

Thanks for the patch, I have a few comments, but overall this looks
like a nice improvement.


Changelog:

gcc/

2017-07-04  Jackson Woodruff  

  * config/aarch64/aarch64.c (aarch64_print_operand):
Move comments to top of function.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 037339d431d80c49699446e548d6b2707883b6a8..91bf4b3e9792e4ba01232f099ed844bdf23392fa 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5053,12 +5053,39 @@ static const int aarch64_nzcv_codes[] =
0   /* NV, Any.  */
  };
  
+/* aarch64 specific string formatting commands:

s/aarch64/AArch64/
s/string/operand/

Most functions in GCC should have a comment describing the arguments they
take as well as what they do, so I suppose I'd prefer to see something like:

/* Print operand X to file F in a target specific manner according to CODE.
The acceptable formatting commands given by CODE are:
[...]


+ 'c':  An integer or symbol address without a preceding # sign.
+ 'e':  Print the sign/zero-extend size as a character 8->b,
+   16->h, 32->w.
+ 'p':  Prints N such that 2^N == X (X must be power of 2 and
+   const int).
+ 'P':  Print the number of non-zero bits in X (a const_int).
+ 'H':  Print the higher numbered register of a pair (TImode)
+   of regs.
+ 'm':  Print a condition (eq, ne, etc).
+ 'M':  Same as 'm', but invert condition.
+ 'b/q/h/s/d':  Print a scalar FP/SIMD register name.

Put these in size order - b/h/s/d/q


+ 'S/T/U/V':  Print the first FP/SIMD register name in a list.

It might be useful to expand in this comment what the difference is between
S T U and V.


+ 'R':  Print a scalar FP/SIMD register name + 1.
+ 'X':  Print bottom 16 bits of integer constant in hex.
+ 'w/x':  Print a general register name or the zero register
+   (32-bit or 64-bit).
+ '0':  Print a normal operand, if it's a general register,
+   then we assume DImode.
+ 'k':  Print nzcv.

This one doesn't make sense to me and could do with some clarification. Maybe
Print the  field for CCMP.

Thanks,
James


+ 'A':  Output address constant representing the first
+   argument of X, specifying a relocation offset
+   if appropriate.
+ 'L':  Output constant address specified by X
+   with a relocation offset if appropriate.
+ 'G':  Prints address of X, specifying a PC relative
+   relocation mode if appropriate.  */
+


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 037339d431d80c49699446e548d6b2707883b6a8..989429a203aaeb72980b89ecc43adb736019afe6 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5053,12 +5053,41 @@ static const int aarch64_nzcv_codes[] =
   0/* NV, Any.  */
 };
 
+/* Print operand X to file F in a target specific manner according to CODE.
+   The acceptable formatting commands given by CODE are:
+ 'c':  An integer or symbol address without a preceding # sign.
+ 'e':  Print the sign/zero-extend size as a character 8->b,
+   16->h, 32->w.
+ 'p':  Prints N such that 2^N == X (X must be power of 2 and
+   const int).
+ 'P':  Print the number of non-zero bits in X (a const_int).
+ 'H':  Print the higher numbered register of a pair (TImode)
+   of regs.
+ 'm':  Print a condition (eq, ne, etc).
+ 'M':  Same as 'm', but invert condition.
+ 'b/h/s/d/q':  Print a scalar FP/SIMD register name.
+ 'S/T/U/V':Print the first FP/SIMD register name in a list
+   (No difference between any of these options).
+ 'R':  Print a scalar FP/SIMD register name + 1.
+ 'X':  Print bottom 16 bits of integer constant in hex.
+ 'w/x':Print a general register name or the zero register
+   (32-bit or 64-bit).
+ '0':  Print a normal operand, if it's a general register,
+   then we assume DImode.
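
As context for how these codes are consumed: an output template selects a
modifier per operand, and aarch64_print_operand receives it as CODE.  A
hypothetical pattern (illustrative only, not part of this patch) might look
like:

  (define_insn "*example_addsi3"
    [(set (match_operand:SI 0 "register_operand" "=r")
          (plus:SI (match_operand:SI 1 "register_operand" "r")
                   (match_operand:SI 2 "register_operand" "r")))]
    ""
    ;; '%w0' reaches aarch64_print_operand with CODE == 'w', printing the
    ;; 32-bit register name of operand 0 (e.g. "w3"), or "wzr" for the
    ;; zero register.
    "add\t%w0, %w1, %w2"
  )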

Re: [PATCH] Remove Pascal language in source code.

2017-07-13 Thread Jeff Law
On 07/11/2017 08:30 AM, Martin Liška wrote:
> Hi.
> 
> Similar for GNU Pascal language.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> 
> Martin
> 
> gcc/ChangeLog:
> 
> 2017-07-11  Martin Liska  
> 
> * dbxout.c (get_lang_number): Do not handle GNU Pascal.
> * dbxout.h (extern void dbxout_stab_value_internal_label_diff):
> Remove N_SO_PASCAL.
> * dwarf2out.c (lower_bound_default): Do not handle
> DW_LANG_Pascal83.
> (gen_compile_unit_die): Likewise.
> * gcc.c: Remove default extension binding for GNU Pascal.
> * stmt.c: Remove Pascal language from a comment.
> * xcoffout.c: Likewise.
The only concern I'd have here is the bits in dbxout.[ch] might
effectively be the best documentation of the dbxout format that exists.
Thus, dropping something like N_SO_PASCAL loses that historical
documentation.

Even with that caveat, I think this should go into the trunk.  In the
unlikely event we really need that historical record, we have svn/git.
jeff


C PATCH to display types when printing a conversion warning (PR c/81233)

2017-07-13 Thread Marek Polacek
This patch improves diagnostics in the C FE by printing the types when
reporting a problem with a conversion.  E.g., instead of

   warning: assignment from incompatible pointer type

you'll now get

  warning: assignment to 'int *' from incompatible pointer type 'char *'

or instead of

  warning: initialization makes integer from pointer without a cast

this

   warning: initialization of 'int *' from 'int' makes pointer from integer 
without a cast

I've been wanting this for a long time and here it is.  Two snags: I had to
make pedwarn_init take '...', for which I had to introduce
emit_diagnostic_valist; you can't pass varargs from one vararg function to
another vararg function (and a macro with __VA_ARGS__ didn't work here).  Also,
PEDWARN_FOR_ASSIGNMENT didn't work with the addition of printing TYPE and
RHSTYPE, so I just decided to unroll the macro instead of making it even more
ugly.  This patch is long, but that's mainly because of the testsuite fallout.
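
A minimal sketch of the forwarding pattern (the wrapper below is
illustrative, not the patch itself; emit_diagnostic_valist is the new
entry point this patch introduces):

  /* A variadic function cannot hand its own '...' to another variadic
     function, so it builds a va_list and calls the *_valist variant.  */
  static bool ATTRIBUTE_GCC_DIAG (2,0)
  pedwarn_like (int opt, const char *gmsgid, ...)
  {
    va_list ap;
    va_start (ap, gmsgid);
    bool warned = emit_diagnostic_valist (DK_PEDWARN, input_location, opt,
					  gmsgid, ap);
    va_end (ap);
    return warned;
  }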

If you have better ideas about the wording, let me know.

There are still more warnings to improve but I think better to do this
incrementally rather than a single humongous patch.

Bootstrapped/regtested on x86_64-linux and powerpc64le-unknown-linux-gnu,
ok for trunk?

2017-07-13  Marek Polacek  

PR c/81233
* c-typeck.c (pedwarn_init): Make the function take a variable
argument list.
Call emit_diagnostic_valist instead of pedwarn.
(convert_for_assignment): Unroll the PEDWARN_FOR_ASSIGNMENT macro.
Print the relevant types in diagnostics.

* diagnostic-core.h (emit_diagnostic_valist): Add declaration.
* diagnostic.c (emit_diagnostic): Add a comment.
(emit_diagnostic_valist): New function.

* gcc.dg/diagnostic-types-1.c: New test.
* gcc.dg/assign-warn-1.c: Update warning messages.
* gcc.dg/assign-warn-2.c: Likewise.
* gcc.dg/c90-const-expr-5.c: Likewise.
* gcc.dg/c99-const-expr-5.c: Likewise.
* gcc.dg/conv-2.c: Likewise.
* gcc.dg/init-bad-7.c: Likewise.
* gcc.dg/overflow-warn-1.c: Likewise.
* gcc.dg/overflow-warn-2.c: Likewise.
* gcc.dg/overflow-warn-3.c: Likewise.
* gcc.dg/overflow-warn-4.c: Likewise.
* gcc.dg/pointer-array-atomic.c: Likewise.
* gcc.dg/pr26865.c: Likewise.
* gcc.dg/pr61162-2.c: Likewise.
* gcc.dg/pr61162.c: Likewise.
* gcc.dg/pr67730-2.c: Likewise.
* gcc.dg/pr69156.c: Likewise.
* gcc.dg/pr70174.c: Likewise.
* objc.dg/proto-lossage-4.m: Likewise.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 4d067e96dd3..742c047f7d1 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -6055,20 +6055,19 @@ error_init (location_t loc, const char *gmsgid)
it is unconditionally given.  GMSGID identifies the message.  The
component name is taken from the spelling stack.  */
 
-static void
-pedwarn_init (location_t loc, int opt, const char *gmsgid)
+static void ATTRIBUTE_GCC_DIAG (3,0)
+pedwarn_init (location_t loc, int opt, const char *gmsgid, ...)
 {
-  char *ofwhat;
-  bool warned;
-
   /* Use the location where a macro was expanded rather than where
  it was defined to make sure macros defined in system headers
  but used incorrectly elsewhere are diagnosed.  */
   source_location exploc = expansion_point_location_if_in_system_header (loc);
 
-  /* The gmsgid may be a format string with %< and %>. */
-  warned = pedwarn (exploc, opt, gmsgid);
-  ofwhat = print_spelling ((char *) alloca (spelling_length () + 1));
+  va_list ap;
+  va_start (ap, gmsgid);
+  bool warned = emit_diagnostic_valist (DK_PEDWARN, exploc, opt, gmsgid, ap);
+  va_end (ap);
+  char *ofwhat = print_spelling ((char *) alloca (spelling_length () + 1));
   if (*ofwhat && warned)
 inform (exploc, "(near initialization for %qs)", ofwhat);
 }
@@ -6301,17 +6300,33 @@ convert_for_assignment (location_t location, location_t 
expr_loc, tree type,
   if (checktype != error_mark_node
  && TREE_CODE (type) == ENUMERAL_TYPE
  && TYPE_MAIN_VARIANT (checktype) != TYPE_MAIN_VARIANT (type))
-   {
- PEDWARN_FOR_ASSIGNMENT (location, expr_loc, OPT_Wc___compat,
- G_("enum conversion when passing argument "
-"%d of %qE is invalid in C++"),
- G_("enum conversion in assignment is "
-"invalid in C++"),
- G_("enum conversion in initialization is "
-"invalid in C++"),
- G_("enum conversion in return is "
-"invalid in C++"));
-   }
+   switch (errtype)
+ {
+ case ic_argpass:
+   if (pedwarn (expr_loc, OPT_Wc___compat, "enum conversion when "
+"passing argument %d of %qE is invalid in C++",
+   

Re: [PATCH] Move static chain and non-local goto init after NOTE_INSN_FUNCTION_BEG (PR sanitize/81186).

2017-07-13 Thread Martin Liška
On 06/30/2017 04:03 PM, Michael Matz wrote:
> So you need to find some other solution of setting up the stack for ASAN.  
> And it'd be best if that solution doesn't require inserting code inside 
> the above sequence of parameter setup instructions, and you certainly 
> can't call any functions inside that sequence.  It might mean that you 
> can't track the static chain place or the nonlocal goto save area.  You 
> also don't track the parameter stack slots, right?

Hi.

Hopefully the following patch will fix that. I returned to the first version
and save/restore the static chain register before/after __asan_stack_malloc.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Thoughts?
Martin
From b285e7cb1d7f3e35981dec951121db58ce152b3b Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 13 Jul 2017 13:37:47 +0200
Subject: [PATCH] Move static chain and non-local goto init after
 NOTE_INSN_FUNCTION_BEG

gcc/ChangeLog:

2017-06-27  Martin Liska  

PR sanitizer/81186
	* function.c (expand_function_start): Move static chain and non-local
	goto init after NOTE_INSN_FUNCTION_BEG.
	* asan.c (asan_emit_stack_protection): Preserve static chain
	register if we call __asan_stack_malloc_N.

gcc/testsuite/ChangeLog:

2017-06-27  Martin Liska  

PR sanitizer/81186
	* gcc.dg/asan/pr81186.c: New test.
---
 gcc/asan.c  | 12 
 gcc/function.c  | 18 +-
 gcc/testsuite/gcc.dg/asan/pr81186.c | 18 ++
 3 files changed, 39 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/asan/pr81186.c

diff --git a/gcc/asan.c b/gcc/asan.c
index 89c2731e8cd..9cc1d21c1fb 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1340,6 +1340,16 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   emit_cmp_and_jump_insns (ret, const0_rtx, EQ, NULL_RTX,
 			   VOIDmode, 0, lab,
 			   profile_probability::very_likely ());
+  /* Preserve static chain register in order to not have it clobbered in
+	 __asan_stack_malloc_N function.  */
+  rtx chain = targetm.calls.static_chain (current_function_decl, true);
+  rtx saved_chain;
+  if (chain)
+	{
+	  saved_chain = gen_reg_rtx (Pmode);
+	  emit_move_insn (saved_chain, chain);
+	}
+
   snprintf (buf, sizeof buf, "__asan_stack_malloc_%d",
 		use_after_return_class);
   ret = init_one_libfunc (buf);
@@ -1347,6 +1357,8 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
  GEN_INT (asan_frame_size
 	  + base_align_bias),
  TYPE_MODE (pointer_sized_int_node));
+  if (chain)
+	emit_move_insn (chain, saved_chain);
   /* __asan_stack_malloc_[n] returns a pointer to fake stack if succeeded
 	 and NULL otherwise.  Check RET value is NULL here and jump over the
 	 BASE reassignment in this case.  Otherwise, reassign BASE to RET.  */
diff --git a/gcc/function.c b/gcc/function.c
index f625489205b..5e8a56099a5 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5220,6 +5220,14 @@ expand_function_start (tree subr)
  In some cases this requires emitting insns.  */
   assign_parms (subr);
 
+  /* The following was moved from init_function_start.
+ The move is supposed to make sdb output more accurate.  */
+  /* Indicate the beginning of the function body,
+ as opposed to parm setup.  */
+  rtx_note *b = emit_note (NOTE_INSN_FUNCTION_BEG);
+
+  gcc_assert (NOTE_P (get_last_insn ()));
+
   /* If function gets a static chain arg, store it.  */
   if (cfun->static_chain_decl)
 {
@@ -5284,15 +5292,7 @@ expand_function_start (tree subr)
   update_nonlocal_goto_save_area ();
 }
 
-  /* The following was moved from init_function_start.
- The move is supposed to make sdb output more accurate.  */
-  /* Indicate the beginning of the function body,
- as opposed to parm setup.  */
-  emit_note (NOTE_INSN_FUNCTION_BEG);
-
-  gcc_assert (NOTE_P (get_last_insn ()));
-
-  parm_birth_insn = get_last_insn ();
+  parm_birth_insn = b;
 
   if (crtl->profile)
 {
diff --git a/gcc/testsuite/gcc.dg/asan/pr81186.c b/gcc/testsuite/gcc.dg/asan/pr81186.c
new file mode 100644
index 000..7f0f672ca40
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/pr81186.c
@@ -0,0 +1,18 @@
+/* PR sanitizer/81186 */
+/* { dg-do run } */
+
+int
+main ()
+{
+  __label__ l;
+  void f ()
+  {
+int a[123];
+
+goto l;
+  }
+
+  f ();
+l:
+  return 0;
+}
-- 
2.13.2



[RFC][PATCH] Do refactoring of attribute functions and move them to attribs.[hc].

2017-07-13 Thread Martin Liška
Hi.

It's a request for comments: I mechanically moved attribute-related functions
to attribs.[hc].

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Thoughts?
Martin
From f4322151ebcd8cf71fd677317ef0a74374b0db5e Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 12 Jul 2017 13:39:54 +0200
Subject: [PATCH] Do refactoring of attribute functions and move them to
 attribs.[hc].

---
 gcc/asan.c|   2 +
 gcc/attribs.c | 633 ++
 gcc/attribs.h | 113 
 gcc/bb-reorder.c  |   2 +
 gcc/builtins.c|   2 +
 gcc/c-family/c-ada-spec.c |   2 +
 gcc/c-family/c-ubsan.c|   4 +-
 gcc/c-family/c-warn.c |   2 +
 gcc/c/c-convert.c |   2 +
 gcc/c/c-typeck.c  |   2 +
 gcc/calls.c   |   2 +
 gcc/cfgexpand.c   |   2 +
 gcc/cgraph.c  |   2 +
 gcc/cgraphunit.c  |   2 +
 gcc/convert.c |   2 +
 gcc/cp/call.c |   2 +
 gcc/cp/cp-gimplify.c  |   2 +
 gcc/cp/cp-ubsan.c |   2 +
 gcc/cp/cvt.c  |   2 +
 gcc/cp/init.c |   2 +
 gcc/cp/search.c   |   2 +
 gcc/cp/semantics.c|   2 +
 gcc/cp/typeck.c   |   2 +
 gcc/dwarf2out.c   |   2 +
 gcc/final.c   |   2 +
 gcc/fold-const.c  |   2 +
 gcc/fortran/trans-types.c |   1 +
 gcc/function.c|   2 +
 gcc/gimple-expr.c |   2 +
 gcc/gimple-fold.c |   2 +
 gcc/gimple-pretty-print.c |   2 +
 gcc/gimple.c  |   2 +
 gcc/gimplify.c|   2 +
 gcc/hsa-common.c  |   2 +
 gcc/hsa-gen.c |   2 +
 gcc/internal-fn.c |   2 +
 gcc/ipa-chkp.c|   2 +
 gcc/ipa-cp.c  |   2 +
 gcc/ipa-devirt.c  |   2 +
 gcc/ipa-fnsummary.c   |   2 +
 gcc/ipa-inline.c  |   2 +
 gcc/ipa-visibility.c  |   2 +
 gcc/ipa.c |   3 +-
 gcc/lto-cgraph.c  |   2 +
 gcc/lto/lto-lang.c|   2 +
 gcc/lto/lto-symtab.c  |   2 +
 gcc/omp-expand.c  |   3 +-
 gcc/omp-general.c |   3 +-
 gcc/omp-low.c |   2 +
 gcc/omp-offload.c |   2 +
 gcc/omp-simd-clone.c  |   3 +-
 gcc/opts-global.c |   2 +
 gcc/passes.c  |   2 +
 gcc/predict.c |   2 +
 gcc/sancov.c  |   2 +
 gcc/sanopt.c  |   2 +
 gcc/symtab.c  |   2 +
 gcc/toplev.c  |   2 +
 gcc/trans-mem.c   |   3 +-
 gcc/tree-chkp.c   |   2 +
 gcc/tree-eh.c |   2 +
 gcc/tree-into-ssa.c   |   2 +
 gcc/tree-object-size.c|   2 +
 gcc/tree-parloops.c   |   2 +
 gcc/tree-profile.c|   2 +
 gcc/tree-ssa-ccp.c|   2 +
 gcc/tree-ssa-live.c   |   2 +
 gcc/tree-ssa-loop.c   |   2 +
 gcc/tree-ssa-sccvn.c  |   2 +
 gcc/tree-ssa.c|   2 +
 gcc/tree-streamer-in.c|   2 +
 gcc/tree-vectorizer.c |   2 +
 gcc/tree-vrp.c|   2 +
 gcc/tree.c| 685 +-
 gcc/tree.h|  86 --
 gcc/tsan.c|   2 +
 gcc/ubsan.c   |   2 +
 gcc/varasm.c  |   2 +
 gcc/varpool.c |   2 +
 79 files changed, 898 insertions(+), 775 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 89c2731e8cd..23686358a08 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -47,6 +47,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "varasm.h"
 #include "stor-layout.h"
 #include "tree-iterator.h"
+#include "stringpool.h"
+#include "attribs.h"
 #include "asan.h"
 #include "dojump.h"
 #include "explow.h"
diff --git a/gcc/attribs.c b/gcc/attribs.c
index 5eb19e82795..a1e52653edd 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -925,3 +925,636 @@ is_function_default_version (const tree decl)
   return (TREE_CODE (attr) == STRING_CST
 	  && strcmp (TREE_STRING_POINTER (attr), "default") == 0);
 }
+
+/* Return a declaration like DDECL except that its DECL_ATTRIBUTES
+   is ATTRIBUTE.  */
+
+tree
+build_decl_attribute_variant (tree ddecl, tree attribute)
+{
+  DECL_ATTRIBUTES (ddecl) = attribute;
+  return ddecl;
+}
+
+/* Return a type like TTYPE except that its TYPE_ATTRIBUTE
+   is ATTRIBUTE and its qualifiers are QUALS.
+
+   Record such modified types already made so we don't make duplicates.  */
+
+tree
+build_type_attribute_qual_variant (tree ttype, tree attribute, int quals)
+{
+  if (! attribute_list_equal (TYPE_ATTRIBUTES (ttype), attribute))
+{
+  tree ntype;
+
+  /* Building a distinct copy of a tagged type is inappropriate; it
+	 causes breakage in code that expects there to be a one-to-one
+	 relationship between a struct and its fields.
+	 build_duplicate_type is another solution (as used in
+	 handle_transparent_union_attribute), but that doesn't play well
+	 with the stronger C++ type identity model.  */
+  if (TREE_CODE (ttype) == 

Re: [PATCH v2][RFC] Canonize names of attributes.

2017-07-13 Thread Martin Liška
On 07/11/2017 05:52 PM, Jason Merrill wrote:
> On Tue, Jul 11, 2017 at 9:37 AM, Martin Liška  wrote:
>> On 07/03/2017 11:00 PM, Jason Merrill wrote:
>>> On Mon, Jul 3, 2017 at 5:52 AM, Martin Liška  wrote:
 On 06/30/2017 09:34 PM, Jason Merrill wrote:
>
> On Fri, Jun 30, 2017 at 5:23 AM, Martin Liška  wrote:
>>
>> This is v2 of the patch, where just names of attributes are
>> canonicalized.
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>> tests.
>
>
> What is the purpose of the new "strict" parameter to cmp_attribs* ?  I
> don't see any discussion of it.


 It's needed for arguments of attribute names, like:

 /usr/include/stdio.h:391:62: internal compiler error: in cmp_attribs, at
 tree.h:5523
__THROWNL __attribute__ ((__format__ (__printf__, 3, 4)));

>>>
>>> Mm.  Although we don't want to automatically canonicalize all
>>> identifier arguments to attributes in the parser, we could still do it
>>> for specific attributes, e.g. in handle_format_attribute or
>>> handle_mode_attribute.
>>
>> Yep, that was done in my previous version of the patch
>> (https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00996.html).
>> Where only attribute that was preserved unchanged was 'cleanup':
>>
>> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
>> index 8f638785e0e..08b4db5e5bd 100644
>> --- a/gcc/cp/parser.c
>> +++ b/gcc/cp/parser.c
>> @@ -24765,7 +24765,8 @@ cp_parser_gnu_attribute_list (cp_parser* parser)
>>   tree tv;
>>   if (arguments != NULL_TREE
>>   && ((tv = TREE_VALUE (arguments)) != NULL_TREE)
>> - && TREE_CODE (tv) == IDENTIFIER_NODE)
>> + && TREE_CODE (tv) == IDENTIFIER_NODE
>> + && !id_equal (TREE_PURPOSE (attribute), "cleanup"))
>> TREE_VALUE (arguments) = canonize_attr_name (tv);
>>   release_tree_vector (vec);
>> }
>>
>> Does it work for you to do it so?
> 
> This is canonicalizing arguments by default; I want the default to be
> not canonicalizing arguments.  I think we only want to canonicalize
> arguments for format and mode, and we can do that in their handle_*
> functions.

Yep, done that in v3. I decided to move a couple of functions to attribs.h and
verified that it does not cause a binary size increase of cc1 and cc1plus.
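
For context, canonicalization makes the underscored and plain spellings
compare equal; illustrative declarations (not from the patch):

  /* Both declarations should behave identically once attribute names and,
     for "format", the identifier arguments are canonicalized.  */
  void f (const char *, ...) __attribute__ ((format (printf, 1, 2)));
  void g (const char *, ...) __attribute__ ((__format__ (__printf__, 1, 2)));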

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

> 
> Jason
> 

From bd7bf58da6f0688fa10d3dfc72328f9313d7439c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 8 Jun 2017 10:23:25 +0200
Subject: [PATCH 1/2] Canonicalize names of attributes.

gcc/ChangeLog:

2017-07-12  Martin Liska  

	* attribs.h (canonicalize_attr_name): New function.
	(cmp_attribs): Move from c-format.c and adjusted.
	(is_attribute_p): Moved from tree.h.
	* tree-inline.c: Add new includes.
	* tree.c (cmp_attrib_identifiers): Use cmp_attribs.
	(private_is_attribute_p): Remove.
	(private_lookup_attribute): Likewise.
	(private_lookup_attribute_by_prefix): Simplify.
	(remove_attribute): Use is_attribute_p.
	* tree.h: Remove removed declarations.

gcc/c-family/ChangeLog:

2017-07-12  Martin Liska  

	* array-notation-common.c: Add new includes.
	* c-format.c (handle_format_attribute): Canonicalize a format
	function name.
	* c-lex.c (c_common_has_attribute): Canonicalize name of an
	attribute.
	* c-pretty-print.c: Add new include.

gcc/cp/ChangeLog:

2017-07-12  Martin Liska  

	* parser.c (cp_parser_gnu_attribute_list): Canonicalize name of an
	attribute.
	(cp_parser_std_attribute): Likewise.
	* tree.c: Add new include.

gcc/c/ChangeLog:

2017-07-12  Martin Liska  

	* c-parser.c (c_parser_attributes): Canonicalize name of an
	attribute.
gcc/go/ChangeLog:

2017-06-29  Martin Liska  

	* go-gcc.cc (Gcc_backend::function): Look up for no_split_stack
	and not __no_split_stack__.

gcc/testsuite/ChangeLog:

2017-06-29  Martin Liska  

	* g++.dg/cpp0x/pr65558.C: Update scanned pattern.
	* gcc.dg/parm-impl-decl-1.c: Likewise.
	* gcc.dg/parm-impl-decl-3.c: Likewise.
---
 gcc/attribs.h   |  49 +++
 gcc/c-family/array-notation-common.c|   2 +
 gcc/c-family/c-format.c |  24 ++--
 gcc/c-family/c-lex.c|   1 +
 gcc/c-family/c-pretty-print.c   |   1 +
 gcc/c/c-parser.c|   3 +
 gcc/cp/parser.c |   6 +-
 gcc/cp/tree.c   |   1 +
 gcc/go/go-gcc.cc|   2 +-
 gcc/testsuite/g++.dg/cpp0x/pr65558.C|   2 +-
 gcc/testsuite/gcc.dg/parm-impl-decl-1.c |   2 +-
 gcc/testsuite/gcc.dg/parm-impl-decl-3.c |   2 +-
 gcc/tree-inline.c   |   3 +-
 gcc/tree.c 

Re: [PATCH] Cleanup #2 of Pascal references.

2017-07-13 Thread Jason Merrill
OK.

On Thu, Jul 13, 2017 at 4:10 AM, Martin Liška  wrote:
> Thanks Jason, I'm sending patch #2.
>
> Ready for trunk?
> Martin


RE: [PATCH 6/7] [ARC] Deprecate mexpand-adddi option.

2017-07-13 Thread Claudiu Zissulescu
> 
> This looks fine, though the commit message tells me it's not a good
> idea, but it would be nice to know _why_ it's not good.  Might be nice
> to know for future reference.
> 
Again, this is about LRA: expand-time subregs are not handled by LRA, as far
as I can tell.  Moreover, this option is off by default, and even with the
old IRA it may not be a good idea to have subregs before the register
allocation step.

> Also, there's no test.  Was there an issue that revealed this as not a
> good idea?  Could that become a test?
> 

This error was found while running dg for our port with both LRA and this
option on.  Once mexpand-adddi is removed, such a test is impossible to
produce.

Thanks,
Claudiu


Re: [PATCH 7/7] [ARC] Consolidate PIC implementation.

2017-07-13 Thread Andrew Burgess
* Claudiu Zissulescu  [2017-06-01 15:34:57 
+0200]:

> This patch refactors a number of functions and compiler hooks into using a
> single function which checks whether an rtx is suited for PIC or not.  The
> removed functions are arc_legitimate_pc_offset_p and
> arc_legitimate_pic_operand_p, being replaced by calls to
> arc_legitimate_pic_addr_p.  Thus we have a unitary way of checking whether
> an rtx is PIC.
> 
> gcc/
> 2017-02-24  Claudiu Zissulescu  
> 
>   * config/arc/arc-protos.h (arc_legitimate_pc_offset_p): Remove
>   proto.
>   (arc_legitimate_pic_operand_p): Likewise.
>   * config/arc/arc.c (arc_legitimate_pic_operand_p): Remove
>   function.
>   (arc_needs_pcl_p): Likewise.
>   (arc_legitimate_pc_offset_p): Likewise.
>   (arc_legitimate_pic_addr_p): Remove LABEL_REF case, as this
>   function is also used in constrains.md.
>   (arc_legitimate_constant_p): Use arc_legitimate_pic_addr_p to
>   validate pic constants. Handle CONST_INT, CONST_DOUBLE, MINUS and
>   PLUS.  Only return true/false in known cases, otherwise assert.
>   (arc_legitimate_address_p): Remove arc_legitimate_pic_addr_p as it
>   is already called in arc_legitimate_constant_p.
>   * config/arc/arc.h (CONSTANT_ADDRESS_P): Consider also LABEL for
>   pic addresses.
>   (LEGITIMATE_PIC_OPERAND_P): Use
>   arc_raw_symbolic_reference_mentioned_p function.
>   * config/arc/constraints.md (Cpc): Use arc_legitimate_pic_addr_p
>   function.
>   (Cal): Likewise.
>   (C32): Likewise.
> 
> gcc/testsuite
> 2017-02-24  Claudiu Zissulescu  
> 
>   * gcc.target/arc/pr9000674901.c: New file.
>   * gcc.target/arc/pic-1.c: Likewise.
>   * gcc.target/arc/pr9001191897.c: Likewise.


Looks like a good clean up.

Thanks,
Andrew


> ---
>  gcc/config/arc/arc-protos.h |   2 -
>  gcc/config/arc/arc.c| 150 
> +---
>  gcc/config/arc/arc.h|  11 +-
>  gcc/config/arc/constraints.md   |   6 +-
>  gcc/testsuite/gcc.target/arc/pic-1.c|  11 ++
>  gcc/testsuite/gcc.target/arc/pr9000674901.c |  58 +++
>  gcc/testsuite/gcc.target/arc/pr9001191897.c |  10 ++
>  7 files changed, 136 insertions(+), 112 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arc/pic-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/pr9000674901.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/pr9001191897.c
> 
> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
> index b436dbe..850795a 100644
> --- a/gcc/config/arc/arc-protos.h
> +++ b/gcc/config/arc/arc-protos.h
> @@ -60,10 +60,8 @@ extern rtx arc_return_addr_rtx (int , rtx);
>  extern bool check_if_valid_regno_const (rtx *, int);
>  extern bool check_if_valid_sleep_operand (rtx *, int);
>  extern bool arc_legitimate_constant_p (machine_mode, rtx);
> -extern bool arc_legitimate_pc_offset_p (rtx);
>  extern bool arc_legitimate_pic_addr_p (rtx);
>  extern bool arc_raw_symbolic_reference_mentioned_p (rtx, bool);
> -extern bool arc_legitimate_pic_operand_p (rtx);
>  extern bool arc_is_longcall_p (rtx);
>  extern bool arc_is_shortcall_p (rtx);
>  extern bool valid_brcc_with_delay_p (rtx *);
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 7dfc68e..89de6cd 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -249,7 +249,6 @@ static rtx arc_expand_builtin (tree, rtx, rtx, 
> machine_mode, int);
>  static int branch_dest (rtx);
>  
>  static void  arc_output_pic_addr_const (FILE *,  rtx, int);
> -bool arc_legitimate_pic_operand_p (rtx);
>  static bool arc_function_ok_for_sibcall (tree, tree);
>  static rtx arc_function_value (const_tree, const_tree, bool);
>  const char * output_shift (rtx *);
> @@ -5152,57 +5151,6 @@ arc_rtx_costs (rtx x, machine_mode mode, int 
> outer_code,
>  }
>  }
>  
> -/* Helper used by arc_legitimate_pc_offset_p.  */
> -
> -static bool
> -arc_needs_pcl_p (rtx x)
> -{
> -  register const char *fmt;
> -  register int i, j;
> -
> -  if ((GET_CODE (x) == UNSPEC)
> -  && (XVECLEN (x, 0) == 1)
> -  && (GET_CODE (XVECEXP (x, 0, 0)) == SYMBOL_REF))
> -switch (XINT (x, 1))
> -  {
> -  case ARC_UNSPEC_GOT:
> -  case ARC_UNSPEC_GOTOFFPC:
> -  case UNSPEC_TLS_GD:
> -  case UNSPEC_TLS_IE:
> - return true;
> -  default:
> - break;
> -  }
> -
> -  fmt = GET_RTX_FORMAT (GET_CODE (x));
> -  for (i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; i--)
> -{
> -  if (fmt[i] == 'e')
> - {
> -   if (arc_needs_pcl_p (XEXP (x, i)))
> - return true;
> - }
> -  else if (fmt[i] == 'E')
> - for (j = XVECLEN (x, i) - 1; j >= 0; j--)
> -   if (arc_needs_pcl_p (XVECEXP (x, i, j)))
> - return true;
> -}
> -
> -  return false;
> -}
> -
> -/* Return true if ADDR is an address that needs to be expressed as an
> -  

Re: [PATCH 6/7] [ARC] Deprecate mexpand-adddi option.

2017-07-13 Thread Andrew Burgess
* Claudiu Zissulescu  [2017-06-01 15:34:56 
+0200]:

> From: claziss 
> 
> Emitting subregs in the expand is not a good idea. Deprecate this
> option.
> 
> gcc/
> 2017-04-26  Claudiu Zissulescu  
> 
>   * config/arc/arc.md (adddi3): Remove support for mexpand-adddi
>   option.
>   (subdi3): Likewise.
>   * config/arc/arc.opt (mexpand-adddi): Deprecate it.
>   * doc/invoke.texi (mexpand-adddi): Update text.

This looks fine, though the commit message tells me it's not a good
idea, but it would be nice to know _why_ it's not good.  Might be nice
to know for future reference.

Also, there's no test.  Was there an issue that revealed this as not a
good idea?  Could that become a test?

Thanks,
Andrew


> ---
>  gcc/config/arc/arc.md  | 39 +--
>  gcc/config/arc/arc.opt |  2 +-
>  gcc/doc/invoke.texi|  2 +-
>  3 files changed, 3 insertions(+), 40 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index 928feb1..f595da7 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -2649,30 +2649,7 @@
>   (match_operand:DI 2 "nonmemory_operand" "")))
> (clobber (reg:CC CC_REG))])]
>""
> -{
> -  if (TARGET_EXPAND_ADDDI)
> -{
> -  rtx l0 = gen_lowpart (SImode, operands[0]);
> -  rtx h0 = disi_highpart (operands[0]);
> -  rtx l1 = gen_lowpart (SImode, operands[1]);
> -  rtx h1 = disi_highpart (operands[1]);
> -  rtx l2 = gen_lowpart (SImode, operands[2]);
> -  rtx h2 = disi_highpart (operands[2]);
> -  rtx cc_c = gen_rtx_REG (CC_Cmode, CC_REG);
> -
> -  if (CONST_INT_P (h2) && INTVAL (h2) < 0 && SIGNED_INT12 (INTVAL (h2)))
> - {
> -   emit_insn (gen_sub_f (l0, l1, gen_int_mode (-INTVAL (l2), SImode)));
> -   emit_insn (gen_sbc (h0, h1,
> -   gen_int_mode (-INTVAL (h2) - (l1 != 0), SImode),
> -   cc_c));
> -   DONE;
> - }
> -  emit_insn (gen_add_f (l0, l1, l2));
> -  emit_insn (gen_adc (h0, h1, h2));
> -  DONE;
> -}
> -})
> +{})
>  
>  ; This assumes that there can be no strictly partial overlap between
>  ; operands[1] and operands[2].
> @@ -2911,20 +2888,6 @@
>  {
>if (!register_operand (operands[2], DImode))
>  operands[1] = force_reg (DImode, operands[1]);
> -  if (TARGET_EXPAND_ADDDI)
> -{
> -  rtx l0 = gen_lowpart (SImode, operands[0]);
> -  rtx h0 = disi_highpart (operands[0]);
> -  rtx l1 = gen_lowpart (SImode, operands[1]);
> -  rtx h1 = disi_highpart (operands[1]);
> -  rtx l2 = gen_lowpart (SImode, operands[2]);
> -  rtx h2 = disi_highpart (operands[2]);
> -  rtx cc_c = gen_rtx_REG (CC_Cmode, CC_REG);
> -
> -  emit_insn (gen_sub_f (l0, l1, l2));
> -  emit_insn (gen_sbc (h0, h1, h2, cc_c));
> -  DONE;
> -}
>  })
>  
>  (define_insn_and_split "subdi3_i"
> diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt
> index ed2b827..ad2df26 100644
> --- a/gcc/config/arc/arc.opt
> +++ b/gcc/config/arc/arc.opt
> @@ -328,7 +328,7 @@ Target Var(TARGET_Q_CLASS)
>  Enable 'q' instruction alternatives.
>  
>  mexpand-adddi
> -Target Var(TARGET_EXPAND_ADDDI)
> +Target Warn(%qs is deprecated)
>  Expand adddi3 and subdi3 at rtl generation time into add.f / adc etc.
>  
>  
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 59563aa..b6cf4ce 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14823,7 +14823,7 @@ Enable pre-reload use of the @code{cbranchsi} pattern.
>  @item -mexpand-adddi
>  @opindex mexpand-adddi
>  Expand @code{adddi3} and @code{subdi3} at RTL generation time into
> -@code{add.f}, @code{adc} etc.
> +@code{add.f}, @code{adc} etc.  This option is deprecated.
>  
>  @item -mindexed-loads
>  @opindex mindexed-loads
> -- 
> 1.9.1
> 


Re: Introduce Statement Frontier Notes and Location Views

2017-07-13 Thread Alexandre Oliva
On Jul  5, 2017, Alexandre Oliva  wrote:

> This patch implements statement frontier notes and location views,
> concepts originally proposed in the GCC Summit back in 2010.  See
> https://people.redhat.com/aoliva/papers/sfn/ for details on the
> original design.

There's a newer blog post about these features that provides further
context and motivation.
https://developers.redhat.com/blog/2017/07/11/statement-frontier-notes-and-location-views/#more-437095


I wonder if it would be useful to break up the patch into smaller
pieces, for purposes of review.  The changes are mostly interdependent,
though it is possible to break it up into major features, say one patch
introducing statement frontier notes, one or more patches fixing new
-fcompare-debug failures, and one patch introducing location views.  The
changes are largely split up like that in the aoliva/SFN branch.  I don't
think it would be appropriate to install these changes as such a sequence
of patches, though, because I don't think it's acceptable to have known
regressions, even temporarily, and the initial SFN patch, without the
subsequent -fcompare-debug fixes, would do just that.

Thoughts?  Advice?

Thanks,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


RE: [PATCH 5/7] [ARC] Enable indexed loads for elf targers.

2017-07-13 Thread Claudiu Zissulescu
> The change looks fine, but it would be nice if the commit message
> explained _why_ we are default off for Linux and on for Elf, I think
> more text in the commit message on this sort of thing will help future
> developers understand why things are the way they are.
> 

The explanation is quite simple: we haven't fully validated Linux with those
options on.  As it may take a while until I get a more complete picture of
this enhancement for Linux, I've enabled the option for elf, where I've had
very good feedback from our GNU users.

Normally, it should be no problem to enable them for Linux as well, but I
would like more testing data on this subject from various GNU users first.

Thanks,
Claudiu


C++ PATCH to C++17 class deduction from init-list

2017-07-13 Thread Jason Merrill
P0512 corrected the specification of C++17 class template argument
deduction to work more like constructor overload resolution in
initialization; in particular, this means that we should do the same
two-phase overload resolution for a class with a list constructor.
This patch implements that, so that e.g.

  #include <vector>
  int a[1];
  std::vector v{a, a, std::allocator<int*>()};

will now work.
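
A minimal illustration of the two phases (my own example, not from the
patch):

  #include <initializer_list>

  template <class T> struct X
  {
    X(std::initializer_list<T>);   // list constructor
    X(T*, T*, int);
  };

  int b[1];
  X x1{1, 2, 3};   // phase one deduces X<int>, treating the braces as a
                   // single initializer_list argument
  X x2{b, b, 42};  // no initializer_list<T> matches {int*, int*, int},
                   // so phase two deduces X<int> from X(T*, T*, int)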

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 7cede2ad071dbd7ab1f174465305e5568b5d311a
Author: Jason Merrill 
Date:   Wed Jul 12 12:53:23 2017 -0400

P0512R0 - Deduction from an initializer list.

* pt.c (do_class_deduction): Do list deduction in two phases.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index bd02951..0df6854 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -25329,14 +25329,20 @@ do_class_deduction (tree ptype, tree tmpl, tree init, 
int flags,
 
   tree type = TREE_TYPE (tmpl);
 
+  bool try_list_ctor = false;
+
   vec<tree, va_gc> *args;
   if (init == NULL_TREE
   || TREE_CODE (init) == TREE_LIST)
 args = make_tree_vector_from_list (init);
-  else if (BRACE_ENCLOSED_INITIALIZER_P (init)
-  && !TYPE_HAS_LIST_CTOR (type)
-  && !is_std_init_list (type))
-args = make_tree_vector_from_ctor (init);
+  else if (BRACE_ENCLOSED_INITIALIZER_P (init))
+{
+  try_list_ctor = TYPE_HAS_LIST_CTOR (type);
+  if (try_list_ctor || is_std_init_list (type))
+   args = make_tree_vector_single (init);
+  else
+   args = make_tree_vector_from_ctor (init);
+}
   else
 args = make_tree_vector_single (init);
 
@@ -25391,13 +25397,43 @@ do_class_deduction (tree ptype, tree tmpl, tree init, 
int flags,
saw_ctor = true;
   }
 
-  if (args->length () < 2)
+  tree call = error_mark_node;
+
+  /* If this is list-initialization and the class has a list constructor, first
+ try deducing from the list as a single argument, as [over.match.list].  */
+  tree list_cands = NULL_TREE;
+  if (try_list_ctor && cands)
+for (lkp_iterator iter (cands); iter; ++iter)
+  {
+   tree dg = *iter;
+   if (is_list_ctor (dg))
+ list_cands = lookup_add (dg, list_cands);
+  }
+  if (list_cands)
+{
+  ++cp_unevaluated_operand;
+  call = build_new_function_call (list_cands, &args, tf_decltype);
+  --cp_unevaluated_operand;
+
+  if (call == error_mark_node)
+   {
+ /* That didn't work, now try treating the list as a sequence of
+arguments.  */
+ release_tree_vector (args);
+ args = make_tree_vector_from_ctor (init);
+   }
+}
+
+  /* Maybe generate an implicit deduction guide.  */
+  if (call == error_mark_node && args->length () < 2)
 {
   tree gtype = NULL_TREE;
 
   if (args->length () == 1)
+   /* Generate a copy guide.  */
gtype = build_reference_type (type);
   else if (!saw_ctor)
+   /* Generate a default guide.  */
gtype = type;
 
   if (gtype)
@@ -25419,22 +25455,29 @@ do_class_deduction (tree ptype, tree tmpl, tree init, 
int flags,
   return error_mark_node;
 }
 
-  ++cp_unevaluated_operand;
-  tree t = build_new_function_call (cands, &args, tf_decltype);
+  if (call == error_mark_node)
+{
+  ++cp_unevaluated_operand;
+  call = build_new_function_call (cands, &args, tf_decltype);
+  --cp_unevaluated_operand;
+}
 
-  if (t == error_mark_node && (complain & tf_warning_or_error))
+  if (call == error_mark_node && (complain & tf_warning_or_error))
 {
   error ("class template argument deduction failed:");
-  t = build_new_function_call (cands, &args, complain | tf_decltype);
+
+  ++cp_unevaluated_operand;
+  call = build_new_function_call (cands, &args, complain | tf_decltype);
+  --cp_unevaluated_operand;
+
   if (elided)
inform (input_location, "explicit deduction guides not considered "
"for copy-initialization");
 }
 
-  --cp_unevaluated_operand;
   release_tree_vector (args);
 
-  return cp_build_qualified_type (TREE_TYPE (t), cp_type_quals (ptype));
+  return cp_build_qualified_type (TREE_TYPE (call), cp_type_quals (ptype));
 }
 
 /* Replace occurrences of 'auto' in TYPE with the appropriate type deduced
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction41.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction41.C
new file mode 100644
index 000..5e7fa3a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction41.C
@@ -0,0 +1,19 @@
+// { dg-options -std=c++1z }
+
+#include <initializer_list>
+
+struct B { };
+
+template <class T>
+struct A
+{
+  A(std::initializer_list<T>);
+  A(T, B);
+};
+
+A a { 1, B() };
+
+template <class, class> struct same;
+template <class T> struct same<T,T> { };
+
+same<decltype(a),A<int>> s;


RE: [PATCH 4/7] [ARC] [LRA] Avoid emitting COND_EXEC during expand.

2017-07-13 Thread Claudiu Zissulescu
> This seems fine, your description "does not always work." is a bit
> of a tease :) it would be nice to know _why_ it doesn't work, or at
> least a description of what problem you're seeing.
> 
As far as I can see, LRA doesn't handle the conditional execution patterns
very well, as it expects conditional execution to happen after this step.
Thus, some of those instructions are marked dead and removed later on.
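
For reference, the shape of the problem (illustrative, mirroring the
ctzsi2 pattern below): a COND_EXEC emitted at expand time looks like

  (cond_exec (ge (reg:CC_ZN CC_REG) (const_int 0))
             (set (reg:SI 0) (minus:SI (const_int 31) (reg:SI 1))))

and LRA only expects such wrappers to appear after register allocation,
so insns like this one can be wrongly marked dead.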

> Also we seem to be missing a test, would it be possible to find one?
> If not then I guess we live without, but we should note that in the
> commit message.

This error was found by running the dg.exp testsuite for our port with the
-mlra option on.  As we speak, I am on the last 100m of testing our port
with LRA enabled; this bug was found along the way.

I'll add this discussion to the commit message body.

Thank you,
Claudiu


Re: [PATCH] Add quotes to error messages related to Sanitizers.

2017-07-13 Thread Martin Liška
On 07/11/2017 07:45 PM, David Malcolm wrote:
> On Mon, 2017-07-10 at 11:36 +0200, Martin Liška wrote:
>> Hi.
>>
>> This adds missing quotes to various error messages related to
>> AddressSanitizer.
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>> tests.
>>
>> Ready to be installed?
> 
> LGTM, with my "diagnostic messages" maintainer hat on.

Patch has been just installed.

> 
> Grepping for "-f" within opts.c shows a few other diagnostics there
> that could use quotes, but that's not a reason not to go ahead with
> this patch.

Yep, I'm testing another patch.

Martin

> 
> Thanks
> Dave
> 
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2017-07-04  Martin Liska  
>>
>>  * opts.c (finish_options): Add quotes to error messages.
>>  (parse_sanitizer_options): Likewise.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-07-04  Martin Liska  
>>
>>  * c-c++-common/ubsan/sanitize-all-1.c: Update scanned pattern.
>>  * c-c++-common/ubsan/sanitize-recover-1.c: Likewise.
>>  * c-c++-common/ubsan/sanitize-recover-2.c: Likewise.
>>  * c-c++-common/ubsan/sanitize-recover-5.c: Likewise.
>>  * c-c++-common/ubsan/sanitize-recover-7.c: Likewise.
>>  * c-c++-common/ubsan/sanitize-recover-8.c: Likewise.
>>  * c-c++-common/ubsan/sanitize-recover-9.c: Likewise.
>> ---
>>  gcc/opts.c| 18 +
>> -
>>  gcc/testsuite/c-c++-common/ubsan/sanitize-all-1.c |  2 +-
>>  gcc/testsuite/c-c++-common/ubsan/sanitize-recover-1.c |  2 +-
>>  gcc/testsuite/c-c++-common/ubsan/sanitize-recover-2.c |  2 +-
>>  gcc/testsuite/c-c++-common/ubsan/sanitize-recover-5.c |  2 +-
>>  gcc/testsuite/c-c++-common/ubsan/sanitize-recover-7.c |  2 +-
>>  gcc/testsuite/c-c++-common/ubsan/sanitize-recover-8.c |  2 +-
>>  gcc/testsuite/c-c++-common/ubsan/sanitize-recover-9.c |  2 +-
>>  8 files changed, 16 insertions(+), 16 deletions(-)
>>
>>



[PATCH] PR c++/80287 add new testcase

2017-07-13 Thread Yvan Roux
Hi,

as discussed in the PR, this patch adds a new reduced testcase which
doesn't rely on C++17 features; this is a prerequisite for backporting
the fix to the GCC 6 branch, which is impacted by this issue.

Validated on x86, ARM and AArch64 targets.

Ok for trunk?  And maybe for gcc-7-branch?

Thanks,
Yvan

gcc/testsuite
2017-07-13  Yvan Roux  

PR c++/80287
* g++.dg/pr80287.C: New test.
diff --git a/gcc/testsuite/g++.dg/pr80287.C b/gcc/testsuite/g++.dg/pr80287.C
new file mode 100644
index 000..da8d3fab150
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr80287.C
@@ -0,0 +1,13 @@
+// PR c++/80287
+// { dg-do compile { target c++11 } }
+// { dg-options "-g" }
+
+struct A {
+  operator long() {}
+} __attribute__((__may_alias__));
+
+struct {
+  A ino;
+} a;
+
+char b = a.ino;


Re: [PATCH 5/7] [ARC] Enable indexed loads for elf targers.

2017-07-13 Thread Andrew Burgess
* Claudiu Zissulescu  [2017-06-01 15:34:55 
+0200]:

> gcc/
> 2017-02-28  Claudiu Zissulescu  
> 
>   * config/arc/arc.opt (mindexed-loads): Use initial value
>   TARGET_INDEXED_LOADS_DEFAULT.
>   (mauto-modify-reg): Use initial value
>   TARGET_AUTO_MODIFY_REG_DEFAULT.
>   * config/arc/elf.h (TARGET_INDEXED_LOADS_DEFAULT): Define.
>   (TARGET_AUTO_MODIFY_REG_DEFAULT): Likewise.
>   * config/arc/linux.h (TARGET_INDEXED_LOADS_DEFAULT): Define.
>   (TARGET_AUTO_MODIFY_REG_DEFAULT): Likewise.

The change looks fine, but it would be nice if the commit message
explained _why_ we are default off for Linux and on for Elf, I think
more text in the commit message on this sort of thing will help future
developers understand why things are the way they are.

Thanks,
Andrew



> ---
>  gcc/config/arc/arc.opt | 4 ++--
>  gcc/config/arc/elf.h   | 8 
>  gcc/config/arc/linux.h | 8 
>  3 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt
> index f01a2ff..ed2b827 100644
> --- a/gcc/config/arc/arc.opt
> +++ b/gcc/config/arc/arc.opt
> @@ -270,11 +270,11 @@ Target RejectNegative Var(arc_tune, 
> TUNE_ARC700_4_2_XMAC)
>  Tune for ARC700 R4.2 Cpu with XMAC block.
>  
>  mindexed-loads
> -Target Var(TARGET_INDEXED_LOADS)
> +Target Var(TARGET_INDEXED_LOADS) Init(TARGET_INDEXED_LOADS_DEFAULT)
>  Enable the use of indexed loads.
>  
>  mauto-modify-reg
> -Target Var(TARGET_AUTO_MODIFY_REG)
> +Target Var(TARGET_AUTO_MODIFY_REG) Init(TARGET_AUTO_MODIFY_REG_DEFAULT)
>  Enable the use of pre/post modify with register displacement.
>  
>  mmul32x16
> diff --git a/gcc/config/arc/elf.h b/gcc/config/arc/elf.h
> index c5794f8..43f3408 100644
> --- a/gcc/config/arc/elf.h
> +++ b/gcc/config/arc/elf.h
> @@ -58,3 +58,11 @@ along with GCC; see the file COPYING3.  If not see
>  /* Bare-metal toolchains do not need a thread pointer register.  */
>  #undef TARGET_ARC_TP_REGNO_DEFAULT
>  #define TARGET_ARC_TP_REGNO_DEFAULT -1
> +
> +/* Indexed loads are default.  */
> +#undef TARGET_INDEXED_LOADS_DEFAULT
> +#define TARGET_INDEXED_LOADS_DEFAULT 1
> +
> +/* Pre/post modify with register displacement are default.  */
> +#undef TARGET_AUTO_MODIFY_REG_DEFAULT
> +#define TARGET_AUTO_MODIFY_REG_DEFAULT 1
> diff --git a/gcc/config/arc/linux.h b/gcc/config/arc/linux.h
> index 83e5a1d..d8e0063 100644
> --- a/gcc/config/arc/linux.h
> +++ b/gcc/config/arc/linux.h
> @@ -83,3 +83,11 @@ along with GCC; see the file COPYING3.  If not see
>  #define SUBTARGET_CPP_SPEC "\
> %{pthread:-D_REENTRANT} \
>  "
> +
> +/* Indexed loads are default off.  */
> +#undef TARGET_INDEXED_LOADS_DEFAULT
> +#define TARGET_INDEXED_LOADS_DEFAULT 0
> +
> +/* Pre/post modify with register displacement are default off.  */
> +#undef TARGET_AUTO_MODIFY_REG_DEFAULT
> +#define TARGET_AUTO_MODIFY_REG_DEFAULT 0
> -- 
> 1.9.1
> 


Re: [PATCH 4/7] [ARC] [LRA] Avoid emitting COND_EXEC during expand.

2017-07-13 Thread Andrew Burgess
* Claudiu Zissulescu  [2017-06-01 15:34:54 
+0200]:

> Emitting COND_EXEC rtxes during expand does not always work.
> 
> gcc/
> 2017-01-10  Claudiu Zissulescu  
> 
>   * config/arc/arc.md (clzsi2): Expand to an arc_clzsi2 instruction
>   that also clobbers the CC register. The old expand code is moved
>   to ...
>   (*arc_clzsi2): ... here.
>   (ctzsi2): Expand to an arc_ctzsi2 instruction that also clobbers
>   the CC register. The old expand code is moved to ...
>   (arc_ctzsi2): ... here.

This seems fine, your description "does not always work." is a bit
of a tease :) it would be nice to know _why_ it doesn't work, or at
least a description of what problem you're seeing.

Also we seem to be missing a test, would it be possible to find one?
If not then I guess we live without, but we should note that in the
commit message.

Thanks,
Andrew




> ---
>  gcc/config/arc/arc.md | 41 ++---
>  1 file changed, 34 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index 39bcc26..928feb1 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -4533,9 +4533,21 @@
> (set_attr "type" "two_cycle_core,two_cycle_core")])
>  
>  (define_expand "clzsi2"
> -  [(set (match_operand:SI 0 "dest_reg_operand" "")
> - (clz:SI (match_operand:SI 1 "register_operand" "")))]
> +  [(parallel
> +[(set (match_operand:SI 0 "register_operand" "")
> +   (clz:SI (match_operand:SI 1 "register_operand" "")))
> + (clobber (match_dup 2))])]
> +  "TARGET_NORM"
> +  "operands[2] = gen_rtx_REG (CC_ZNmode, CC_REG);")
> +
> +(define_insn_and_split "*arc_clzsi2"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> + (clz:SI (match_operand:SI 1 "register_operand" "r")))
> +   (clobber (reg:CC_ZN CC_REG))]
>"TARGET_NORM"
> +  "#"
> +  "reload_completed"
> +  [(const_int 0)]
>  {
>emit_insn (gen_norm_f (operands[0], operands[1]));
>emit_insn
> @@ -4552,9 +4564,23 @@
>  })
>  
>  (define_expand "ctzsi2"
> -  [(set (match_operand:SI 0 "register_operand" "")
> - (ctz:SI (match_operand:SI 1 "register_operand" "")))]
> +  [(match_operand:SI 0 "register_operand" "")
> +   (match_operand:SI 1 "register_operand" "")]
>"TARGET_NORM"
> +  "
> +  emit_insn (gen_arc_ctzsi2 (operands[0], operands[1]));
> +  DONE;
> +")
> +
> +(define_insn_and_split "arc_ctzsi2"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> + (ctz:SI (match_operand:SI 1 "register_operand" "r")))
> +   (clobber (reg:CC_ZN CC_REG))
> +   (clobber (match_scratch:SI 2 "=&r"))]
> +  "TARGET_NORM"
> +  "#"
> +  "reload_completed"
> +  [(const_int 0)]
>  {
>rtx temp = operands[0];
>  
> @@ -4562,10 +4588,10 @@
>|| (REGNO (temp) < FIRST_PSEUDO_REGISTER
> && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS],
>REGNO (temp
> -temp = gen_reg_rtx (SImode);
> +temp = operands[2];
>emit_insn (gen_addsi3 (temp, operands[1], constm1_rtx));
>emit_insn (gen_bic_f_zn (temp, temp, operands[1]));
> -  emit_insn (gen_clrsbsi2 (temp, temp));
> +  emit_insn (gen_clrsbsi2 (operands[0], temp));
>emit_insn
>  (gen_rtx_COND_EXEC
>(VOIDmode,
> @@ -4575,7 +4601,8 @@
>  (gen_rtx_COND_EXEC
>(VOIDmode,
> gen_rtx_GE (VOIDmode, gen_rtx_REG (CC_ZNmode, CC_REG), const0_rtx),
> -   gen_rtx_SET (operands[0], gen_rtx_MINUS (SImode, GEN_INT (31), 
> temp))));
> +   gen_rtx_SET (operands[0], gen_rtx_MINUS (SImode, GEN_INT (31),
> + operands[0]))));
>DONE;
>  })
>  
> -- 
> 1.9.1
> 


Re: [PATCH][ARC] Add support for naked functions.

2017-07-13 Thread Andrew Burgess
* Claudiu Zissulescu  [2017-06-19 11:52:31 
+0200]:

> From: claziss 
> 
> Hi Andrew,
> 
> Apologizes for the disconfort, please find the patch that works on the head.
> 
> Thanks,
> Claudiu
> 
> gcc/
> 2016-12-13  Claudiu Zissulescu  
> Andrew Burgess  
> 
> * config/arc/arc-protos.h (arc_compute_function_type): Change 
> prototype.
> (arc_return_address_register): New function.
> * config/arc/arc.c (arc_handle_fndecl_attribute): New function.
> (arc_handle_fndecl_attribute): Add naked attribute.
> (TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS): Define.
> (TARGET_WARN_FUNC_RETURN): Likewise.
> (arc_allocate_stack_slots_for_args): New function.
> (arc_warn_func_return): Likewise.
> (machine_function): Change type fn_type.
> (arc_compute_function_type): Consider new naked function type,
> change function return type.
> (arc_must_save_register): Adapt to handle new
> arc_compute_function_type's return type.
> (arc_expand_prologue): Likewise.
> (arc_expand_epilogue): Likewise.
> (arc_return_address_regs): Delete.
> (arc_return_address_register): New function.
> (arc_epilogue_uses): Use above function.
> * config/arc/arc.h (arc_return_address_regs): Delete prototype.
> (arc_function_type): Change encoding, add naked type.
> (ARC_INTERRUPT_P): Change to handle the new encoding.
> (ARC_FAST_INTERRUPT_P): Likewise.
> (ARC_NORMAL_P): Define.
> (ARC_NAKED_P): Likewise.
> (arc_compute_function_type): Delete prototype.
> * config/arc/arc.md (in_ret_delay_slot): Use
> arc_return_address_register function.
> (simple_return): Likewise.
> (p_return_i): Likewise.
> 
> gcc/testsuite
> 2016-12-13  Claudiu Zissulescu  
> Andrew Burgess  
> 
> * gcc.target/arc/naked-1.c: New file.
> * gcc.target/arc/naked-2.c: Likewise.

This all looks fine,

Thanks,
Andrew


> ---
>  gcc/config/arc/arc-protos.h|   7 +-
>  gcc/config/arc/arc.c   | 160 
> -
>  gcc/config/arc/arc.h   |  40 ++---
>  gcc/config/arc/arc.md  |  10 ++-
>  gcc/testsuite/gcc.target/arc/naked-1.c |  18 
>  gcc/testsuite/gcc.target/arc/naked-2.c |  26 ++
>  6 files changed, 195 insertions(+), 66 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arc/naked-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/naked-2.c
> 
> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
> index 93a64cf..f6bf14e 100644
> --- a/gcc/config/arc/arc-protos.h
> +++ b/gcc/config/arc/arc-protos.h
> @@ -45,13 +45,10 @@ extern void arc_expand_atomic_op (enum rtx_code, rtx, 
> rtx, rtx, rtx, rtx);
>  extern void arc_split_compare_and_swap (rtx *);
>  extern void arc_expand_compare_and_swap (rtx *);
>  extern bool compact_memory_operand_p (rtx, machine_mode, bool, bool);
> +extern int arc_return_address_register (unsigned int);
> +extern unsigned int arc_compute_function_type (struct function *);
>  #endif /* RTX_CODE */
>  
> -#ifdef TREE_CODE
> -extern enum arc_function_type arc_compute_function_type (struct function *);
> -#endif /* TREE_CODE */
> -
> -
>  extern unsigned int arc_compute_frame_size (int);
>  extern bool arc_ccfsm_branch_deleted_p (void);
>  extern void arc_ccfsm_record_branch_deleted (void);
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index d9ad139..4ccd304 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -211,6 +211,7 @@ static int rgf_banked_register_count;
>  static int get_arc_condition_code (rtx);
>  
>  static tree arc_handle_interrupt_attribute (tree *, tree, tree, int, bool *);
> +static tree arc_handle_fndecl_attribute (tree *, tree, tree, int, bool *);
>  
>  /* Initialized arc_attribute_table to NULL since arc doesnot have any
> machine specific supported attributes.  */
> @@ -229,6 +230,9 @@ const struct attribute_spec arc_attribute_table[] =
>/* And these functions are always known to reside within the 21 bit
>   addressing range of blcc.  */
>{ "short_call",   0, 0, false, true,  true,  NULL, false },
> +  /* Function which are not having the prologue and epilogue generated
> + by the compiler.  */
> +  { "naked", 0, 0, true, false, false, arc_handle_fndecl_attribute, false },
>{ NULL, 0, 0, false, false, false, NULL, false }
>  };
>  static int arc_comp_type_attributes (const_tree, const_tree);
> @@ -513,6 +517,12 @@ static void arc_finalize_pic (void);
>  #define TARGET_DIFFERENT_ADDR_DISPLACEMENT_P hook_bool_void_true
>  #define TARGET_SPILL_CLASS arc_spill_class
>  
> +#undef TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS
> +#define 

Re: [ping #4][patch] Fix PR80929: Realistic PARALLEL cost in seq_cost.

2017-07-13 Thread Georg-Johann Lay

On 12.07.2017 21:36, Segher Boessenkool wrote:

On Wed, Jul 12, 2017 at 03:30:00PM +0200, Georg-Johann Lay wrote:

On 12.07.2017 14:11, Segher Boessenkool wrote:

On Tue, Jul 11, 2017 at 10:47:27AM +0200, Georg-Johann Lay wrote:

This small addition improves costs of PARALLELs in
rtlanal.c:seq_cost().  Up to now, these costs are
assumed to be 1 which gives gross inexact costs for,
e.g. divmod which is represented as PARALLEL.


insn_rtx_cost returns 0 ("unknown") for such a PARALLEL, so your
current patch does not change this at all?


Ah I see now.

So this is unfixable...

Any change to seq_cost that would address the issue I had in mind
(completely wrong costs for divmod that are represented as PARALLEL
with 2 SETs) will come up with different handling than the "logic"
of insn_rtx_costs.
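
For reference, the shape in question is (illustrative):

  (parallel [(set (reg:SI 0) (div:SI (reg:SI 1) (reg:SI 2)))
             (set (reg:SI 3) (mod:SI (reg:SI 1) (reg:SI 2)))])

single_set fails on a two-SET PARALLEL like this, so seq_cost falls back
to a cost of 1, and insn_rtx_cost likewise gives up and returns 0
("unknown").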

So the only way to avoid that "logic" is to pass the whole story
to the back-end.

And in order not to break any existing assumptions this can only
be achieved by a new hook that graceful degrades to the current
behaviour and reasoning when that new hook says "dunno".

I already started an RFC here:

https://gcc.gnu.org/ml/gcc/2017-07/msg00080.html


Johann


Re: Add support to trace comparison instructions and switch statements

2017-07-13 Thread Dmitry Vyukov via gcc-patches
On Thu, Jul 13, 2017 at 12:41 PM, Wish Wu  wrote:
> Hi
>
> In fact, under Linux, with the "return address" and the file
> "/proc/self/maps", we can give a unique id to every comparison.

Yes, it's doable. But you expressed worries about the performance hit of
merging callbacks for different sizes. Mapping a pc plus info from
/proc/self/maps to a unique id via an external map is an order of
magnitude slower than the hit of merged callbacks.
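
A minimal sketch of the guard-style scheme referred to above (all names
here are hypothetical, not the existing __sanitizer_cov_* API): the
compiler emits one global per comparison site, and the runtime numbers
the sites once at module load, so no pc-to-module lookup happens on the
hot path.

  struct cmp_site { unsigned id; };   /* one static instance per site */

  /* Hypothetical runtime hook that consumes the stable id.  */
  void record_hit (unsigned id, unsigned a, unsigned b);

  void
  trace_cmp4_with_site (unsigned a, unsigned b, struct cmp_site *site)
  {
    /* site->id is stable under ASLR: it was assigned once at module
       init rather than derived from the return address.  */
    record_hit (site->id, a, b);
  }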


> For fuzzing, we may give 3 bits to every comparison as a marker of
> whether "<", "==" or ">" was observed. :D
>
> With Regards
> Wish Wu of Ant-financial Light-Year Security Lab
>
> On Thu, Jul 13, 2017 at 6:04 PM, Wish Wu  wrote:
>> Hi
>>
>> In my perspective:
>>
>> 1. Do we need to assign a unique id to every comparison ?
>> Yes, I suggest implementing it like -fsanitize-coverage=trace-pc-guard .
>> Because some fuzzing targets may invoke dlopen()-like functions to
>> load libraries (modules) after fork(), while these libraries are
>> compiled with trace-cmp as well.
>> With ASLR enabled by the linker and/or kernel, the return address can't
>> be a unique id for every comparison.
>>
>> 2. Should we merge cmp1(),cmp2(),cmp4(),cmp8(),cmpf(),cmpd() into one cmp() ?
>> No, it may reduce the performance of fuzzing, and it may waste
>> registers. But since "switch" statements are much rarer than "if",
>> I can forgive "switch"'s wasteful behavior.
>>
>> 3. Should we record operators (<, >, ==, <= ...) ?
>> Probably not. "<", "==" and ">" are all meaningful as comparisons,
>> because programmers must have some reason to use them. In practice,
>> "==" is the most meaningful.
>>
>> 4. Should we record comparisons for counting loop checks ?
>> Not sure.
>>
>> With Regards
>> Wish Wu of Ant-financial Light-Year Security Lab
>>
>> On Thu, Jul 13, 2017 at 4:09 PM, Dmitry Vyukov  wrote:
>>> On Tue, Jul 11, 2017 at 1:59 PM, Wish Wu  wrote:
 Hi

 I wrote a test for "-fsanitize-coverage=trace-cmp" .

 Could anybody tell me whether this code could be merged into gcc ?
>>>
>>>
>>> Nice!
>>>
>>> We are currently working on Linux kernel fuzzing that use the
>>> comparison tracing. We use clang at the moment, but having this
>>> support in gcc would be great for kernel land.
>>>
>>> One concern I have: do we want to do some final refinements to the API
>>> before we implement this in both compilers?
>>>
>>> 2 things we considered from our perspective:
>>>  - communicating to the runtime which operands are constants
>>>  - communicating to the runtime which comparisons are counting loop checks
>>>
>> The first is useful if you do the "find one operand in the input and
>> replace it with the other one" thing. The second is useful because
>> counting loop checks are usually not useful (at least all but one).
>>> In the original Go implementation I also conveyed signedness of
>>> operands, exact comparison operation (<, >, etc):
>>> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-defs/defs.go#L13
>>> But I did not find any use for that.
>>> I also gave all comparisons unique IDs:
>>> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-dep/sonar.go#L24
>>> That turned out to be useful. And there are chances we will want this
>>> for C/C++ as well.
>>>
>>> Kostya, did anything like this pop up in your work on libfuzzer?
>>> Can we still change the clang API? At least add an additional argument
>>> to the callbacks?
>>>
>>> At the very least I would suggest that we add an additional arg that
>>> contains some flags (1/2 arg is a const, this is counting loop check,
>>> etc). If we do that we can also have just 1 callback that accepts
>>> uint64's for args because we can pass operand size in the flags:
>>>
>>> void __sanitizer_cov_trace_cmp(uint64 arg1, uint64 arg2, uint64 flags);
>>>
>>> But I wonder if 3 uint64 args will be too inefficient for 32 bit archs?...
>>>
>>> If we create a global per comparison then we could put the flags into
>>> the global:
>>>
>>> void __sanitizer_cov_trace_cmp(uint64 arg1, uint64 arg2, something_t 
>>> *global);
>>>
>>> Thoughts?
>>>
>>>
>>>
>>>
 Index: gcc/testsuite/gcc.dg/sancov/basic3.c
 ===
 --- gcc/testsuite/gcc.dg/sancov/basic3.c (nonexistent)
 +++ gcc/testsuite/gcc.dg/sancov/basic3.c (working copy)
 @@ -0,0 +1,42 @@
 +/* Basic test on number of inserted callbacks.  */
 +/* { dg-do compile } */
 +/* { dg-options "-fsanitize-coverage=trace-cmp -fdump-tree-optimized" } */
 +
 +void foo(char *a, short *b, int *c, long long *d, float *e, double *f)
 +{
 +  if (*a)
 +*a += 1;
 +  if (*b)
 +*b = *a;
 +  if (*c)
 +*c += 1;
 +  if(*d)
 +*d = *c;
 +  if(*e == *c)
 +*e = *c;
 +  if(*f == *e)
 +*f = *e;
 +  switch(*a)
 +{
 +case 2:
 +  *b += 2;
 +  break;
 +default:

Re: Add support to trace comparison instructions and switch statements

2017-07-13 Thread Wish Wu
Hi

In fact, under Linux, with the "return address" and the file
"/proc/self/maps", we can give a unique id to every comparison.

For fuzzing, we may give 3 bits to every comparison as a marker of
whether "<", "==" or ">" was observed. :D

With Regards
Wish Wu of Ant-financial Light-Year Security Lab

On Thu, Jul 13, 2017 at 6:04 PM, Wish Wu  wrote:
> Hi
>
> In my perspective:
>
> 1. Do we need to assign a unique id to every comparison ?
> Yes, I suggest implementing it like -fsanitize-coverage=trace-pc-guard .
> Because some fuzzing targets may invoke dlopen()-like functions to
> load libraries (modules) after fork(), while these libraries are
> compiled with trace-cmp as well.
> With ASLR enabled by the linker and/or kernel, the return address can't
> be a unique id for every comparison.
>
> 2. Should we merge cmp1(),cmp2(),cmp4(),cmp8(),cmpf(),cmpd() into one cmp() ?
> No, it may reduce the performance of fuzzing, and it may waste
> registers. But since "switch" statements are much rarer than "if",
> I can forgive "switch"'s wasteful behavior.
>
> 3. Should we record operators (<, >, ==, <= ...) ?
> Probably not. "<", "==" and ">" are all meaningful as comparisons,
> because programmers must have some reason to use them. In practice,
> "==" is the most meaningful.
>
> 4. Should we record comparisons for counting loop checks ?
> Not sure.
>
> With Regards
> Wish Wu of Ant-financial Light-Year Security Lab
>
> On Thu, Jul 13, 2017 at 4:09 PM, Dmitry Vyukov  wrote:
>> On Tue, Jul 11, 2017 at 1:59 PM, Wish Wu  wrote:
>>> Hi
>>>
>>> I wrote a test for "-fsanitize-coverage=trace-cmp" .
>>>
>>> Is there anybody tells me if these codes could be merged into gcc ?
>>
>>
>> Nice!
>>
>> We are currently working on Linux kernel fuzzing that uses the
>> comparison tracing. We use clang at the moment, but having this
>> support in gcc would be great for kernel land.
>>
>> One concern I have: do we want to do some final refinements to the API
>> before we implement this in both compilers?
>>
>> 2 things we considered from our perspective:
>>  - communicating to the runtime which operands are constants
>>  - communicating to the runtime which comparisons are counting loop checks
>>
>> First is useful if you do "find one operand in input and replace with
>> the other one" thing. Second is useful because counting loop checks
>> are usually not useful (at least all but one).
>> In the original Go implementation I also conveyed signedness of
>> operands, exact comparison operation (<, >, etc):
>> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-defs/defs.go#L13
>> But I did not find any use for that.
>> I also gave all comparisons unique IDs:
>> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-dep/sonar.go#L24
>> That turned out to be useful. And there are chances we will want this
>> for C/C++ as well.
>>
>> Kostya, did anything like this pop up in your work on libfuzzer?
>> Can we still change the clang API? At least add an additional argument
>> to the callbacks?
>>
>> At the very least I would suggest that we add an additional arg that
>> contains some flags (1/2 arg is a const, this is counting loop check,
>> etc). If we do that we can also have just 1 callback that accepts
>> uint64's for args because we can pass operand size in the flags:
>>
>> void __sanitizer_cov_trace_cmp(uint64 arg1, uint64 arg2, uint64 flags);
>>
>> But I wonder if 3 uint64 args will be too inefficient for 32 bit archs?...
>>
>> If we create a global per comparison then we could put the flags into
>> the global:
>>
>> void __sanitizer_cov_trace_cmp(uint64 arg1, uint64 arg2, something_t 
>> *global);
>>
>> Thoughts?
>>
>>
>>
>>
>>> Index: gcc/testsuite/gcc.dg/sancov/basic3.c
>>> ===
>>> --- gcc/testsuite/gcc.dg/sancov/basic3.c (nonexistent)
>>> +++ gcc/testsuite/gcc.dg/sancov/basic3.c (working copy)
>>> @@ -0,0 +1,42 @@
>>> +/* Basic test on number of inserted callbacks.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-fsanitize-coverage=trace-cmp -fdump-tree-optimized" } */
>>> +
>>> +void foo(char *a, short *b, int *c, long long *d, float *e, double *f)
>>> +{
>>> +  if (*a)
>>> +*a += 1;
>>> +  if (*b)
>>> +*b = *a;
>>> +  if (*c)
>>> +*c += 1;
>>> +  if(*d)
>>> +*d = *c;
>>> +  if(*e == *c)
>>> +*e = *c;
>>> +  if(*f == *e)
>>> +*f = *e;
>>> +  switch(*a)
>>> +{
>>> +case 2:
>>> +  *b += 2;
>>> +  break;
>>> +default:
>>> +  break;
>>> +}
>>> +  switch(*d)
>>> +{
>>> +case 3:
>>> +  *d += 3;
>>> +case -4:
>>> +  *d -= 4;
>>> +}
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times
>>> "__builtin___sanitizer_cov_trace_cmp1 \\(" 1 "optimized" } } */
>>> +/* { dg-final { scan-tree-dump-times
>>> "__builtin___sanitizer_cov_trace_cmp2 \\(" 1 "optimized" } } */
>>> +/* { dg-final { scan-tree-dump-times
>>> 

RE: [Arm] Obsoleting Command line option -mstructure-size-boundary in eabi configurations

2017-07-13 Thread Michael Collison
Updated per Richard's comments and suggestions.

Okay for trunk?

2017-07-10  Michael Collison  

* config/arm/arm.c (arm_option_override): Deprecate
use of -mstructure-size-boundary.
* config/arm/arm.opt: Deprecate -mstructure-size-boundary.
* doc/invoke.texi: Deprecate -mstructure-size-boundary.

-Original Message-
From: Richard Earnshaw (lists) [mailto:richard.earns...@arm.com] 
Sent: Thursday, July 6, 2017 3:17 AM
To: Michael Collison ; GCC Patches 

Cc: nd 
Subject: Re: [Arm] Obsoleting Command line option -mstructure-size-boundary in 
eabi configurations

On 06/07/17 06:46, Michael Collison wrote:
> NetBSD/Arm requires DEFAULT_STRUCTURE_SIZE_BOUNDARY (see
> config/arm/netbsd-elf.h for details).  This patch disallows
> -mstructure-size-boundary on NetBSD if the value is not equal to
> DEFAULT_STRUCTURE_SIZE_BOUNDARY.
> 
> Okay for trunk?
> 
> 2017-07-05  Michael Collison  
> 
>   * config/arm/arm.c (arm_option_override): Disallow
>   -mstructure-size-boundary on netbsd if value is not
>   DEFAULT_STRUCTURE_SIZE_BOUNDARY.
> 
> 

Frankly, I'd rather we moved towards obsoleting this option entirely.
The origins are from the days of the APCS (note, not AAPCS), when the
default was 32 while most of the world expected 8.

Now that the AAPCS is widely adopted, APCS is obsolete (NetBSD uses
ATPCS), and NetBSD (the only port not based on AAPCS these days) defaults
to 8, I can't see why anybody would now be interested in using a
different value.

So let's just mark this option as deprecated (emit a warning if

global_options_set.x_arm_structure_size_boundary

is ever set by the user, regardless of value).  Then in GCC 9 we can perhaps 
remove this code entirely.
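
A sketch of that check (hypothetical placement in arm_option_override;
Michael's follow-up patch may differ in detail):

  /* Deprecate any user-supplied -mstructure-size-boundary, whatever
     the value given.  */
  if (global_options_set.x_arm_structure_size_boundary)
    {
      warning (0, "option %<-mstructure-size-boundary%> is deprecated");
      arm_structure_size_boundary = DEFAULT_STRUCTURE_SIZE_BOUNDARY;
    }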

Documentation and release notes will need corresponding updates as well.

R.

> pr1556.patch
> 
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index bc1e607..911c272 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -3471,7 +3471,18 @@ arm_option_override (void)
>  }
>else
>  {
> -  if (arm_structure_size_boundary != 8
> +  /* Do not allow structure size boundary to be overridden for
> +     netbsd.  */
> +
> +  if ((arm_abi == ARM_ABI_ATPCS)
> +   && (arm_structure_size_boundary != DEFAULT_STRUCTURE_SIZE_BOUNDARY))
> + {
> +   warning (0,
> +            "option %<-mstructure-size-boundary%> is deprecated for netbsd; "
> +            "defaulting to %d",
> +            DEFAULT_STRUCTURE_SIZE_BOUNDARY);
> +   arm_structure_size_boundary = DEFAULT_STRUCTURE_SIZE_BOUNDARY;
> + }
> +  else if (arm_structure_size_boundary != 8
> && arm_structure_size_boundary != 32
> && !(ARM_DOUBLEWORD_ALIGN && arm_structure_size_boundary == 64))
>   {
> 



pr1556v2.patch


Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-13 Thread Michael Matz
Hello,

On Tue, 11 Jul 2017, Jeff Law wrote:

> This patch series is designed to mitigate the problems exposed by the
> stack-clash exploits.  As I've noted before, the way to address this
> class of problems is via a good stack probing strategy.
> 
> This has taken much longer than expected to pull together for
> submission.  Sorry about that.  However, the delay has led to some clear
> improvements on ppc, aarch64 and s390 as well as tests which aren't
> eyeballed, but instead are part of the testsuite.
> 
> This series introduces -fstack-check=clash which is a variant of
> -fstack-check designed to prevent "jumping the stack" as seen in the
> stack-clash exploits.

FWIW, this is the patch we're going to use in our older compilers (back up
to 4.1, meh) in one variant or another.  It only probes for dynamic
allocations, not for static stack frames.  And it probes more often than
strictly necessary.  But on the plus side it is completely target
independent (except for STACK_GROWS_DOWNWARD, which it doesn't handle
because we don't have hppa) and only 70 lines, doesn't interact with any of
the hairy existing stack checking code, and it's easy to see that it does
the right thing :)

(This particular variant is for 4.3, but the code of
allocate_dynamic_stack_space() has been essentially stable for a very long
time, which is another plus of this patch: it's easy to back- and
forward-port :) )

I'm not suggesting this for inclusion, but in case others are in a similar 
position of having to deal with old compilers and are fine with the above, 
they might find this useful.
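
In rough C terms, and assuming a downward-growing stack with a 4096-byte
probe interval, the loop it emits is equivalent to this sketch (editorial
pseudo-code, not the actual RTL):

#define PROBE_INTERVAL 4096

/* Allocate SIZE dynamic bytes below SP at most one probe interval at a
   time, touching each interval so the guard page cannot be jumped over.  */
static char *
alloc_with_probes (char *sp, unsigned long size)
{
  while (size >= PROBE_INTERVAL)
    {
      sp -= PROBE_INTERVAL;         /* allocate one interval */
      *(volatile char *) sp = 0;    /* probe it */
      size -= PROBE_INTERVAL;
    }
  sp -= size;                       /* the remaining partial interval */
  *(volatile char *) sp = 0;        /* final probe */
  return sp;
}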


Ciao,
Michael.
--- gcc/common.opt.mm   2017-06-26 16:07:55.0 +0200
+++ gcc/common.opt  2017-06-26 16:05:27.0 +0200
@@ -966,6 +966,10 @@ fstack-check
 Common Report Var(flag_stack_check)
 Insert stack checking code into the program
 
+fstack-probe
+Common Report Var(flag_stack_probe)
+Insert stack checking code into the program
+
 fstack-limit
 Common
 
--- gcc/explow.c.mm 2008-11-05 22:19:47.0 +0100
+++ gcc/explow.c	2017-06-26 17:31:25.0 +0200
@@ -1071,6 +1071,9 @@ update_nonlocal_goto_save_area (void)
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  rtx loop_lab, end_lab, last_size;
+  int probe_pass = 0;
+
   /* If we're asking for zero bytes, it doesn't matter what we point
  to since we can't dereference it.  But return a reasonable
  address anyway.  */
@@ -1203,6 +1206,24 @@ allocate_dynamic_stack_space (rtx size,
 
   mark_reg_pointer (target, known_align);
 
+  if (flag_stack_probe)
+{
+  size = copy_to_mode_reg (Pmode, convert_to_mode (Pmode, size, 1));
+  loop_lab = gen_label_rtx ();
+  end_lab = gen_label_rtx ();
+  emit_label (loop_lab);
+#ifndef STACK_GROWS_DOWNWARD
+#error stack must grow down
+#endif
+  emit_cmp_and_jump_insns (size, GEN_INT (STACK_CHECK_PROBE_INTERVAL), LTU,
+  NULL_RTX, Pmode, 1, end_lab);
+  last_size = expand_binop (Pmode, sub_optab, size,
+			    GEN_INT (STACK_CHECK_PROBE_INTERVAL), size,
+			    1, OPTAB_WIDEN);
+  gcc_assert (last_size == size);
+  size = GEN_INT (STACK_CHECK_PROBE_INTERVAL);
+}
+
+again:
   /* Perform the required allocation from the stack.  Some systems do
  this differently than simply incrementing/decrementing from the
  stack pointer, such as acquiring the space by calling malloc().  */
@@ -1264,6 +1285,15 @@ allocate_dynamic_stack_space (rtx size,
   emit_move_insn (target, virtual_stack_dynamic_rtx);
 #endif
 }
+  if (flag_stack_probe && probe_pass == 0)
+{
+  probe_pass = 1;
+  emit_stack_probe (target);
+  emit_jump (loop_lab);
+  emit_label (end_lab);
+  size = last_size;
+  goto again;
+}
 
   if (MUST_ALIGN)
 {
@@ -1280,6 +1310,8 @@ allocate_dynamic_stack_space (rtx size,
GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
NULL_RTX, 1);
 }
+  if (flag_stack_probe)
+emit_stack_probe (target);
 
   /* Record the new stack level for nonlocal gotos.  */
   if (cfun->nonlocal_goto_save_area != 0)


Re: Add support to trace comparison instructions and switch statements

2017-07-13 Thread Wish Wu
Hi

In my perspective:

1. Do we need to assign a unique id to every comparison?
Yes; I suggest implementing it like -fsanitize-coverage=trace-pc-guard,
because some fuzzing targets may invoke dlopen()-like functions to
load libraries (modules) after fork(), while these libraries are
compiled with trace-cmp as well.
With ASLR enabled by the linker and/or kernel, the return address can't
be a unique id for every comparison.

2. Should we merge cmp1(),cmp2(),cmp4(),cmp8(),cmpf(),cmpd() into one cmp()?
No. It may reduce the performance of fuzzing, and it may waste
registers. But since there are far fewer "switch" statements than "if"
statements, I can forgive "switch"'s waste there.

3. Should we record operators (<, >, ==, <=, ...)?
Probably not. As comparisons, "<", "==" and ">" are all meaningful,
because programmers must have some reason for choosing them. In
practice, "==" is the most meaningful.

4. Should we record comparisons for counting loop checks?
Not sure.

With Regards
Wish Wu of Ant-financial Light-Year Security Lab
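
For concreteness, here is one possible encoding of the flags argument
proposed in the quoted message below.  The bit assignments are purely
illustrative, not an agreed API:

#include <stdint.h>

/* Illustrative flags layout for a unified trace-cmp callback:
   bits 0-2  log2 of the operand size in bytes,
   bit 3/4   arg1/arg2 is a compile-time constant,
   bit 5     the comparison is a counting-loop exit check.  */
enum
{
  CMP_SIZE_MASK  = 0x7,
  CMP_CONST_ARG1 = 1 << 3,
  CMP_CONST_ARG2 = 1 << 4,
  CMP_LOOP_CHECK = 1 << 5
};

void
__sanitizer_cov_trace_cmp (uint64_t arg1, uint64_t arg2, uint64_t flags)
{
  unsigned size = 1u << (flags & CMP_SIZE_MASK);
  if (flags & CMP_LOOP_CHECK)
    return;   /* counting-loop checks are usually uninteresting */
  /* ... record (size, arg1, arg2) for "find and replace" mutation ... */
  (void) size; (void) arg1; (void) arg2;
}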

On Thu, Jul 13, 2017 at 4:09 PM, Dmitry Vyukov  wrote:
> On Tue, Jul 11, 2017 at 1:59 PM, Wish Wu  wrote:
>> Hi
>>
>> I wrote a test for "-fsanitize-coverage=trace-cmp" .
>>
>> Is there anybody tells me if these codes could be merged into gcc ?
>
>
> Nice!
>
> We are currently working on Linux kernel fuzzing that uses the
> comparison tracing. We use clang at the moment, but having this
> support in gcc would be great for kernel land.
>
> One concern I have: do we want to do some final refinements to the API
> before we implement this in both compilers?
>
> 2 things we considered from our perspective:
>  - communicating to the runtime which operands are constants
>  - communicating to the runtime which comparisons are counting loop checks
>
> First is useful if you do "find one operand in input and replace with
> the other one" thing. Second is useful because counting loop checks
> are usually not useful (at least all but one).
> In the original Go implementation I also conveyed signedness of
> operands, exact comparison operation (<, >, etc):
> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-defs/defs.go#L13
> But I did not find any use for that.
> I also gave all comparisons unique IDs:
> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-dep/sonar.go#L24
> That turned out to be useful. And there are chances we will want this
> for C/C++ as well.
>
> Kostya, did anything like this pop up in your work on libfuzzer?
> Can we still change the clang API? At least add an additional argument
> to the callbacks?
>
> At the very least I would suggest that we add an additional arg that
> contains some flags (1/2 arg is a const, this is counting loop check,
> etc). If we do that we can also have just 1 callback that accepts
> uint64's for args because we can pass operand size in the flags:
>
> void __sanitizer_cov_trace_cmp(uint64 arg1, uint64 arg2, uint64 flags);
>
> But I wonder if 3 uint64 args will be too inefficient for 32 bit archs?...
>
> If we create a global per comparison then we could put the flags into
> the global:
>
> void __sanitizer_cov_trace_cmp(uint64 arg1, uint64 arg2, something_t *global);
>
> Thoughts?
>
>
>
>
>> Index: gcc/testsuite/gcc.dg/sancov/basic3.c
>> ===
>> --- gcc/testsuite/gcc.dg/sancov/basic3.c (nonexistent)
>> +++ gcc/testsuite/gcc.dg/sancov/basic3.c (working copy)
>> @@ -0,0 +1,42 @@
>> +/* Basic test on number of inserted callbacks.  */
>> +/* { dg-do compile } */
>> +/* { dg-options "-fsanitize-coverage=trace-cmp -fdump-tree-optimized" } */
>> +
>> +void foo(char *a, short *b, int *c, long long *d, float *e, double *f)
>> +{
>> +  if (*a)
>> +*a += 1;
>> +  if (*b)
>> +*b = *a;
>> +  if (*c)
>> +*c += 1;
>> +  if(*d)
>> +*d = *c;
>> +  if(*e == *c)
>> +*e = *c;
>> +  if(*f == *e)
>> +*f = *e;
>> +  switch(*a)
>> +{
>> +case 2:
>> +  *b += 2;
>> +  break;
>> +default:
>> +  break;
>> +}
>> +  switch(*d)
>> +{
>> +case 3:
>> +  *d += 3;
>> +case -4:
>> +  *d -= 4;
>> +}
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times
>> "__builtin___sanitizer_cov_trace_cmp1 \\(" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times
>> "__builtin___sanitizer_cov_trace_cmp2 \\(" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times
>> "__builtin___sanitizer_cov_trace_cmp4 \\(" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times
>> "__builtin___sanitizer_cov_trace_cmp8 \\(" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times
>> "__builtin___sanitizer_cov_trace_cmpf \\(" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times
>> "__builtin___sanitizer_cov_trace_cmpd \\(" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times
>> "__builtin___sanitizer_cov_trace_switch \\(" 2 "optimized" } } */
>>
>>
>> With Regards
>> Wish Wu
>>
>> On 

[ARM, VXworks] Fix build

2017-07-13 Thread Richard Earnshaw (lists)
My patch last week to address selection of be8 linking mode broke the
build for vxworks.  It turns out that this port is one of the few
remaining that is still not based on the EABI/AAPCS.

This patch fixes the build, but I've not really tested it beyond
building the core compiler binaries.  Building a workable compiler
entails downloading a load of VxWorks components, and I'm not sure where to find them.

The port is also *very* out-of-date.  Not only does it not use the EABI,
but it hasn't had support for any core added since ARMv5 (and ARMv6 was
announced in 2002)!

I therefore propose that we consider this port for deprecation.

* config/arm/vxworks.h (TARGET_ENDIAN_DEFAULT): Define.
diff --git a/gcc/config/arm/vxworks.h b/gcc/config/arm/vxworks.h
index 9af37c7..f20324f 100644
--- a/gcc/config/arm/vxworks.h
+++ b/gcc/config/arm/vxworks.h
@@ -117,3 +117,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 /* This platform supports the probing method of stack checking (RTP mode).
8K is reserved in the stack to propagate exceptions in case of overflow.  */
 #define STACK_CHECK_PROTECT 8192
+
+/* Unless overridden by the target options, the default is little-endian.  */
+#define TARGET_ENDIAN_DEFAULT 0


Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-13 Thread Christophe Lyon
Hi Jeff,


On 11 July 2017 at 23:19, Jeff Law  wrote:
> This patch series is designed to mitigate the problems exposed by the
> stack-clash exploits.  As I've noted before, the way to address this
> class of problems is via a good stack probing strategy.
>
> This has taken much longer than expected to pull together for
> submission.  Sorry about that.  However, the delay has led to some clear
> improvements on ppc, aarch64 and s390 as well as tests which aren't
> eyeballed, but instead are part of the testsuite.
>
> This series introduces -fstack-check=clash which is a variant of
> -fstack-check designed to prevent "jumping the stack" as seen in the
> stack-clash exploits.
>
>
>
> The key ideas here:
>
> Individual stack allocations are never more than PROBE_INTERVAL in size
> (4k by default).  Larger allocations are broken up into PROBE_INTERVAL
> chunks and each chunk is probed as it is allocated.
>
> No combination of stack allocations can exceed PROBE_INTERVAL bytes
> without probing.  ie, if we have an allocation of 2k and a later
> allocation of 3k, then there must be a stack probe into the first 4k of
> allocated space that executes between the two allocations.
>
> We must consider an environment where code compiled without stack
> probing is linked statically or dynamically with code that is compiled
> with stack probing.  That is actually the most likely scenario for an
> indefinite period of time.  Thus we have to consider the possibility of
> a hostile caller in the call stack.
>
> We need not guarantee enough stack space to handle a signal if a probe
> hits the guard page.
>
> --
>
>
> Probes come in two forms.  They can be explicit or implicit.
>
> Explicit probes are emitted by prologue generation or dynamic stack
> allocation routines.  These are net new code and avoiding them when it
> is safe to do so helps reduce the overhead of stack probing.
>
> Implicit probes are "probes" that occur as a natural side effect of the
> existing code or guarantees provided by the ABI.  They are essentially
> free and may allow the compiler to avoid some explicit probes.
>
> Examples of implicit probes include
>
>   1. ISA which pushes the return address onto the stack in a call
>  instruction (x86)
>
>   2. ABI mandates that *sp always contain a backchain pointer (ppc)
>
>   3. Prologue stores a register into the stack.  We exploit this on
>  aarch64 and s390.  On s390 register saves go into the caller's
>  stack frame, on aarch64 register saves hit newly allocated
>  space in the callee's frame.  We can exploit both to avoid
>  some explicit probing.
>
> I've done implementations for x86, ppc, aarch64 and s390 and the
> included tests have been checked against those targets
> ($arch-unknown-linux).
>
> This patch does not change the probing insn itself.  We've had various
> discussions on-list on a better probe insn for x86.  I think the
> consensus is to avoid read-modify-write insns.  A testb may ultimately
> be best.  This is IMHO an independent implementation detail for each
> target and should be handled as a follow-up.  But if folks insist, it's
> a trivial change to make as it doesn't fundamentally affect how all this
> stuff works.
>
> Other targets that have an existing -fstack-check=specific, but for
> which I have not added a -fstack-check=clash implementation get partial
> protection against stack clash as well.  This is a side effect of
> keeping some of the early code we'd hoped to use to avoid writing a new
> probe implementation for each target.
>
> --
>
> To get a sense of overhead, just 1.5% of routines in glibc need probing
> in their prologues (x86) in the testing I performed.  IIRC each and
> every one of those routines needed just 1-4 inlined probes.
>
> Significantly more functions need alloca space probed (IIRC ~5%), but
> given the amazingly inefficient alloca code, I can't believe anyone will
> ever notice the probing overhead.
>
> --
>
>
> Patch #1 contains the new option -fstack-check=clash and some dejagnu
> infrastructure  (most of which is unused until later patches)
>
> Patch #2 adds the new style probing support to the alloca/vla area and
> indirects uses of STACK_CHECK_PROTECT through get_stack_check_protect.
>
> Patch #3 Add some generic dumping support for use by the target prologue
> expanders
>
> Patch #4 introduces the x86 specific bits
>
> Patch #5 addresses combine-stack-adjustments interactions with
> -fstack-check=clash
>
> Patch #6 adds PPC support
>
> Patch #7 adds aarch64 support
>
> Patch #8 adds s390 support
>
> The patch series has been bootstrapped and regression tested on
> x86_64-linux-gnu
> ppc64-linux-gnu
> ppc64le-linux-gnu
> aarch64-linux-gnu
> s390x-linux-gnu (another respin of this is still in-progress)
>
> Additionally, each target has been bootstrapped with -fstack-check=clash
> enabled by default, the testsuite run and checked for glaring errors.
>
> Earlier versions have also bootstrapped on 32bit PPC and 

[77/77] Add a complex_mode class

2017-07-13 Thread Richard Sandiford
This patch adds another machine_mode wrapper for modes that are
known to be COMPLEX_MODE_P.  There aren't yet many places that make
use of it, but that might change in future.
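
As a sketch of the intended usage pattern (complex_inner_mode here is a
made-up example helper, not part of the patch):

/* Narrow a machine_mode to complex_mode before taking its inner mode;
   after a successful is_complex_*_mode check, CMODE is statically known
   to satisfy COMPLEX_MODE_P.  */
static scalar_mode
complex_inner_mode (machine_mode mode)
{
  complex_mode cmode;
  if (is_complex_int_mode (mode, &cmode)
      || is_complex_float_mode (mode, &cmode))
    return GET_MODE_INNER (cmode);
  gcc_unreachable ();
}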

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* coretypes.h (complex_mode): New type.
* gdbhooks.py (build_pretty_printer): Handle it.
* machmode.h (complex_mode): New class.
(complex_mode::includes_p): New function.
(is_complex_int_mode): Likewise.
(is_complex_float_mode): Likewise.
* genmodes.c (get_mode_class): Handle complex mode classes.
* function.c (expand_function_end): Use is_complex_int_mode.

gcc/go/
* go-lang.c (go_langhook_type_for_mode): Use is_complex_float_mode.

Index: gcc/coretypes.h
===
--- gcc/coretypes.h 2017-07-13 09:19:00.088160188 +0100
+++ gcc/coretypes.h 2017-07-13 09:19:00.526129740 +0100
@@ -58,6 +58,7 @@ typedef const struct rtx_def *const_rtx;
 class scalar_mode;
 class scalar_int_mode;
 class scalar_float_mode;
+class complex_mode;
 template<typename T> class opt_mode;
 typedef opt_mode<scalar_mode> opt_scalar_mode;
 typedef opt_mode<scalar_int_mode> opt_scalar_int_mode;
@@ -323,6 +324,7 @@ #define const_tree union _dont_use_tree_
 typedef struct scalar_mode scalar_mode;
 typedef struct scalar_int_mode scalar_int_mode;
 typedef struct scalar_float_mode scalar_float_mode;
+typedef struct complex_mode complex_mode;
 
 #endif
 
Index: gcc/gdbhooks.py
===
--- gcc/gdbhooks.py 2017-07-13 09:19:00.090160049 +0100
+++ gcc/gdbhooks.py 2017-07-13 09:19:00.527129670 +0100
@@ -551,7 +551,8 @@ def build_pretty_printer():
 pp.add_printer_for_types(['scalar_int_mode_pod',
   'scalar_mode_pod'],
  'pod_mode', MachineModePrinter)
-for mode in 'scalar_mode', 'scalar_int_mode', 'scalar_float_mode':
+for mode in ('scalar_mode', 'scalar_int_mode', 'scalar_float_mode',
+ 'complex_mode'):
 pp.add_printer_for_types([mode], mode, MachineModePrinter)
 
 return pp
Index: gcc/machmode.h
===
--- gcc/machmode.h  2017-07-13 09:19:00.090160049 +0100
+++ gcc/machmode.h  2017-07-13 09:19:00.528129601 +0100
@@ -451,6 +451,30 @@ scalar_mode::includes_p (machine_mode m)
 }
 }
 
+/* Represents a machine mode that is known to be a COMPLEX_MODE_P.  */
+class complex_mode
+{
+public:
+  typedef mode_traits<complex_mode>::from_int from_int;
+
+  ALWAYS_INLINE complex_mode () {}
+  ALWAYS_INLINE complex_mode (from_int m) : m_mode (machine_mode (m)) {}
+  ALWAYS_INLINE operator machine_mode () const { return m_mode; }
+
+  static bool includes_p (machine_mode);
+
+protected:
+  machine_mode m_mode;
+};
+
+/* Return true if M is a complex_mode.  */
+
+inline bool
+complex_mode::includes_p (machine_mode m)
+{
+  return COMPLEX_MODE_P (m);
+}
+
 /* Return the base GET_MODE_SIZE value for MODE.  */
 
 ALWAYS_INLINE unsigned short
@@ -770,6 +794,36 @@ is_float_mode (machine_mode mode, T *flo
   return true;
 }
   return false;
+}
+
+/* Return true if MODE has class MODE_COMPLEX_INT, storing it as
+   a complex_mode in *CMODE if so.  */
+
+template<typename T>
+inline bool
+is_complex_int_mode (machine_mode mode, T *cmode)
+{
+  if (GET_MODE_CLASS (mode) == MODE_COMPLEX_INT)
+{
+  *cmode = complex_mode (complex_mode::from_int (mode));
+  return true;
+}
+  return false;
+}
+
+/* Return true if MODE has class MODE_COMPLEX_FLOAT, storing it as
+   a complex_mode in *CMODE if so.  */
+
+template<typename T>
+inline bool
+is_complex_float_mode (machine_mode mode, T *cmode)
+{
+  if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT)
+{
+  *cmode = complex_mode (complex_mode::from_int (mode));
+  return true;
+}
+  return false;
 }
 
 namespace mode_iterator
Index: gcc/genmodes.c
===
--- gcc/genmodes.c  2017-07-13 09:18:53.274650323 +0100
+++ gcc/genmodes.c  2017-07-13 09:19:00.527129670 +0100
@@ -1152,6 +1152,10 @@ get_mode_class (struct mode_data *mode)
 case MODE_DECIMAL_FLOAT:
   return "scalar_float_mode";
 
+case MODE_COMPLEX_INT:
+case MODE_COMPLEX_FLOAT:
+  return "complex_mode";
+
 default:
   return NULL;
 }
Index: gcc/function.c
===
--- gcc/function.c  2017-07-13 09:18:53.273650396 +0100
+++ gcc/function.c  2017-07-13 09:19:00.527129670 +0100
@@ -5503,6 +5503,7 @@ expand_function_end (void)
  : DECL_REGISTER (decl_result))
{
  rtx real_decl_rtl = crtl->return_rtx;
+ complex_mode cmode;
 
  /* This should be set in assign_parms.  */
  gcc_assert (REG_FUNCTION_VALUE_P 

[76/77] Add a scalar_mode_pod class

2017-07-13 Thread Richard Sandiford
This patch adds a scalar_mode_pod class and uses it to
replace the machine_mode in fixed_value.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* coretypes.h (scalar_mode_pod): New typedef.
* gdbhooks.py (build_pretty_printer): Handle it.
* machmode.h (gt_ggc_mx, gt_pch_nx): New functions.
* fixed-value.h (fixed_value::mode): Change type to scalar_mode_pod.
* fold-const.c (fold_convert_const_int_from_fixed): Use scalar_mode.
* tree-streamer-in.c (unpack_ts_fixed_cst_value_fields): Use
as_a <scalar_mode>.

Index: gcc/coretypes.h
===
--- gcc/coretypes.h 2017-07-13 09:18:56.810392248 +0100
+++ gcc/coretypes.h 2017-07-13 09:19:00.088160188 +0100
@@ -63,6 +63,7 @@ typedef opt_mode<scalar_mode> opt_scalar_mode;
 typedef opt_mode<scalar_int_mode> opt_scalar_int_mode;
 typedef opt_mode<scalar_float_mode> opt_scalar_float_mode;
 template<typename T> class pod_mode;
+typedef pod_mode<scalar_mode> scalar_mode_pod;
 typedef pod_mode<scalar_int_mode> scalar_int_mode_pod;
 
 /* Subclasses of rtx_def, using indentation to show the class
Index: gcc/gdbhooks.py
===
--- gcc/gdbhooks.py 2017-07-13 09:18:56.812392104 +0100
+++ gcc/gdbhooks.py 2017-07-13 09:19:00.090160049 +0100
@@ -548,7 +548,8 @@ def build_pretty_printer():
  'opt_mode', OptMachineModePrinter)
 pp.add_printer_for_regex(r'pod_mode<(\S+)>',
  'pod_mode', MachineModePrinter)
-pp.add_printer_for_types(['scalar_int_mode_pod'],
+pp.add_printer_for_types(['scalar_int_mode_pod',
+  'scalar_mode_pod'],
  'pod_mode', MachineModePrinter)
 for mode in 'scalar_mode', 'scalar_int_mode', 'scalar_float_mode':
 pp.add_printer_for_types([mode], mode, MachineModePrinter)
Index: gcc/machmode.h
===
--- gcc/machmode.h  2017-07-13 09:18:59.187223319 +0100
+++ gcc/machmode.h  2017-07-13 09:19:00.090160049 +0100
@@ -894,4 +894,22 @@ #define FOR_EACH_2XWIDER_MODE(ITERATOR,
mode_iterator::iterate_p (&(ITERATOR)); \
mode_iterator::get_2xwider (&(ITERATOR)))
 
+template<typename T>
+void
+gt_ggc_mx (pod_mode<T> *)
+{
+}
+
+template<typename T>
+void
+gt_pch_nx (pod_mode<T> *)
+{
+}
+
+template<typename T>
+void
+gt_pch_nx (pod_mode<T> *, void (*) (void *, void *), void *)
+{
+}
+
 #endif /* not HAVE_MACHINE_MODES */
Index: gcc/fixed-value.h
===
--- gcc/fixed-value.h   2017-07-13 09:18:55.158511776 +0100
+++ gcc/fixed-value.h   2017-07-13 09:19:00.088160188 +0100
@@ -22,8 +22,8 @@ #define GCC_FIXED_VALUE_H
 
 struct GTY(()) fixed_value
 {
-  double_int data; /* Store data up to 2 wide integers.  */
-  machine_mode mode;   /* Use machine mode to know IBIT and FBIT.  */
+  double_int data;   /* Store data up to 2 wide integers.  */
+  scalar_mode_pod mode;  /* Use machine mode to know IBIT and FBIT.  */
 };
 
 #define FIXED_VALUE_TYPE struct fixed_value
Index: gcc/fold-const.c
===
--- gcc/fold-const.c2017-07-13 09:18:53.998596742 +0100
+++ gcc/fold-const.c2017-07-13 09:19:00.090160049 +0100
@@ -1952,7 +1952,7 @@ fold_convert_const_int_from_fixed (tree
 {
   tree t;
   double_int temp, temp_trunc;
-  machine_mode mode;
+  scalar_mode mode;
 
   /* Right shift FIXED_CST to temp by fbit.  */
   temp = TREE_FIXED_CST (arg1).data;
Index: gcc/tree-streamer-in.c
===
--- gcc/tree-streamer-in.c  2017-05-03 08:46:32.776861592 +0100
+++ gcc/tree-streamer-in.c  2017-07-13 09:19:00.090160049 +0100
@@ -208,7 +208,7 @@ unpack_ts_real_cst_value_fields (struct
 unpack_ts_fixed_cst_value_fields (struct bitpack_d *bp, tree expr)
 {
   FIXED_VALUE_TYPE *fp = ggc_alloc<fixed_value> ();
-  fp->mode = bp_unpack_machine_mode (bp);
+  fp->mode = as_a <scalar_mode> (bp_unpack_machine_mode (bp));
   fp->data.low = bp_unpack_var_len_int (bp);
   fp->data.high = bp_unpack_var_len_int (bp);
   TREE_FIXED_CST_PTR (expr) = fp;


[75/77] Use scalar_mode in the AArch64 port

2017-07-13 Thread Richard Sandiford
Similar to the previous scalar_int_mode patch.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* config/aarch64/aarch64-protos.h (aarch64_gen_adjusted_ldpstp):
Take a scalar_mode rather than a machine_mode.
(aarch64_operands_adjust_ok_for_ldpstp): Likewise.
* config/aarch64/aarch64.c (aarch64_simd_container_mode): Likewise.
(aarch64_operands_adjust_ok_for_ldpstp): Likewise.
(aarch64_gen_adjusted_ldpstp): Likewise.
(aarch64_expand_vector_init): Use scalar_mode instead of machine_mode.

Index: gcc/config/aarch64/aarch64-protos.h
===
--- gcc/config/aarch64/aarch64-protos.h 2017-07-13 09:18:50.737840686 +0100
+++ gcc/config/aarch64/aarch64-protos.h 2017-07-13 09:18:59.629192323 +0100
bool aarch64_atomic_ldop_supported_p (enum rtx_code);
 void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
-bool aarch64_gen_adjusted_ldpstp (rtx *, bool, machine_mode, RTX_CODE);
+bool aarch64_gen_adjusted_ldpstp (rtx *, bool, scalar_mode, RTX_CODE);
 #endif /* RTX_CODE */
 
 void aarch64_init_builtins (void);
@@ -471,7 +471,7 @@ int aarch64_ccmp_mode_to_code (machine_m
 
 bool extract_base_offset_in_addr (rtx mem, rtx *base, rtx *offset);
 bool aarch64_operands_ok_for_ldpstp (rtx *, bool, machine_mode);
-bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, machine_mode);
+bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, scalar_mode);
 
 extern void aarch64_asm_output_pool_epilogue (FILE *, const char *,
  tree, HOST_WIDE_INT);
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2017-07-13 09:18:58.445275699 +0100
+++ gcc/config/aarch64/aarch64.c2017-07-13 09:18:59.630192253 +0100
@@ -6,7 +6,7 @@ aarch64_vector_mode_supported_p (machine
 /* Return appropriate SIMD container
for MODE within a vector of WIDTH bits.  */
 static machine_mode
-aarch64_simd_container_mode (machine_mode mode, unsigned width)
+aarch64_simd_container_mode (scalar_mode mode, unsigned width)
 {
   gcc_assert (width == 64 || width == 128);
   if (TARGET_SIMD)
@@ -11882,7 +11882,7 @@ aarch64_simd_make_constant (rtx vals)
 aarch64_expand_vector_init (rtx target, rtx vals)
 {
   machine_mode mode = GET_MODE (target);
-  machine_mode inner_mode = GET_MODE_INNER (mode);
+  scalar_mode inner_mode = GET_MODE_INNER (mode);
   /* The number of vector elements.  */
   int n_elts = GET_MODE_NUNITS (mode);
   /* The number of vector elements which are not constant.  */
@@ -14684,7 +14684,7 @@ aarch64_operands_ok_for_ldpstp (rtx *ope
 
 bool
 aarch64_operands_adjust_ok_for_ldpstp (rtx *operands, bool load,
-  machine_mode mode)
+  scalar_mode mode)
 {
   enum reg_class rclass_1, rclass_2, rclass_3, rclass_4;
   HOST_WIDE_INT offval_1, offval_2, offval_3, offval_4, msize;
@@ -14818,7 +14818,7 @@ aarch64_operands_adjust_ok_for_ldpstp (r
 
 bool
 aarch64_gen_adjusted_ldpstp (rtx *operands, bool load,
-machine_mode mode, RTX_CODE code)
+scalar_mode mode, RTX_CODE code)
 {
   rtx base, offset, t1, t2;
   rtx mem_1, mem_2, mem_3, mem_4;


Re: [Patch][Aarch64] Refactor comments in aarch64_print_operand

2017-07-13 Thread James Greenhalgh
On Tue, Jul 11, 2017 at 05:29:11PM +0100, Jackson Woodruff wrote:
> Hi all,
> 
> This patch refactors comments in config/aarch64/aarch64.c
> (aarch64_print_operand) to provide a table of AArch64-specific
> formatting options.
> 
> I've tested the patch with a bootstrap and testsuite run on aarch64.
> 
> OK for trunk?

Hi Jackson,

Thanks for the patch, I have a few comments, but overall this looks
like a nice improvement.

> Changelog:
> 
> gcc/
> 
> 2017-07-04  Jackson Woodruff  
> 
>  * config/aarch64/aarch64.c (aarch64_print_operand):
>Move comments to top of function.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 037339d431d80c49699446e548d6b2707883b6a8..91bf4b3e9792e4ba01232f099ed844bdf23392fa
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -5053,12 +5053,39 @@ static const int aarch64_nzcv_codes[] =
>0  /* NV, Any.  */
>  };
>  
> +/* aarch64 specific string formatting commands:

s/aarch64/AArch64/
s/string/operand/

Most functions in GCC should have a comment describing the arguments they
take as well as what they do, so I suppose I'd prefer to see something like:

/* Print operand X to file F in a target specific manner according to CODE.
   The acceptable formatting commands given by CODE are:
   [...]

> + 'c':An integer or symbol address without a preceding # sign.
> + 'e':Print the sign/zero-extend size as a character 8->b,
> + 16->h, 32->w.
> + 'p':Prints N such that 2^N == X (X must be power of 2 and
> + const int).
> + 'P':Print the number of non-zero bits in X (a const_int).
> + 'H':Print the higher numbered register of a pair (TImode)
> + of regs.
> + 'm':Print a condition (eq, ne, etc).
> + 'M':Same as 'm', but invert condition.
> + 'b/q/h/s/d':Print a scalar FP/SIMD register name. 

Put these in size order - b/h/s/d/q

> + 'S/T/U/V':  Print the first FP/SIMD register name in a list.

It might be useful to expand in this comment what the difference is between
S T U and V.

> + 'R':Print a scalar FP/SIMD register name + 1.
> + 'X':Print bottom 16 bits of integer constant in hex.
> + 'w/x':  Print a general register name or the zero register
> + (32-bit or 64-bit).
> + '0':Print a normal operand, if it's a general register,
> + then we assume DImode.
> + 'k':Print nzcv.

This one doesn't make sense to me and could do with some clarification.  Maybe
"Print the <nzcv> field for CCMP."

Thanks,
James

> + 'A':Output address constant representing the first
> + argument of X, specifying a relocation offset
> + if appropriate.
> + 'L':Output constant address specified by X
> + with a relocation offset if appropriate.
> + 'G':Prints address of X, specifying a PC relative
> + relocation mode if appropriate.  */
> +



[74/77] Various small scalar_mode changes

2017-07-13 Thread Richard Sandiford
This patch uses scalar_mode in a few miscellaneous places:

- Previous patches mean mode_for_vector can take a scalar_mode without
  further changes.

- Implicit promotion is limited to scalar types (affects promote_mode
  and sdbout_parms)

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* machmode.h (mode_for_vector): Take a scalar_mode instead
of a machine_mode.
* stor-layout.c (mode_for_vector): Likewise.
* explow.c (promote_mode): Use as_a <scalar_mode>.
* sdbout.c (sdbout_parms): Use is_a <scalar_mode>.

Index: gcc/machmode.h
===
--- gcc/machmode.h  2017-07-13 09:18:56.812392104 +0100
+++ gcc/machmode.h  2017-07-13 09:18:59.187223319 +0100
@@ -651,7 +651,7 @@ extern machine_mode bitwise_mode_for_mod
 /* Return a mode that is suitable for representing a vector,
or BLKmode on failure.  */
 
-extern machine_mode mode_for_vector (machine_mode, unsigned);
+extern machine_mode mode_for_vector (scalar_mode, unsigned);
 
 /* A class for iterating through possible bitfield modes.  */
 class bit_field_mode_iterator
Index: gcc/stor-layout.c
===
--- gcc/stor-layout.c   2017-07-13 09:18:53.998596742 +0100
+++ gcc/stor-layout.c   2017-07-13 09:18:59.187223319 +0100
@@ -478,7 +478,7 @@ bitwise_type_for_mode (machine_mode mode
is no suitable mode.  */
 
 machine_mode
-mode_for_vector (machine_mode innermode, unsigned nunits)
+mode_for_vector (scalar_mode innermode, unsigned nunits)
 {
   machine_mode mode;
 
Index: gcc/explow.c
===
--- gcc/explow.c2017-07-13 09:18:54.682546579 +0100
+++ gcc/explow.c2017-07-13 09:18:59.186223389 +0100
@@ -787,6 +787,7 @@ promote_mode (const_tree type ATTRIBUTE_
 #ifdef PROMOTE_MODE
   enum tree_code code;
   int unsignedp;
+  scalar_mode smode;
 #endif
 
   /* For libcalls this is invoked without TYPE from the backends
@@ -806,9 +807,11 @@ promote_mode (const_tree type ATTRIBUTE_
 {
 case INTEGER_TYPE:   case ENUMERAL_TYPE:   case BOOLEAN_TYPE:
 case REAL_TYPE:  case OFFSET_TYPE: case FIXED_POINT_TYPE:
-  PROMOTE_MODE (mode, unsignedp, type);
+  /* Values of these types always have scalar mode.  */
+  smode = as_a <scalar_mode> (mode);
+  PROMOTE_MODE (smode, unsignedp, type);
   *punsignedp = unsignedp;
-  return mode;
+  return smode;
 
 #ifdef POINTERS_EXTEND_UNSIGNED
 case REFERENCE_TYPE:
Index: gcc/sdbout.c
===
--- gcc/sdbout.c2017-02-23 19:54:15.0 +
+++ gcc/sdbout.c2017-07-13 09:18:59.187223319 +0100
@@ -1279,11 +1279,15 @@ sdbout_parms (tree parms)
   the parm with the variable's declared type, and adjust
   the address if the least significant bytes (which we are
   using) are not the first ones.  */
+   scalar_mode from_mode, to_mode;
if (BYTES_BIG_ENDIAN
-   && TREE_TYPE (parms) != DECL_ARG_TYPE (parms))
-	  current_sym_value +=
-	    (GET_MODE_SIZE (TYPE_MODE (DECL_ARG_TYPE (parms)))
-	     - GET_MODE_SIZE (GET_MODE (DECL_RTL (parms))));
+	   && TREE_TYPE (parms) != DECL_ARG_TYPE (parms)
+	   && is_a <scalar_mode> (TYPE_MODE (DECL_ARG_TYPE (parms)),
+				  &from_mode)
+	   && is_a <scalar_mode> (GET_MODE (DECL_RTL (parms)),
+				  &to_mode))
+	  current_sym_value += (GET_MODE_SIZE (from_mode)
+				- GET_MODE_SIZE (to_mode));
 
if (MEM_P (DECL_RTL (parms))
&& GET_CODE (XEXP (DECL_RTL (parms), 0)) == PLUS


[73/77] Pass scalar_mode to preferred_simd_mode

2017-07-13 Thread Richard Sandiford
This patch makes the preferred_simd_mode target hook take a scalar_mode
rather than a machine_mode.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* target.def (preferred_simd_mode): Take a scalar_mode
instead of a machine_mode.
* targhooks.h (default_preferred_simd_mode): Likewise.
* targhooks.c (default_preferred_simd_mode): Likewise.
* config/aarch64/aarch64.c (aarch64_preferred_simd_mode): Likewise.
* config/arc/arc.c (arc_preferred_simd_mode): Likewise.
* config/arm/arm.c (arm_preferred_simd_mode): Likewise.
* config/c6x/c6x.c (c6x_preferred_simd_mode): Likewise.
* config/epiphany/epiphany.c (epiphany_preferred_simd_mode): Likewise.
* config/i386/i386.c (ix86_preferred_simd_mode): Likewise.
* config/mips/mips.c (mips_preferred_simd_mode): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_preferred_simd_mode):
Likewise.
* config/rs6000/rs6000.c (rs6000_preferred_simd_mode): Likewise.
* config/s390/s390.c (s390_preferred_simd_mode): Likewise.
* config/sparc/sparc.c (sparc_preferred_simd_mode): Likewise.
* doc/tm.texi: Regenerate.
* optabs-query.c (can_vec_mask_load_store_p): Return false for
non-scalar modes.

Index: gcc/target.def
===
--- gcc/target.def  2017-07-13 09:18:57.574337591 +0100
+++ gcc/target.def  2017-07-13 09:18:58.501271737 +0100
@@ -1848,7 +1848,7 @@ mode @var{mode}.  The default is\n\
 equal to @code{word_mode}, because the vectorizer can do some\n\
 transformations even in absence of specialized @acronym{SIMD} hardware.",
  machine_mode,
- (machine_mode mode),
+ (scalar_mode mode),
  default_preferred_simd_mode)
 
 /* Returns a mask of vector sizes to iterate over when auto-vectorizing
Index: gcc/targhooks.h
===
--- gcc/targhooks.h 2017-07-13 09:18:57.575337520 +0100
+++ gcc/targhooks.h 2017-07-13 09:18:58.502271667 +0100
@@ -100,7 +100,7 @@ extern bool default_builtin_vector_align
 default_builtin_support_vector_misalignment (machine_mode mode,
 const_tree,
 int, bool);
-extern machine_mode default_preferred_simd_mode (machine_mode mode);
+extern machine_mode default_preferred_simd_mode (scalar_mode mode);
 extern unsigned int default_autovectorize_vector_sizes (void);
 extern machine_mode default_get_mask_mode (unsigned, unsigned);
 extern void *default_init_cost (struct loop *);
Index: gcc/targhooks.c
===
--- gcc/targhooks.c 2017-07-13 09:18:57.574337591 +0100
+++ gcc/targhooks.c 2017-07-13 09:18:58.502271667 +0100
@@ -1156,7 +1156,7 @@ default_builtin_support_vector_misalignm
possibly adds/subtracts using bit-twiddling.  */
 
 machine_mode
-default_preferred_simd_mode (machine_mode mode ATTRIBUTE_UNUSED)
+default_preferred_simd_mode (scalar_mode)
 {
   return word_mode;
 }
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2017-07-13 09:18:57.498343016 +0100
+++ gcc/config/aarch64/aarch64.c2017-07-13 09:18:58.445275699 +0100
@@ -11163,7 +11163,7 @@ aarch64_simd_container_mode (machine_mod
 
 /* Return 128-bit container as the preferred SIMD mode for MODE.  */
 static machine_mode
-aarch64_preferred_simd_mode (machine_mode mode)
+aarch64_preferred_simd_mode (scalar_mode mode)
 {
   return aarch64_simd_container_mode (mode, 128);
 }
Index: gcc/config/arc/arc.c
===
--- gcc/config/arc/arc.c2017-07-13 09:18:30.890502711 +0100
+++ gcc/config/arc/arc.c2017-07-13 09:18:58.446275629 +0100
@@ -328,7 +328,7 @@ arc_vector_mode_supported_p (machine_mod
 /* Implements target hook TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
 
 static machine_mode
-arc_preferred_simd_mode (machine_mode mode)
+arc_preferred_simd_mode (scalar_mode mode)
 {
   switch (mode)
 {
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c2017-07-13 09:18:57.510342160 +0100
+++ gcc/config/arm/arm.c2017-07-13 09:18:58.449275416 +0100
@@ -268,7 +268,7 @@ static bool xscale_sched_adjust_cost (rt
 static bool fa726te_sched_adjust_cost (rtx_insn *, int, rtx_insn *, int *);
 static bool arm_array_mode_supported_p (machine_mode,
unsigned HOST_WIDE_INT);
-static machine_mode arm_preferred_simd_mode (machine_mode);
+static machine_mode arm_preferred_simd_mode (scalar_mode);
 static bool arm_class_likely_spilled_p (reg_class_t);
 static HOST_WIDE_INT 

[72/77] Pass scalar_mode to scalar_mode_supported_p

2017-07-13 Thread Richard Sandiford
This patch makes the scalar_mode_supported_p target hook take a
scalar_mode rather than a machine_mode.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* target.def (scalar_mode_supported_p): Take a scalar_mode
instead of a machine_mode.
* targhooks.h (default_scalar_mode_supported_p): Likewise.
* targhooks.c (default_scalar_mode_supported_p): Likewise.
* config/aarch64/aarch64.c (aarch64_scalar_mode_supported_p): Likewise.
* config/alpha/alpha.c (alpha_scalar_mode_supported_p): Likewise.
* config/arm/arm.c (arm_scalar_mode_supported_p): Likewise.
* config/avr/avr.c (avr_scalar_mode_supported_p): Likewise.
* config/c6x/c6x.c (c6x_scalar_mode_supported_p): Likewise.
* config/i386/i386.c (ix86_scalar_mode_supported_p): Likewise.
* config/ia64/ia64.c (ia64_scalar_mode_supported_p): Likewise.
* config/mips/mips.c (mips_scalar_mode_supported_p): Likewise.
* config/msp430/msp430.c (msp430_scalar_mode_supported_p): Likewise.
* config/pa/pa.c (pa_scalar_mode_supported_p): Likewise.
* config/pdp11/pdp11.c (pdp11_scalar_mode_supported_p): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_scalar_mode_supported_p):
Likewise.
* config/rs6000/rs6000.c (rs6000_scalar_mode_supported_p): Likewise.
* config/s390/s390.c (s390_scalar_mode_supported_p): Likewise.
* config/spu/spu.c (spu_scalar_mode_supported_p): Likewise.
* config/tilegx/tilegx.c (tilegx_scalar_mode_supported_p): Likewise.
* config/tilepro/tilepro.c (tilepro_scalar_mode_supported_p):
Likewise.
* doc/tm.texi: Regenerate.

gcc/c-family/
* c-attribs.c (vector_mode_valid_p): Fold GET_MODE_INNER call
into scalar_mode_supported_p call.
(handle_mode_attribute): Update call to scalar_mode_supported_p.

Index: gcc/target.def
===
--- gcc/target.def  2017-07-13 09:18:51.667770394 +0100
+++ gcc/target.def  2017-07-13 09:18:57.574337591 +0100
@@ -3304,7 +3304,7 @@ The default version of this hook returns
 required to handle the basic C types (as defined by the port).\n\
 Included here are the double-word arithmetic supported by the\n\
 code in @file{optabs.c}.",
- bool, (machine_mode mode),
+ bool, (scalar_mode mode),
  default_scalar_mode_supported_p)
 
 /* Similarly for vector modes.  "Supported" here is less strict.  At
Index: gcc/targhooks.h
===
--- gcc/targhooks.h 2017-07-13 09:18:51.668770318 +0100
+++ gcc/targhooks.h 2017-07-13 09:18:57.575337520 +0100
@@ -71,7 +71,7 @@ extern void default_print_operand_addres
 extern bool default_print_operand_punct_valid_p (unsigned char);
 extern tree default_mangle_assembler_name (const char *);
 
-extern bool default_scalar_mode_supported_p (machine_mode);
+extern bool default_scalar_mode_supported_p (scalar_mode);
 extern bool default_libgcc_floating_mode_supported_p (scalar_float_mode);
 extern opt_scalar_float_mode default_floatn_mode (int, bool);
 extern bool targhook_words_big_endian (void);
Index: gcc/targhooks.c
===
--- gcc/targhooks.c 2017-07-13 09:18:51.667770394 +0100
+++ gcc/targhooks.c 2017-07-13 09:18:57.574337591 +0100
@@ -394,7 +394,7 @@ default_mangle_assembler_name (const cha
supported by optabs.c.  */
 
 bool
-default_scalar_mode_supported_p (machine_mode mode)
+default_scalar_mode_supported_p (scalar_mode mode)
 {
   int precision = GET_MODE_PRECISION (mode);
 
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2017-07-13 09:18:51.583776726 +0100
+++ gcc/config/aarch64/aarch64.c2017-07-13 09:18:57.498343016 +0100
@@ -15053,7 +15053,7 @@ aarch64_libgcc_floating_mode_supported_p
if MODE is HFmode, and punt to the generic implementation otherwise.  */
 
 static bool
-aarch64_scalar_mode_supported_p (machine_mode mode)
+aarch64_scalar_mode_supported_p (scalar_mode mode)
 {
   return (mode == HFmode
  ? true
Index: gcc/config/alpha/alpha.c
===
--- gcc/config/alpha/alpha.c2017-07-13 09:18:51.585776575 +0100
+++ gcc/config/alpha/alpha.c2017-07-13 09:18:57.499342945 +0100
@@ -688,7 +688,7 @@ resolve_reload_operand (rtx op)
indicates only DFmode.  */
 
 static bool
-alpha_scalar_mode_supported_p (machine_mode mode)
+alpha_scalar_mode_supported_p (scalar_mode mode)
 {
   switch (mode)
 {
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c2017-07-13 09:18:30.892502525 +0100
+++ gcc/config/arm/arm.c

[69/77] Split scalar-only part out of convert_move

2017-07-13 Thread Richard Sandiford
This patch splits the final scalar-only part of convert_move out
into its own subroutine and treats the modes as scalar_modes there.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* expr.c (convert_move): Split scalar handling out into...
(convert_mode_scalar): ...this new function.  Treat the modes
as scalar_modes.

Index: gcc/expr.c
===
--- gcc/expr.c  2017-07-13 09:18:53.997596816 +0100
+++ gcc/expr.c  2017-07-13 09:18:56.007450082 +0100
@@ -102,6 +102,7 @@ static rtx const_vector_from_tree (tree)
 static rtx const_scalar_mask_from_tree (scalar_int_mode, tree);
 static tree tree_expr_size (const_tree);
 static HOST_WIDE_INT int_expr_size (tree);
+static void convert_mode_scalar (rtx, rtx, int);
 
 
 /* This is run to set up which modes can be used
@@ -216,17 +217,7 @@ convert_move (rtx to, rtx from, int unsi
 {
   machine_mode to_mode = GET_MODE (to);
   machine_mode from_mode = GET_MODE (from);
-  int to_real = SCALAR_FLOAT_MODE_P (to_mode);
-  int from_real = SCALAR_FLOAT_MODE_P (from_mode);
-  enum insn_code code;
-  rtx libcall;
-
-  /* rtx code for making an equivalent value.  */
-  enum rtx_code equiv_code = (unsignedp < 0 ? UNKNOWN
- : (unsignedp ? ZERO_EXTEND : SIGN_EXTEND));
 
-
-  gcc_assert (to_real == from_real);
   gcc_assert (to_mode != BLKmode);
   gcc_assert (from_mode != BLKmode);
 
@@ -277,6 +268,28 @@ convert_move (rtx to, rtx from, int unsi
   return;
 }
 
+  convert_mode_scalar (to, from, unsignedp);
+}
+
+/* Like convert_move, but deals only with scalar modes.  */
+
+static void
+convert_mode_scalar (rtx to, rtx from, int unsignedp)
+{
+  /* Both modes should be scalar types.  */
+  scalar_mode from_mode = as_a <scalar_mode> (GET_MODE (from));
+  scalar_mode to_mode = as_a <scalar_mode> (GET_MODE (to));
+  bool to_real = SCALAR_FLOAT_MODE_P (to_mode);
+  bool from_real = SCALAR_FLOAT_MODE_P (from_mode);
+  enum insn_code code;
+  rtx libcall;
+
+  gcc_assert (to_real == from_real);
+
+  /* rtx code for making an equivalent value.  */
+  enum rtx_code equiv_code = (unsignedp < 0 ? UNKNOWN
+ : (unsignedp ? ZERO_EXTEND : SIGN_EXTEND));
+
   if (to_real)
 {
   rtx value;
@@ -413,7 +426,7 @@ convert_move (rtx to, rtx from, int unsi
   rtx fill_value;
   rtx lowfrom;
   int i;
-  machine_mode lowpart_mode;
+  scalar_mode lowpart_mode;
   int nwords = CEIL (GET_MODE_SIZE (to_mode), UNITS_PER_WORD);
 
   /* Try converting directly if the insn is supported.  */


[70/77] Make expand_fix/float check for scalar modes

2017-07-13 Thread Richard Sandiford
The expand_float code:

  /* Unsigned integer, and no way to convert directly.  Convert as signed,
 then unconditionally adjust the result.  */

and the expand_fix code:

  /* For an unsigned conversion, there is one more way to do it.
 If we have a signed conversion, we generate code that compares
 the real value to the largest representable positive number.  If it
 is smaller, the conversion is done normally.  Otherwise, subtract
 one plus the highest signed number, convert, and add it back.

are restricted to scalars, since the expansion branches on a
comparison of the value.  This patch makes that explicit.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* optabs.c (expand_float): Explicitly check for scalars before
using a branching expansion.
(expand_fix): Likewise.

Index: gcc/optabs.c
===
--- gcc/optabs.c2017-07-13 09:18:53.274650323 +0100
+++ gcc/optabs.c2017-07-13 09:18:56.346425666 +0100
@@ -4635,6 +4635,7 @@ expand_float (rtx to, rtx from, int unsi
 {
   enum insn_code icode;
   rtx target = to;
+  scalar_mode from_mode, to_mode;
   machine_mode fmode, imode;
   bool can_do_signed = false;
 
@@ -4684,7 +4685,10 @@ expand_float (rtx to, rtx from, int unsi
 
   /* Unsigned integer, and no way to convert directly.  Convert as signed,
  then unconditionally adjust the result.  */
-  if (unsignedp && can_do_signed)
+  if (unsignedp
+  && can_do_signed
+  && is_a <scalar_mode> (GET_MODE (to), &to_mode)
+  && is_a <scalar_mode> (GET_MODE (from), &from_mode))
 {
   rtx_code_label *label = gen_label_rtx ();
   rtx temp;
@@ -4694,19 +4698,19 @@ expand_float (rtx to, rtx from, int unsi
 least as wide as the target.  Using FMODE will avoid rounding woes
 with unsigned values greater than the signed maximum value.  */
 
-  FOR_EACH_MODE_FROM (fmode, GET_MODE (to))
-   if (GET_MODE_PRECISION (GET_MODE (from)) < GET_MODE_BITSIZE (fmode)
-   && can_float_p (fmode, GET_MODE (from), 0) != CODE_FOR_nothing)
+  FOR_EACH_MODE_FROM (fmode, to_mode)
+   if (GET_MODE_PRECISION (from_mode) < GET_MODE_BITSIZE (fmode)
+   && can_float_p (fmode, from_mode, 0) != CODE_FOR_nothing)
  break;
 
   if (fmode == VOIDmode)
{
  /* There is no such mode.  Pretend the target is wide enough.  */
- fmode = GET_MODE (to);
+ fmode = to_mode;
 
  /* Avoid double-rounding when TO is narrower than FROM.  */
  if ((significand_size (fmode) + 1)
- < GET_MODE_PRECISION (GET_MODE (from)))
+ < GET_MODE_PRECISION (from_mode))
{
  rtx temp1;
  rtx_code_label *neglabel = gen_label_rtx ();
@@ -4718,7 +4722,7 @@ expand_float (rtx to, rtx from, int unsi
  || GET_MODE (target) != fmode)
target = gen_reg_rtx (fmode);
 
- imode = GET_MODE (from);
+ imode = from_mode;
  do_pending_stack_adjust ();
 
  /* Test whether the sign bit is set.  */
@@ -4758,7 +4762,7 @@ expand_float (rtx to, rtx from, int unsi
   /* If we are about to do some arithmetic to correct for an
 unsigned operand, do it in a pseudo-register.  */
 
-  if (GET_MODE (to) != fmode
+  if (to_mode != fmode
  || !REG_P (to) || REGNO (to) < FIRST_PSEUDO_REGISTER)
target = gen_reg_rtx (fmode);
 
@@ -4769,11 +4773,11 @@ expand_float (rtx to, rtx from, int unsi
 correct its value by 2**bitwidth.  */
 
   do_pending_stack_adjust ();
-  emit_cmp_and_jump_insns (from, const0_rtx, GE, NULL_RTX, GET_MODE (from),
+  emit_cmp_and_jump_insns (from, const0_rtx, GE, NULL_RTX, from_mode,
   0, label);
 
 
-  real_2expN (&offset, GET_MODE_PRECISION (GET_MODE (from)), fmode);
+  real_2expN (&offset, GET_MODE_PRECISION (from_mode), fmode);
   temp = expand_binop (fmode, add_optab, target,
   const_double_from_real_value (offset, fmode),
   target, 0, OPTAB_LIB_WIDEN);
@@ -4901,11 +4905,14 @@ expand_fix (rtx to, rtx from, int unsign
  2^63.  The subtraction of 2^63 should not generate any rounding as it
  simply clears out that bit.  The rest is trivial.  */
 
-  if (unsignedp && GET_MODE_PRECISION (GET_MODE (to)) <= HOST_BITS_PER_WIDE_INT)
+  scalar_int_mode to_mode;
+  if (unsignedp
+  && is_a <scalar_int_mode> (GET_MODE (to), &to_mode)
+  && HWI_COMPUTABLE_MODE_P (to_mode))
 FOR_EACH_MODE_FROM (fmode, GET_MODE (from))
-  if (CODE_FOR_nothing != can_fix_p (GET_MODE (to), fmode, 0, &must_trunc)
+  if (CODE_FOR_nothing != can_fix_p (to_mode, fmode, 0, &must_trunc)
  && (!DECIMAL_FLOAT_MODE_P (fmode)
- || GET_MODE_BITSIZE (fmode) > GET_MODE_PRECISION (GET_MODE (to
+ || 

[71/77] Use opt_scalar_mode for mode iterators

2017-07-13 Thread Richard Sandiford
This patch uses opt_scalar_mode when iterating over scalar modes.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* coretypes.h (opt_scalar_mode): New typedef.
* gdbhooks.py (build_pretty_printers): Handle it.
* machmode.h (mode_iterator::get_2xwider): Add overload for
opt_mode.
* emit-rtl.c (init_emit_once): Use opt_scalar_mode when iterating
over scalar modes.
* expr.c (convert_mode_scalar): Likewise.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* optabs.c (expand_float): Likewise.
(expand_fix): Likewise.
* tree-vect-stmts.c (vectorizable_conversion): Likewise.

gcc/c-family/
* c-common.c (c_common_fixed_point_type_for_size): Use opt_scalar_mode
for the mode iterator.

Index: gcc/coretypes.h
===
--- gcc/coretypes.h 2017-07-13 09:18:53.271650545 +0100
+++ gcc/coretypes.h 2017-07-13 09:18:56.810392248 +0100
@@ -59,6 +59,7 @@ typedef const struct rtx_def *const_rtx;
 class scalar_int_mode;
 class scalar_float_mode;
 template<typename T> class opt_mode;
+typedef opt_mode<scalar_mode> opt_scalar_mode;
 typedef opt_mode<scalar_int_mode> opt_scalar_int_mode;
 typedef opt_mode<scalar_float_mode> opt_scalar_float_mode;
 template<typename T> class pod_mode;
Index: gcc/gdbhooks.py
===
--- gcc/gdbhooks.py 2017-07-13 09:18:53.273650396 +0100
+++ gcc/gdbhooks.py 2017-07-13 09:18:56.812392104 +0100
@@ -543,7 +543,8 @@ def build_pretty_printer():
 pp.add_printer_for_regex(r'opt_mode<(\S+)>',
  'opt_mode', OptMachineModePrinter)
 pp.add_printer_for_types(['opt_scalar_int_mode',
-  'opt_scalar_float_mode'],
+  'opt_scalar_float_mode',
+  'opt_scalar_mode'],
  'opt_mode', OptMachineModePrinter)
 pp.add_printer_for_regex(r'pod_mode<(\S+)>',
  'pod_mode', MachineModePrinter)
Index: gcc/machmode.h
===
--- gcc/machmode.h  2017-07-13 09:18:53.274650323 +0100
+++ gcc/machmode.h  2017-07-13 09:18:56.812392104 +0100
@@ -836,6 +836,13 @@ is_float_mode (machine_mode mode, T *flo
   /* Set mode iterator *ITER to the mode that is two times wider than the
  current one, if such a mode exists.  */
 
+  template<typename T>
+  inline void
+  get_2xwider (opt_mode<T> *iter)
+  {
+    *iter = GET_MODE_2XWIDER_MODE (**iter);
+  }
+
   inline void
   get_2xwider (machine_mode *iter)
   {
Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c  2017-07-13 09:18:54.682546579 +0100
+++ gcc/emit-rtl.c  2017-07-13 09:18:56.811392176 +0100
@@ -5891,6 +5891,7 @@ init_emit_once (void)
   int i;
   machine_mode mode;
   scalar_float_mode double_mode;
+  opt_scalar_mode smode_iter;
 
   /* Initialize the CONST_INT, CONST_WIDE_INT, CONST_DOUBLE,
  CONST_FIXED, and memory attribute hash tables.  */
@@ -6005,62 +6006,66 @@ init_emit_once (void)
   const_tiny_rtx[1][(int) mode] = gen_const_vector (mode, 1);
 }
 
-  FOR_EACH_MODE_IN_CLASS (mode, MODE_FRACT)
+  FOR_EACH_MODE_IN_CLASS (smode_iter, MODE_FRACT)
 {
-  FCONST0 (mode).data.high = 0;
-  FCONST0 (mode).data.low = 0;
-  FCONST0 (mode).mode = mode;
-  const_tiny_rtx[0][(int) mode] = CONST_FIXED_FROM_FIXED_VALUE (
- FCONST0 (mode), mode);
-}
-
-  FOR_EACH_MODE_IN_CLASS (mode, MODE_UFRACT)
-{
-  FCONST0 (mode).data.high = 0;
-  FCONST0 (mode).data.low = 0;
-  FCONST0 (mode).mode = mode;
-  const_tiny_rtx[0][(int) mode] = CONST_FIXED_FROM_FIXED_VALUE (
- FCONST0 (mode), mode);
-}
-
-  FOR_EACH_MODE_IN_CLASS (mode, MODE_ACCUM)
-{
-  FCONST0 (mode).data.high = 0;
-  FCONST0 (mode).data.low = 0;
-  FCONST0 (mode).mode = mode;
-  const_tiny_rtx[0][(int) mode] = CONST_FIXED_FROM_FIXED_VALUE (
- FCONST0 (mode), mode);
+  scalar_mode smode = *smode_iter;
+  FCONST0 (smode).data.high = 0;
+  FCONST0 (smode).data.low = 0;
+  FCONST0 (smode).mode = smode;
+  const_tiny_rtx[0][(int) smode]
+   = CONST_FIXED_FROM_FIXED_VALUE (FCONST0 (smode), smode);
+}
+
+  FOR_EACH_MODE_IN_CLASS (smode_iter, MODE_UFRACT)
+{
+  scalar_mode smode = *smode_iter;
+  FCONST0 (smode).data.high = 0;
+  FCONST0 (smode).data.low = 0;
+  FCONST0 (smode).mode = smode;
+  const_tiny_rtx[0][(int) smode]
+   = CONST_FIXED_FROM_FIXED_VALUE (FCONST0 (smode), smode);
+}
+
+  FOR_EACH_MODE_IN_CLASS (smode_iter, MODE_ACCUM)
+{
+  scalar_mode smode = *smode_iter;
+  FCONST0 (smode).data.high = 0;

[68/77] Use scalar_mode for is_int_mode/is_float_mode pairs

2017-07-13 Thread Richard Sandiford
This patch uses scalar_mode for code that operates only on MODE_INT
and MODE_FLOAT.
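
A minimal sketch of the pattern (TYPE is an assumed tree in scope):
is_int_mode and is_float_mode both test the mode class and, on
success, store the mode into a scalar_mode, so the rest of the
function can use the more specific type:

  scalar_mode smode;
  if (is_int_mode (TYPE_MODE (type), &smode)
      || is_float_mode (TYPE_MODE (type), &smode))
    {
      /* SMODE is now statically known to be scalar, e.g.  */
      bool fits_in_word = GET_MODE_BITSIZE (smode) <= BITS_PER_WORD;
    }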

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* omp-expand.c (expand_omp_atomic): Use is_int_mode, is_float_mode
and scalar_mode.
* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Likewise.

Index: gcc/omp-expand.c
===
--- gcc/omp-expand.c2017-06-30 12:50:38.243662675 +0100
+++ gcc/omp-expand.c2017-07-13 09:18:55.598479800 +0100
@@ -6724,17 +6724,18 @@ expand_omp_atomic (struct omp_region *re
   if (exact_log2 (align) >= index)
{
  /* Atomic load.  */
+ scalar_mode smode;
  if (loaded_val == stored_val
- && (GET_MODE_CLASS (TYPE_MODE (type)) == MODE_INT
- || GET_MODE_CLASS (TYPE_MODE (type)) == MODE_FLOAT)
- && GET_MODE_BITSIZE (TYPE_MODE (type)) <= BITS_PER_WORD
+ && (is_int_mode (TYPE_MODE (type), &smode)
+ || is_float_mode (TYPE_MODE (type), &smode))
+ && GET_MODE_BITSIZE (smode) <= BITS_PER_WORD
  && expand_omp_atomic_load (load_bb, addr, loaded_val, index))
return;
 
  /* Atomic store.  */
- if ((GET_MODE_CLASS (TYPE_MODE (type)) == MODE_INT
-  || GET_MODE_CLASS (TYPE_MODE (type)) == MODE_FLOAT)
- && GET_MODE_BITSIZE (TYPE_MODE (type)) <= BITS_PER_WORD
+ if ((is_int_mode (TYPE_MODE (type), &smode)
+  || is_float_mode (TYPE_MODE (type), &smode))
+ && GET_MODE_BITSIZE (smode) <= BITS_PER_WORD
  && store_bb == single_succ (load_bb)
  && first_stmt (store_bb) == store
  && expand_omp_atomic_store (load_bb, addr, loaded_val,
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2017-07-13 09:18:54.003596374 +0100
+++ gcc/tree-vect-stmts.c   2017-07-13 09:18:55.599479728 +0100
@@ -8936,18 +8936,16 @@ free_stmt_vec_info (gimple *stmt)
 get_vectype_for_scalar_type_and_size (tree scalar_type, unsigned size)
 {
   tree orig_scalar_type = scalar_type;
-  machine_mode inner_mode = TYPE_MODE (scalar_type);
+  scalar_mode inner_mode;
   machine_mode simd_mode;
-  unsigned int nbytes = GET_MODE_SIZE (inner_mode);
   int nunits;
   tree vectype;
 
-  if (nbytes == 0)
+  if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
+  && !is_float_mode (TYPE_MODE (scalar_type), &inner_mode))
 return NULL_TREE;
 
-  if (GET_MODE_CLASS (inner_mode) != MODE_INT
-  && GET_MODE_CLASS (inner_mode) != MODE_FLOAT)
-return NULL_TREE;
+  unsigned int nbytes = GET_MODE_SIZE (inner_mode);
 
   /* For vector types of elements whose mode precision doesn't
  match their types precision we use a element type of mode


[66/77] Use scalar_mode for constant integers

2017-07-13 Thread Richard Sandiford
This patch treats the mode associated with an integer constant as a
scalar_mode.  We can't use the more natural-sounding scalar_int_mode
because we also use (const_int 0) for bounds-checking modes.  (It might
be worth adding a bounds-specific code instead, but that's for another
day.)

This exposes a latent bug in simplify_immed_subreg, which for
vectors of CONST_WIDE_INTs would pass the vector mode rather than
the element mode to rtx_mode_t.

I think the:

  /* We can get a 0 for an error mark.  */
  || GET_MODE_CLASS (mode) == MODE_VECTOR_INT
  || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT

in immed_double_const is dead.  trunc_int_mode (via gen_int_mode)
would go on to ICE if the mode fitted in a HWI, and surely plenty
of other code would be confused to see a const_int be interpreted
as a vector.  We should instead be using CONST0_RTX (mode) if we
need a safe constant for a particular mode.

We didn't try to make these functions take scalar_mode arguments
because in many cases that would be too invasive at this stage.
Maybe it would become feasible in future.  Also, the long-term
direction should probably be to add modes to constant integers
rather than have them as VOIDmode odd-ones-out.  That would remove
the need for rtx_mode_t and thus remove the question whether they
should use scalar_int_mode, scalar_mode or machine_mode.

The patch also uses scalar_mode for the CONST_DOUBLE handling
in loc_descriptor.  In that case the mode can legitimately be
either floating-point or integral.
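
For illustration (a sketch; X and MODE are assumed to be an integer
constant rtx and its known mode): since a CONST_INT carries no mode
of its own, callers pair it with the mode they know it has, and after
this patch that mode is a scalar_mode:

  scalar_mode smode = as_a <scalar_mode> (mode);
  rtx_mode_t val (x, smode);	/* interpret X in SMODE */
  unsigned int prec = GET_MODE_PRECISION (smode);	/* what
	wi::int_traits <rtx_mode_t>::get_precision now computes */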

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* emit-rtl.c (immed_double_const): Use is_a <scalar_mode> instead
of separate mode class checks.  Do not allow vector modes here.
(immed_wide_int_const): Use as_a <scalar_mode>.
* explow.c (trunc_int_for_mode): Likewise.
* rtl.h (wi::int_traits <rtx_mode_t>::get_precision): Likewise.
(wi::shwi): Likewise.
(wi::min_value): Likewise.
(wi::max_value): Likewise.
* dwarf2out.c (loc_descriptor): Likewise.
* simplify-rtx.c (simplify_immed_subreg): Fix rtx_mode_t argument
for CONST_WIDE_INT.

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c  2017-07-13 09:18:51.646771977 +0100
+++ gcc/emit-rtl.c  2017-07-13 09:18:54.682546579 +0100
@@ -599,7 +599,8 @@ lookup_const_wide_int (rtx wint)
 immed_wide_int_const (const wide_int_ref , machine_mode mode)
 {
   unsigned int len = v.get_len ();
-  unsigned int prec = GET_MODE_PRECISION (mode);
+  /* Not scalar_int_mode because we also allow pointer bound modes.  */
+  unsigned int prec = GET_MODE_PRECISION (as_a <scalar_mode> (mode));
 
   /* Allow truncation but not extension since we do not know if the
  number is signed or unsigned.  */
@@ -659,18 +660,10 @@ immed_double_const (HOST_WIDE_INT i0, HO
 (i.e., i1 consists only from copies of the sign bit, and sign
of i0 and i1 are the same), then we return a CONST_INT for i0.
  3) Otherwise, we create a CONST_DOUBLE for i0 and i1.  */
-  if (mode != VOIDmode)
-{
-  gcc_assert (GET_MODE_CLASS (mode) == MODE_INT
- || GET_MODE_CLASS (mode) == MODE_PARTIAL_INT
- /* We can get a 0 for an error mark.  */
- || GET_MODE_CLASS (mode) == MODE_VECTOR_INT
- || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
- || GET_MODE_CLASS (mode) == MODE_POINTER_BOUNDS);
-
-  if (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
-   return gen_int_mode (i0, mode);
-}
+  scalar_mode smode;
+  if (is_a <scalar_mode> (mode, &smode)
+  && GET_MODE_BITSIZE (smode) <= HOST_BITS_PER_WIDE_INT)
+return gen_int_mode (i0, mode);
 
   /* If this integer fits in one word, return a CONST_INT.  */
   if ((i1 == 0 && i0 >= 0) || (i1 == ~0 && i0 < 0))
Index: gcc/explow.c
===
--- gcc/explow.c2017-07-13 09:18:51.647771901 +0100
+++ gcc/explow.c2017-07-13 09:18:54.682546579 +0100
@@ -49,14 +49,16 @@ static rtx break_out_memory_refs (rtx);
 HOST_WIDE_INT
 trunc_int_for_mode (HOST_WIDE_INT c, machine_mode mode)
 {
-  int width = GET_MODE_PRECISION (mode);
+  /* Not scalar_int_mode because we also allow pointer bound modes.  */
+  scalar_mode smode = as_a <scalar_mode> (mode);
+  int width = GET_MODE_PRECISION (smode);
 
   /* You want to truncate to a _what_?  */
   gcc_assert (SCALAR_INT_MODE_P (mode)
  || POINTER_BOUNDS_MODE_P (mode));
 
   /* Canonicalize BImode to 0 and STORE_FLAG_VALUE.  */
-  if (mode == BImode)
+  if (smode == BImode)
 return c & 1 ? STORE_FLAG_VALUE : 0;
 
   /* Sign-extend for the requested mode.  */
Index: gcc/rtl.h
===
--- gcc/rtl.h   2017-07-13 09:18:51.662770770 +0100
+++ gcc/rtl.h   

[67/77] Use scalar_mode in fixed-value.*

2017-07-13 Thread Richard Sandiford
This patch makes the fixed-value.* routines use scalar_mode.
It would be possible to define special classes for these modes, as for
scalar_int_mode and scalar_float_mode, but at the moment nothing would
benefit from them.  In particular, there's no use case that would help
select between one class for all fixed-point modes versus one class for
fractional modes and one class for accumulator modes.
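
A sketch of a caller after the interface change (STR and TYPE are
assumed to be in scope): the mode has to be proven scalar before the
fixed-value routines are called:

  scalar_mode smode = as_a <scalar_mode> (TYPE_MODE (type));
  FIXED_VALUE_TYPE f;
  fixed_from_string (&f, str, smode);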

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* fixed-value.h (fixed_from_double_int): Take a scalar_mode
rather than a machine_mode.
(fixed_from_string): Likewise.
(fixed_convert): Likewise.
(fixed_convert_from_int): Likewise.
(fixed_convert_from_real): Likewise.
(real_convert_from_fixed): Likewise.
* fixed-value.c (fixed_from_double_int): Likewise.
(fixed_from_string): Likewise.
(fixed_convert): Likewise.
(fixed_convert_from_int): Likewise.
(fixed_convert_from_real): Likewise.
(real_convert_from_fixed): Likewise.
* config/avr/avr.c (avr_out_round): Use as_a <scalar_mode>.

Index: gcc/fixed-value.h
===
--- gcc/fixed-value.h   2017-02-23 19:54:20.0 +
+++ gcc/fixed-value.h   2017-07-13 09:18:55.158511776 +0100
@@ -47,14 +47,13 @@ extern rtx const_fixed_from_fixed_value
 
 /* Construct a FIXED_VALUE from a bit payload and machine mode MODE.
The bits in PAYLOAD are sign-extended/zero-extended according to MODE.  */
-extern FIXED_VALUE_TYPE fixed_from_double_int (double_int,
-machine_mode);
+extern FIXED_VALUE_TYPE fixed_from_double_int (double_int, scalar_mode);
 
 /* Return a CONST_FIXED from a bit payload and machine mode MODE.
The bits in PAYLOAD are sign-extended/zero-extended according to MODE.  */
 static inline rtx
 const_fixed_from_double_int (double_int payload,
- machine_mode mode)
+scalar_mode mode)
 {
   return
 const_fixed_from_fixed_value (fixed_from_double_int (payload, mode),
@@ -63,25 +62,25 @@ const_fixed_from_double_int (double_int
 
 /* Initialize from a decimal or hexadecimal string.  */
 extern void fixed_from_string (FIXED_VALUE_TYPE *, const char *,
-  machine_mode);
+  scalar_mode);
 
 /* In tree.c: wrap up a FIXED_VALUE_TYPE in a tree node.  */
 extern tree build_fixed (tree, FIXED_VALUE_TYPE);
 
 /* Extend or truncate to a new mode.  */
-extern bool fixed_convert (FIXED_VALUE_TYPE *, machine_mode,
+extern bool fixed_convert (FIXED_VALUE_TYPE *, scalar_mode,
   const FIXED_VALUE_TYPE *, bool);
 
 /* Convert to a fixed-point mode from an integer.  */
-extern bool fixed_convert_from_int (FIXED_VALUE_TYPE *, machine_mode,
+extern bool fixed_convert_from_int (FIXED_VALUE_TYPE *, scalar_mode,
double_int, bool, bool);
 
 /* Convert to a fixed-point mode from a real.  */
-extern bool fixed_convert_from_real (FIXED_VALUE_TYPE *, machine_mode,
+extern bool fixed_convert_from_real (FIXED_VALUE_TYPE *, scalar_mode,
 const REAL_VALUE_TYPE *, bool);
 
 /* Convert to a real mode from a fixed-point.  */
-extern void real_convert_from_fixed (REAL_VALUE_TYPE *, machine_mode,
+extern void real_convert_from_fixed (REAL_VALUE_TYPE *, scalar_mode,
 const FIXED_VALUE_TYPE *);
 
 /* Compare two fixed-point objects for bitwise identity.  */
Index: gcc/fixed-value.c
===
--- gcc/fixed-value.c   2017-03-28 16:19:28.0 +0100
+++ gcc/fixed-value.c   2017-07-13 09:18:55.158511776 +0100
@@ -86,7 +86,7 @@ check_real_for_fixed_mode (REAL_VALUE_TY
The bits in PAYLOAD are sign-extended/zero-extended according to MODE.  */
 
 FIXED_VALUE_TYPE
-fixed_from_double_int (double_int payload, machine_mode mode)
+fixed_from_double_int (double_int payload, scalar_mode mode)
 {
   FIXED_VALUE_TYPE value;
 
@@ -108,7 +108,7 @@ fixed_from_double_int (double_int payloa
 /* Initialize from a decimal or hexadecimal string.  */
 
 void
-fixed_from_string (FIXED_VALUE_TYPE *f, const char *str, machine_mode mode)
+fixed_from_string (FIXED_VALUE_TYPE *f, const char *str, scalar_mode mode)
 {
   REAL_VALUE_TYPE real_value, fixed_value, base_value;
   unsigned int fbit;
@@ -803,7 +803,7 @@ fixed_compare (int icode, const FIXED_VA
Return true, if !SAT_P and overflow.  */
 
 bool
-fixed_convert (FIXED_VALUE_TYPE *f, machine_mode mode,
+fixed_convert (FIXED_VALUE_TYPE *f, scalar_mode mode,
const FIXED_VALUE_TYPE *a, bool sat_p)
 {
   bool overflow_p = false;
@@ -947,7 +947,7 @@ fixed_convert (FIXED_VALUE_TYPE *f, mach
Return true, if !SAT_P and overflow.  

[65/77] Add a SCALAR_TYPE_MODE macro

2017-07-13 Thread Richard Sandiford
This patch adds a SCALAR_TYPE_MODE macro, along the same lines as
SCALAR_INT_TYPE_MODE and SCALAR_FLOAT_TYPE_MODE.  It also adds
two instances of as_a <scalar_mode> to c_common_type, when converting
an unsigned fixed-point SCALAR_TYPE_MODE to the equivalent signed mode.
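
In sketch form (EXP is an assumed tree expression), the macro folds
the mode lookup and the scalar check into one step, so callers that
know the type is scalar get a scalar_mode directly:

  scalar_mode smode = SCALAR_TYPE_MODE (TREE_TYPE (exp));
  unsigned int bits = GET_MODE_BITSIZE (smode);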

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* tree.h (SCALAR_TYPE_MODE): New macro.
* expr.c (expand_expr_addr_expr_1): Use it.
(expand_expr_real_2): Likewise.
* fold-const.c (fold_convert_const_fixed_from_fixed): Likeise.
(fold_convert_const_fixed_from_int): Likewise.
(fold_convert_const_fixed_from_real): Likewise.
(native_encode_fixed): Likewise
(native_encode_complex): Likewise
(native_encode_vector): Likewise.
(native_interpret_fixed): Likewise.
(native_interpret_real): Likewise.
(native_interpret_complex): Likewise.
(native_interpret_vector): Likewise.
* omp-simd-clone.c (simd_clone_adjust_return_type): Likewise.
(simd_clone_adjust_argument_types): Likewise.
(simd_clone_init_simd_arrays): Likewise.
(simd_clone_adjust): Likewise.
* stor-layout.c (layout_type): Likewise.
* tree.c (build_minus_one_cst): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-inline.c (estimate_move_cost): Likewise.
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Likewise.
* tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise.
(vectorizable_reduction): Likewise.
* tree-vect-patterns.c (vect_recog_widen_mult_pattern): Likewise.
(vect_recog_mixed_size_cond_pattern): Likewise.
(check_bool_pattern): Likewise.
(adjust_bool_pattern): Likewise.
(search_type_for_mask_1): Likewise.
* tree-vect-slp.c (vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c (vectorizable_conversion): Likewise.
* ubsan.c (ubsan_encode_value): Likewise.
* varasm.c (output_constant): Likewise.

gcc/c-family/
* c-lex.c (interpret_fixed): Use SCALAR_TYPE_MODE.
* c-common.c (c_build_vec_perm_expr): Likewise.

gcc/c/
* c-typeck.c (build_binary_op): Use SCALAR_TYPE_MODE.
(c_common_type): Likewise.  Use as_a <scalar_mode> when setting
m1 and m2 to the signed equivalent of a fixed-point
SCALAR_TYPE_MODE.

gcc/cp/
* typeck.c (cp_build_binary_op): Use SCALAR_TYPE_MODE.

Index: gcc/tree.h
===
--- gcc/tree.h  2017-07-13 09:18:38.668812030 +0100
+++ gcc/tree.h  2017-07-13 09:18:54.005596227 +0100
@@ -1852,6 +1852,8 @@ #define TYPE_MODE_RAW(NODE) (TYPE_CHECK
 #define TYPE_MODE(NODE) \
   (VECTOR_TYPE_P (TYPE_CHECK (NODE)) \
? vector_type_mode (NODE) : (NODE)->type_common.mode)
+#define SCALAR_TYPE_MODE(NODE) \
+  (as_a <scalar_mode> (TYPE_CHECK (NODE)->type_common.mode))
 #define SCALAR_INT_TYPE_MODE(NODE) \
   (as_a <scalar_int_mode> (TYPE_CHECK (NODE)->type_common.mode))
 #define SCALAR_FLOAT_TYPE_MODE(NODE) \
Index: gcc/expr.c
===
--- gcc/expr.c  2017-07-13 09:18:53.273650396 +0100
+++ gcc/expr.c  2017-07-13 09:18:53.997596816 +0100
@@ -7758,7 +7758,7 @@ expand_expr_addr_expr_1 (tree exp, rtx t
 The expression is therefore always offset by the size of the
 scalar type.  */
   offset = 0;
-  bitpos = GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (exp)));
+  bitpos = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (TREE_TYPE (exp)));
   inner = TREE_OPERAND (exp, 0);
   break;
 
@@ -9436,7 +9436,7 @@ #define REDUCE_BIT_FIELD(expr)(reduce_b
{
  tree sel_type = TREE_TYPE (treeop2);
  machine_mode vmode
-   = mode_for_vector (TYPE_MODE (TREE_TYPE (sel_type)),
+   = mode_for_vector (SCALAR_TYPE_MODE (TREE_TYPE (sel_type)),
   TYPE_VECTOR_SUBPARTS (sel_type));
  gcc_assert (GET_MODE_CLASS (vmode) == MODE_VECTOR_INT);
  op2 = simplify_subreg (vmode, op2, TYPE_MODE (sel_type), 0);
Index: gcc/fold-const.c
===
--- gcc/fold-const.c2017-07-13 09:18:51.658771072 +0100
+++ gcc/fold-const.c2017-07-13 09:18:53.998596742 +0100
@@ -2058,8 +2058,8 @@ fold_convert_const_fixed_from_fixed (tre
   tree t;
   bool overflow_p;
 
-  overflow_p = fixed_convert (&value, TYPE_MODE (type), &TREE_FIXED_CST (arg1),
- TYPE_SATURATING (type));
+  overflow_p = fixed_convert (&value, SCALAR_TYPE_MODE (type),
+ &TREE_FIXED_CST (arg1), TYPE_SATURATING (type));
   t = build_fixed (type, value);
 
   /* Propagate overflow flags.  */
@@ -2087,7 +2087,7 @@ fold_convert_const_fixed_from_int (tree
   else
 di.high = TREE_INT_CST_ELT (arg1, 1);
 
-  overflow_p = fixed_convert_from_int 

[64/77] Add a scalar_mode class

2017-07-13 Thread Richard Sandiford
This patch adds a scalar_mode class that can hold any scalar mode,
specifically:

  - scalar integers
  - scalar floating-point values
  - scalar fractional modes
  - scalar accumulator modes
  - pointer bounds modes

To start with this patch uses this type for GET_MODE_INNER.
Later patches add more uses.
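
For example (a sketch; V4SImode is just an illustrative,
target-dependent vector mode), GET_MODE_INNER now returns a
scalar_mode, so element-wise code needs no separate scalar check:

  machine_mode mode = V4SImode;
  scalar_mode inner = GET_MODE_INNER (mode);	/* SImode here */
  gcc_assert (scalar_mode::includes_p (inner));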

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* coretypes.h (scalar_mode): New class.
* machmode.h (scalar_mode): Likewise.
(scalar_mode::includes_p): New function.
(mode_to_inner): Return a scalar_mode rather than a machine_mode.
* gdbhooks.py (build_pretty_printer): Handle scalar_mode.
* genmodes.c (get_mode_class): Handle remaining scalar modes.
* cfgexpand.c (expand_debug_expr): Use scalar_mode.
* expmed.c (store_bit_field_1): Likewise.
(extract_bit_field_1): Likewise.
* expr.c (write_complex_part): Likewise.
(read_complex_part): Likewise.
(emit_move_complex_push): Likewise.
(expand_expr_real_2): Likewise.
* function.c (assign_parm_setup_reg): Likewise.
(assign_parms_unsplit_complex): Likewise.
* optabs.c (expand_binop): Likewise.
* rtlanal.c (subreg_get_info): Likewise.
* simplify-rtx.c (simplify_immed_subreg): Likewise.
* varasm.c (output_constant_pool_2): Likewise.

Index: gcc/coretypes.h
===
--- gcc/coretypes.h 2017-07-13 09:18:28.587718194 +0100
+++ gcc/coretypes.h 2017-07-13 09:18:53.271650545 +0100
@@ -55,6 +55,7 @@ typedef const struct simple_bitmap_def *
 struct rtx_def;
 typedef struct rtx_def *rtx;
 typedef const struct rtx_def *const_rtx;
+class scalar_mode;
 class scalar_int_mode;
 class scalar_float_mode;
 template<typename T> class opt_mode;
@@ -317,6 +318,7 @@ #define rtx_insn struct _dont_use_rtx_in
 #define tree union _dont_use_tree_here_ *
 #define const_tree union _dont_use_tree_here_ *
 
+typedef struct scalar_mode scalar_mode;
 typedef struct scalar_int_mode scalar_int_mode;
 typedef struct scalar_float_mode scalar_float_mode;
 
Index: gcc/machmode.h
===
--- gcc/machmode.h  2017-07-13 09:18:41.680558844 +0100
+++ gcc/machmode.h  2017-07-13 09:18:53.274650323 +0100
@@ -410,6 +410,47 @@ scalar_float_mode::includes_p (machine_m
   return SCALAR_FLOAT_MODE_P (m);
 }
 
+/* Represents a machine mode that is known to be scalar.  */
+class scalar_mode
+{
+public:
+  typedef mode_traits<scalar_mode>::from_int from_int;
+
+  ALWAYS_INLINE scalar_mode () {}
+  ALWAYS_INLINE scalar_mode (from_int m) : m_mode (machine_mode (m)) {}
+  ALWAYS_INLINE scalar_mode (const scalar_int_mode &m) : m_mode (m) {}
+  ALWAYS_INLINE scalar_mode (const scalar_float_mode &m) : m_mode (m) {}
+  ALWAYS_INLINE scalar_mode (const scalar_int_mode_pod &m) : m_mode (m) {}
+  ALWAYS_INLINE operator machine_mode () const { return m_mode; }
+
+  static bool includes_p (machine_mode);
+
+protected:
+  machine_mode m_mode;
+};
+
+/* Return true if M represents some kind of scalar value.  */
+
+inline bool
+scalar_mode::includes_p (machine_mode m)
+{
+  switch (GET_MODE_CLASS (m))
+{
+case MODE_INT:
+case MODE_PARTIAL_INT:
+case MODE_FRACT:
+case MODE_UFRACT:
+case MODE_ACCUM:
+case MODE_UACCUM:
+case MODE_FLOAT:
+case MODE_DECIMAL_FLOAT:
+case MODE_POINTER_BOUNDS:
+  return true;
+default:
+  return false;
+}
+}
+
 /* Return the base GET_MODE_SIZE value for MODE.  */
 
 ALWAYS_INLINE unsigned short
@@ -441,14 +482,15 @@ mode_to_precision (machine_mode mode)
 
 /* Return the base GET_MODE_INNER value for MODE.  */
 
-ALWAYS_INLINE machine_mode
+ALWAYS_INLINE scalar_mode
 mode_to_inner (machine_mode mode)
 {
 #if GCC_VERSION >= 4001
-  return (machine_mode) (__builtin_constant_p (mode)
-? mode_inner_inline (mode) : mode_inner[mode]);
+  return scalar_mode::from_int (__builtin_constant_p (mode)
+   ? mode_inner_inline (mode)
+   : mode_inner[mode]);
 #else
-  return (machine_mode) mode_inner[mode];
+  return scalar_mode::from_int (mode_inner[mode]);
 #endif
 }
 
Index: gcc/gdbhooks.py
===
--- gcc/gdbhooks.py 2017-07-13 09:18:28.587718194 +0100
+++ gcc/gdbhooks.py 2017-07-13 09:18:53.273650396 +0100
@@ -549,7 +549,7 @@ def build_pretty_printer():
  'pod_mode', MachineModePrinter)
 pp.add_printer_for_types(['scalar_int_mode_pod'],
  'pod_mode', MachineModePrinter)
-for mode in 'scalar_int_mode', 'scalar_float_mode':
+for mode in 'scalar_mode', 'scalar_int_mode', 'scalar_float_mode':
 pp.add_printer_for_types([mode], mode, MachineModePrinter)
 
 return pp

[63/77] Simplifications after type switch

2017-07-13 Thread Richard Sandiford
This patch makes a few simplifications after the previous
mechanical machine_mode->scalar_int_mode change.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* expmed.c (extract_high_half): Use scalar_int_mode and remove
assertion.
(expmed_mult_highpart_optab): Likewise.
(expmed_mult_highpart): Likewise.

Index: gcc/expmed.c
===
--- gcc/expmed.c2017-07-13 09:18:51.649771750 +0100
+++ gcc/expmed.c2017-07-13 09:18:52.815684419 +0100
@@ -3611,14 +3611,11 @@ expand_mult_highpart_adjust (scalar_int_
 static rtx
 extract_high_half (scalar_int_mode mode, rtx op)
 {
-  machine_mode wider_mode;
-
   if (mode == word_mode)
 return gen_highpart (mode, op);
 
-  gcc_assert (!SCALAR_FLOAT_MODE_P (mode));
+  scalar_int_mode wider_mode = *GET_MODE_WIDER_MODE (mode);
 
-  wider_mode = *GET_MODE_WIDER_MODE (mode);
   op = expand_shift (RSHIFT_EXPR, wider_mode, op,
 GET_MODE_BITSIZE (mode), 0, 1);
   return convert_modes (mode, wider_mode, op, 0);
@@ -3632,15 +3629,13 @@ expmed_mult_highpart_optab (scalar_int_m
rtx target, int unsignedp, int max_cost)
 {
   rtx narrow_op1 = gen_int_mode (INTVAL (op1), mode);
-  machine_mode wider_mode;
   optab moptab;
   rtx tem;
   int size;
   bool speed = optimize_insn_for_speed_p ();
 
-  gcc_assert (!SCALAR_FLOAT_MODE_P (mode));
+  scalar_int_mode wider_mode = *GET_MODE_WIDER_MODE (mode);
 
-  wider_mode = *GET_MODE_WIDER_MODE (mode);
   size = GET_MODE_BITSIZE (mode);
 
   /* Firstly, try using a multiplication insn that only generates the needed
@@ -3746,7 +3741,6 @@ expmed_mult_highpart_optab (scalar_int_m
 expmed_mult_highpart (scalar_int_mode mode, rtx op0, rtx op1,
  rtx target, int unsignedp, int max_cost)
 {
-  machine_mode wider_mode = *GET_MODE_WIDER_MODE (mode);
   unsigned HOST_WIDE_INT cnst1;
   int extra_cost;
   bool sign_adjust = false;
@@ -3755,7 +3749,6 @@ expmed_mult_highpart (scalar_int_mode mo
   rtx tem;
   bool speed = optimize_insn_for_speed_p ();
 
-  gcc_assert (!SCALAR_FLOAT_MODE_P (mode));
   /* We can't support modes wider than HOST_BITS_PER_INT.  */
   gcc_assert (HWI_COMPUTABLE_MODE_P (mode));
 
@@ -3765,6 +3758,7 @@ expmed_mult_highpart (scalar_int_mode mo
  ??? We might be able to perform double-word arithmetic if
  mode == word_mode, however all the cost calculations in
  synth_mult etc. assume single-word operations.  */
+  scalar_int_mode wider_mode = *GET_MODE_WIDER_MODE (mode);
   if (GET_MODE_BITSIZE (wider_mode) > BITS_PER_WORD)
 return expmed_mult_highpart_optab (mode, op0, op1, target,
   unsignedp, max_cost);


[61/77] Use scalar_int_mode in the AArch64 port

2017-07-13 Thread Richard Sandiford
This patch makes the AArch64 port use scalar_int_mode in various places.
Other ports won't need this kind of change; we only need it for AArch64
because of the variable-sized SVE modes.

The only change in functionality is in the rtx_costs handling
of CONST_INT.  If the caller doesn't supply a mode, we now pass
word_mode rather than VOIDmode to aarch64_internal_mov_immediate.
aarch64_movw_imm will therefore not now truncate large constants
in this situation.
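
In sketch form, the rtx_costs change amounts to the following
(X and MODE assumed in scope; this is an approximation of the new
behaviour rather than the exact code):

  /* A CONST_INT has VOIDmode; when the caller supplies no mode,
     cost the constant in word_mode instead of truncating it.  */
  if (CONST_INT_P (x) && mode == VOIDmode)
    mode = word_mode;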

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* config/aarch64/aarch64-protos.h (aarch64_is_extend_from_extract):
Take a scalar_int_mode instead of a machine_mode.
(aarch64_mask_and_shift_for_ubfiz_p): Likewise.
(aarch64_move_imm): Likewise.
(aarch64_output_scalar_simd_mov_immediate): Likewise.
(aarch64_simd_scalar_immediate_valid_for_move): Likewise.
(aarch64_simd_attr_length_rglist): Delete.
* config/aarch64/aarch64.c (aarch64_is_extend_from_extract): Take
a scalar_int_mode instead of a machine_mode.
(aarch64_add_offset): Likewise.
(aarch64_internal_mov_immediate): Likewise
(aarch64_add_constant_internal): Likewise.
(aarch64_add_constant): Likewise.
(aarch64_movw_imm): Likewise.
(aarch64_move_imm): Likewise.
(aarch64_rtx_arith_op_extract_p): Likewise.
(aarch64_mask_and_shift_for_ubfiz_p): Likewise.
(aarch64_simd_scalar_immediate_valid_for_move): Likewise.
Remove assert that the mode isn't a vector.
(aarch64_output_scalar_simd_mov_immediate): Likewise.
(aarch64_expand_mov_immediate): Update calls after above changes.
(aarch64_output_casesi): Use as_a <scalar_int_mode>.
(aarch64_and_bitmask_imm): Check for scalar integer modes.
(aarch64_strip_extend): Likewise.
(aarch64_extr_rtx_p): Likewise.
(aarch64_rtx_costs): Likewise, using word_mode as the mode of
a CONST_INT when the mode parameter is VOIDmode.

Index: gcc/config/aarch64/aarch64-protos.h
===
--- gcc/config/aarch64/aarch64-protos.h 2017-07-05 16:29:19.581861907 +0100
+++ gcc/config/aarch64/aarch64-protos.h 2017-07-13 09:18:50.737840686 +0100
@@ -330,22 +330,21 @@ bool aarch64_function_arg_regno_p (unsig
 bool aarch64_fusion_enabled_p (enum aarch64_fusion_pairs);
 bool aarch64_gen_movmemqi (rtx *);
 bool aarch64_gimple_fold_builtin (gimple_stmt_iterator *);
-bool aarch64_is_extend_from_extract (machine_mode, rtx, rtx);
+bool aarch64_is_extend_from_extract (scalar_int_mode, rtx, rtx);
 bool aarch64_is_long_call_p (rtx);
 bool aarch64_is_noplt_call_p (rtx);
 bool aarch64_label_mentioned_p (rtx);
 void aarch64_declare_function_name (FILE *, const char*, tree);
 bool aarch64_legitimate_pic_operand_p (rtx);
-bool aarch64_mask_and_shift_for_ubfiz_p (machine_mode, rtx, rtx);
+bool aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode, rtx, rtx);
 bool aarch64_modes_tieable_p (machine_mode mode1,
  machine_mode mode2);
 bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
-bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
+bool aarch64_move_imm (HOST_WIDE_INT, scalar_int_mode);
 bool aarch64_mov_operand_p (rtx, machine_mode);
-int aarch64_simd_attr_length_rglist (machine_mode);
 rtx aarch64_reverse_mask (machine_mode);
 bool aarch64_offset_7bit_signed_scaled_p (machine_mode, HOST_WIDE_INT);
-char *aarch64_output_scalar_simd_mov_immediate (rtx, machine_mode);
+char *aarch64_output_scalar_simd_mov_immediate (rtx, scalar_int_mode);
 char *aarch64_output_simd_mov_immediate (rtx, machine_mode, unsigned);
 bool aarch64_pad_arg_upward (machine_mode, const_tree);
 bool aarch64_pad_reg_upward (machine_mode, const_tree, bool);
@@ -355,7 +354,7 @@ bool aarch64_simd_check_vect_par_cnst_ha
bool high);
 bool aarch64_simd_imm_scalar_p (rtx x, machine_mode mode);
 bool aarch64_simd_imm_zero_p (rtx, machine_mode);
-bool aarch64_simd_scalar_immediate_valid_for_move (rtx, machine_mode);
+bool aarch64_simd_scalar_immediate_valid_for_move (rtx, scalar_int_mode);
 bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool);
 bool aarch64_simd_valid_immediate (rtx, machine_mode, bool,
   struct simd_immediate_info *);
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2017-07-13 09:18:31.691429056 +0100
+++ gcc/config/aarch64/aarch64.c2017-07-13 09:18:50.738840609 +0100
@@ -1216,7 +1216,7 @@ aarch64_is_noplt_call_p (rtx sym)
 
(extract:MODE (mult (reg) (MULT_IMM)) (EXTRACT_IMM) (const_int 0)).  */
 bool
-aarch64_is_extend_from_extract (machine_mode mode, rtx mult_imm,
+aarch64_is_extend_from_extract (scalar_int_mode mode, 

[62/77] Big machine_mode to scalar_int_mode replacement

2017-07-13 Thread Richard Sandiford
This patch changes the types of various things from machine_mode
to scalar_int_mode, in cases where (after previous patches)
simply changing the type is enough on its own.  The patch does
nothing other than that.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* builtins.h (builtin_strncpy_read_str): Take a scalar_int_mode
instead of a machine_mode.
(builtin_memset_read_str): Likewise.
* builtins.c (c_readstr): Likewise.
(builtin_memcpy_read_str): Likewise.
(builtin_strncpy_read_str): Likewise.
(builtin_memset_read_str): Likewise.
(builtin_memset_gen_str): Likewise.
(expand_builtin_signbit): Use scalar_int_mode for local variables.
* cfgexpand.c (convert_debug_memory_address): Take a scalar_int_mode
instead of a machine_mode.
* combine.c (simplify_if_then_else): Use scalar_int_mode for local
variables.
(make_extraction): Likewise.
(try_widen_shift_mode): Take and return scalar_int_modes instead
of machine_modes.
* config/aarch64/aarch64.c (aarch64_libgcc_cmp_return_mode): Return
a scalar_int_mode instead of a machine_mode.
* config/avr/avr.c (avr_addr_space_address_mode): Likewise.
(avr_addr_space_pointer_mode): Likewise.
* config/cr16/cr16.c (cr16_unwind_word_mode): Likewise.
* config/msp430/msp430.c (msp430_addr_space_pointer_mode): Likewise.
(msp430_unwind_word_mode): Likewise.
* config/spu/spu.c (spu_unwind_word_mode): Likewise.
(spu_addr_space_pointer_mode): Likewise.
(spu_addr_space_address_mode): Likewise.
(spu_libgcc_cmp_return_mode): Likewise.
(spu_libgcc_shift_count_mode): Likewise.
* config/rl78/rl78.c (rl78_addr_space_address_mode): Likewise.
(rl78_addr_space_pointer_mode): Likewise.
(rl78_unwind_word_mode): Likewise.
(rl78_valid_pointer_mode): Take a scalar_int_mode instead of a
machine_mode.
* config/alpha/alpha.c (vms_valid_pointer_mode): Likewise.
* config/ia64/ia64.c (ia64_vms_valid_pointer_mode): Likewise.
* config/mips/mips.c (mips_mode_rep_extended): Likewise.
(mips_valid_pointer_mode): Likewise.
* config/tilegx/tilegx.c (tilegx_mode_rep_extended): Likewise.
* config/ft32/ft32.c (ft32_valid_pointer_mode): Likewise.
(ft32_addr_space_pointer_mode): Return a scalar_int_mode instead
of a machine_mode.
(ft32_addr_space_address_mode): Likewise.
* config/m32c/m32c.c (m32c_valid_pointer_mode): Take a
scalar_int_mode instead of a machine_mode.
(m32c_addr_space_pointer_mode): Return a scalar_int_mode instead
of a machine_mode.
(m32c_addr_space_address_mode): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_abi_word_mode): Likewise.
(rs6000_eh_return_filter_mode): Likewise.
* config/rs6000/rs6000.c (rs6000_abi_word_mode): Likewise.
(rs6000_eh_return_filter_mode): Likewise.
* config/s390/s390.c (s390_libgcc_cmp_return_mode): Likewise.
(s390_libgcc_shift_count_mode): Likewise.
(s390_unwind_word_mode): Likewise.
(s390_valid_pointer_mode): Take a scalar_int_mode rather than a
machine_mode.
* target.def (mode_rep_extended): Likewise.
(valid_pointer_mode): Likewise.
(addr_space.valid_pointer_mode): Likewise.
(eh_return_filter_mode): Return a scalar_int_mode rather than
a machine_mode.
(libgcc_cmp_return_mode): Likewise.
(libgcc_shift_count_mode): Likewise.
(unwind_word_mode): Likewise.
(addr_space.pointer_mode): Likewise.
(addr_space.address_mode): Likewise.
* doc/tm.texi: Regenerate.
* dojump.c (prefer_and_bit_test): Take a scalar_int_mode rather than
a machine_mode.
(do_jump): Use scalar_int_mode for local variables.
* dwarf2cfi.c (init_return_column_size): Take a scalar_int_mode
rather than a machine_mode.
* dwarf2out.c (convert_descriptor_to_mode): Likewise.
(scompare_loc_descriptor_wide): Likewise.
(scompare_loc_descriptor_narrow): Likewise.
* emit-rtl.c (adjust_address_1): Use scalar_int_mode for local
variables.
* except.c (sjlj_emit_dispatch_table): Likewise.
(expand_builtin_eh_copy_values): Likewise.
* explow.c (convert_memory_address_addr_space_1): Likewise.
Take a scalar_int_mode rather than a machine_mode.
(convert_memory_address_addr_space): Take a scalar_int_mode rather
than a machine_mode.
(memory_address_addr_space): Use scalar_int_mode for local variables.
* expmed.h (expand_mult_highpart_adjust): Take a scalar_int_mode
rather than a machine_mode.
* expmed.c (mask_rtx): Likewise.
 

[59/77] Add a rtx_jump_table_data::get_data_mode helper

2017-07-13 Thread Richard Sandiford
This patch adds a helper function to get the mode of the addresses
or offsets in a jump table.  It also changes the final.c code to use
rtx_jump_table_data over rtx or rtx_insn in cases where it needed
to use the new helper.  This in turn meant adding a safe_dyn_cast
equivalent of safe_as_a, to cope with null NEXT_INSNs.
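
Combining the two new helpers looks roughly like this (LABEL is an
assumed rtx_code_label *):

  if (rtx_jump_table_data *table = jump_table_for_label (label))
    {
      /* The table's mode is statically known to be a scalar int.  */
      scalar_int_mode mode = table->get_data_mode ();
      int entry_size = GET_MODE_SIZE (mode);
      /* ... use ENTRY_SIZE for alignment/length calculations ...  */
    }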

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* is-a.h (safe_dyn_cast): New function.
* rtl.h (rtx_jump_table_data::get_data_mode): New function.
(jump_table_for_label): Likewise.
* final.c (final_addr_vec_align): Take an rtx_jump_table_data *
instead of an rtx_insn *.
(shorten_branches): Use dyn_cast instead of LABEL_P and
JUMP_TABLE_DATA_P.  Use jump_table_for_label and
rtx_jump_table_data::get_data_mode.
(final_scan_insn): Likewise.

Index: gcc/is-a.h
===
--- gcc/is-a.h  2017-02-23 19:54:02.0 +
+++ gcc/is-a.h  2017-07-13 09:18:49.947900834 +0100
TYPE dyn_cast <TYPE> (pointer)
 Note that we have converted two sets of assertions in the calls to varpool
 into safe and efficient use of a variable.
 
+TYPE safe_dyn_cast <TYPE> (pointer)
+
+Like dyn_cast <TYPE> (pointer), except that it accepts null pointers
+and returns null results for them.
+
 
 If you use these functions and get a 'inline function not defined' or a
 'missing symbol' error message for 'is_a_helper<>::test', it means that
@@ -222,4 +227,13 @@ dyn_cast (U *p)
return static_cast <T> (0);
 }
 
+/* Similar to dyn_cast, except that the pointer may be null.  */
+
+template <typename T, typename U>
+inline T
+safe_dyn_cast (U *p)
+{
+  return p ? dyn_cast <T> (p) : 0;
+}
+
 #endif  /* GCC_IS_A_H  */
Index: gcc/rtl.h
===
--- gcc/rtl.h   2017-07-13 09:18:46.226190608 +0100
+++ gcc/rtl.h   2017-07-13 09:18:49.948900757 +0100
@@ -634,6 +634,7 @@ class GTY(()) rtx_jump_table_data : publ
  This method gets the underlying vec.  */
 
   inline rtvec get_labels () const;
+  inline scalar_int_mode get_data_mode () const;
 };
 
 class GTY(()) rtx_barrier : public rtx_insn
@@ -1477,6 +1478,24 @@ inline rtvec rtx_jump_table_data::get_la
 return XVEC (pat, 1); /* presumably an ADDR_DIFF_VEC */
 }
 
+/* Return the mode of the data in the table, which is always a scalar
+   integer.  */
+
+inline scalar_int_mode
+rtx_jump_table_data::get_data_mode () const
+{
+  return as_a <scalar_int_mode> (GET_MODE (PATTERN (this)));
+}
+
+/* If LABEL is followed by a jump table, return the table, otherwise
+   return null.  */
+
+inline rtx_jump_table_data *
+jump_table_for_label (const rtx_code_label *label)
+{
+  return safe_dyn_cast <rtx_jump_table_data *> (NEXT_INSN (label));
+}
+
 #define RTX_FRAME_RELATED_P(RTX)   \
   (RTL_FLAG_CHECK6 ("RTX_FRAME_RELATED_P", (RTX), DEBUG_INSN, INSN,\
CALL_INSN, JUMP_INSN, BARRIER, SET)->frame_related)
Index: gcc/final.c
===
--- gcc/final.c 2017-06-07 07:42:15.423295833 +0100
+++ gcc/final.c 2017-07-13 09:18:49.947900834 +0100
@@ -217,9 +217,6 @@ static void leaf_renumber_regs (rtx_insn
 #if HAVE_cc0
 static int alter_cond (rtx);
 #endif
-#ifndef ADDR_VEC_ALIGN
-static int final_addr_vec_align (rtx_insn *);
-#endif
 static int align_fuzz (rtx, rtx, int, unsigned);
 static void collect_fn_hard_reg_usage (void);
 static tree get_call_fndecl (rtx_insn *);
@@ -517,9 +514,9 @@ default_jump_align_max_skip (rtx_insn *i
 
 #ifndef ADDR_VEC_ALIGN
 static int
-final_addr_vec_align (rtx_insn *addr_vec)
+final_addr_vec_align (rtx_jump_table_data *addr_vec)
 {
-  int align = GET_MODE_SIZE (GET_MODE (PATTERN (addr_vec)));
+  int align = GET_MODE_SIZE (addr_vec->get_data_mode ());
 
   if (align > BIGGEST_ALIGNMENT / BITS_PER_UNIT)
 align = BIGGEST_ALIGNMENT / BITS_PER_UNIT;
@@ -936,45 +933,41 @@ #define MAX_CODE_ALIGN 16
   if (INSN_P (insn))
continue;
 
-  if (LABEL_P (insn))
+  if (rtx_code_label *label = dyn_cast <rtx_code_label *> (insn))
{
- rtx_insn *next;
- bool next_is_jumptable;
-
  /* Merge in alignments computed by compute_alignments.  */
- log = LABEL_TO_ALIGNMENT (insn);
+ log = LABEL_TO_ALIGNMENT (label);
  if (max_log < log)
{
  max_log = log;
- max_skip = LABEL_TO_MAX_SKIP (insn);
+ max_skip = LABEL_TO_MAX_SKIP (label);
}
 
- next = next_nonnote_insn (insn);
- next_is_jumptable = next && JUMP_TABLE_DATA_P (next);
- if (!next_is_jumptable)
+ rtx_jump_table_data *table = jump_table_for_label (label);
+ if (!table)
{
- log = LABEL_ALIGN (insn);
+ log = LABEL_ALIGN (label);
  if (max_log < log)
  

[60/77] Pass scalar_int_modes to do_jump_by_parts_*

2017-07-13 Thread Richard Sandiford
The callers of do_jump_by_parts_* had already established
that the modes were MODE_INTs, so this patch passes the
modes down as scalar_int_modes.
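
The caller pattern, in sketch form (mirroring the do_jump_1 hunks
below), is that is_int_mode both performs the class check and
supplies the scalar_int_mode that is now passed down:

  scalar_int_mode int_mode;
  if (is_int_mode (TYPE_MODE (TREE_TYPE (op0)), &int_mode)
      && !can_compare_p (EQ, int_mode, ccp_jump))
    do_jump_by_parts_equality (int_mode, op0, op1, if_false_label,
			       if_true_label, prob);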

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* dojump.c (do_jump_by_parts_greater_rtx): Change the type of
the mode argument to scalar_int_mode.
(do_jump_by_parts_zero_rtx): Likewise.
(do_jump_by_parts_equality_rtx): Likewise.
(do_jump_by_parts_greater): Take a mode argument.
(do_jump_by_parts_equality): Likewise.
(do_jump_1): Update calls accordingly.

Index: gcc/dojump.c
===
--- gcc/dojump.c2017-07-13 09:18:38.657812970 +0100
+++ gcc/dojump.c2017-07-13 09:18:50.326871958 +0100
@@ -38,11 +38,12 @@ Software Foundation; either version 3, o
 #include "langhooks.h"
 
 static bool prefer_and_bit_test (machine_mode, int);
-static void do_jump_by_parts_greater (tree, tree, int,
+static void do_jump_by_parts_greater (scalar_int_mode, tree, tree, int,
  rtx_code_label *, rtx_code_label *,
  profile_probability);
-static void do_jump_by_parts_equality (tree, tree, rtx_code_label *,
-  rtx_code_label *, profile_probability);
+static void do_jump_by_parts_equality (scalar_int_mode, tree, tree,
+  rtx_code_label *, rtx_code_label *,
+  profile_probability);
static void do_compare_and_jump (tree, tree, enum rtx_code, enum rtx_code,
 rtx_code_label *, rtx_code_label *,
 profile_probability);
@@ -221,8 +222,8 @@ do_jump_1 (enum tree_code code, tree op0
   prob.invert ());
 else if (is_int_mode (TYPE_MODE (inner_type), &int_mode)
 && !can_compare_p (EQ, int_mode, ccp_jump))
- do_jump_by_parts_equality (op0, op1, if_false_label, if_true_label,
-prob);
+ do_jump_by_parts_equality (int_mode, op0, op1, if_false_label,
+if_true_label, prob);
 else
  do_compare_and_jump (op0, op1, EQ, EQ, if_false_label, if_true_label,
   prob);
@@ -242,8 +243,8 @@ do_jump_1 (enum tree_code code, tree op0
  do_jump (op0, if_false_label, if_true_label, prob);
 else if (is_int_mode (TYPE_MODE (inner_type), &int_mode)
 && !can_compare_p (NE, int_mode, ccp_jump))
- do_jump_by_parts_equality (op0, op1, if_true_label, if_false_label,
-prob.invert ());
+ do_jump_by_parts_equality (int_mode, op0, op1, if_true_label,
+if_false_label, prob.invert ());
 else
  do_compare_and_jump (op0, op1, NE, NE, if_false_label, if_true_label,
   prob);
@@ -254,7 +255,7 @@ do_jump_1 (enum tree_code code, tree op0
   mode = TYPE_MODE (TREE_TYPE (op0));
   if (is_int_mode (mode, &int_mode)
  && ! can_compare_p (LT, int_mode, ccp_jump))
-   do_jump_by_parts_greater (op0, op1, 1, if_false_label,
+   do_jump_by_parts_greater (int_mode, op0, op1, 1, if_false_label,
  if_true_label, prob);
   else
do_compare_and_jump (op0, op1, LT, LTU, if_false_label, if_true_label,
@@ -265,8 +266,8 @@ do_jump_1 (enum tree_code code, tree op0
   mode = TYPE_MODE (TREE_TYPE (op0));
   if (is_int_mode (mode, &int_mode)
  && ! can_compare_p (LE, int_mode, ccp_jump))
-   do_jump_by_parts_greater (op0, op1, 0, if_true_label, if_false_label,
- prob.invert ());
+   do_jump_by_parts_greater (int_mode, op0, op1, 0, if_true_label,
+ if_false_label, prob.invert ());
   else
do_compare_and_jump (op0, op1, LE, LEU, if_false_label, if_true_label,
 prob);
@@ -276,7 +277,7 @@ do_jump_1 (enum tree_code code, tree op0
   mode = TYPE_MODE (TREE_TYPE (op0));
   if (is_int_mode (mode, &int_mode)
  && ! can_compare_p (GT, int_mode, ccp_jump))
-   do_jump_by_parts_greater (op0, op1, 0, if_false_label,
+   do_jump_by_parts_greater (int_mode, op0, op1, 0, if_false_label,
  if_true_label, prob);
   else
do_compare_and_jump (op0, op1, GT, GTU, if_false_label, if_true_label,
@@ -287,8 +288,8 @@ do_jump_1 (enum tree_code code, tree op0
   mode = TYPE_MODE (TREE_TYPE (op0));
   if (is_int_mode (mode, &int_mode)
  && ! can_compare_p (GE, int_mode, ccp_jump))
-   do_jump_by_parts_greater (op0, op1, 1, if_true_label, if_false_label,
- prob.invert ());
+   

[58/77] Use scalar_int_mode in a try_combine optimisation

2017-07-13 Thread Richard Sandiford
This patch uses scalar_int_modes for:

  /* If I2 is setting a pseudo to a constant and I3 is setting some
 sub-part of it to another constant, merge them by making a new
 constant.  */

This was already implicit, but the danger with checking only
CONST_SCALAR_INT_P is that it can include CC values too.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* combine.c (try_combine): Use is_a <scalar_int_mode> when
trying to combine a full-register integer set with a subreg
integer set.

Index: gcc/combine.c
===
--- gcc/combine.c   2017-07-13 09:18:45.761227536 +0100
+++ gcc/combine.c   2017-07-13 09:18:49.526933168 +0100
@@ -2645,6 +2645,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
   rtx other_pat = 0;
   rtx new_other_notes;
   int i;
+  scalar_int_mode dest_mode, temp_mode;
 
   /* Immediately return if any of I0,I1,I2 are the same insn (I3 can
  never be).  */
@@ -2847,33 +2848,40 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
  constant.  */
   if (i1 == 0
   && (temp_expr = single_set (i2)) != 0
+  && is_a <scalar_int_mode> (GET_MODE (SET_DEST (temp_expr)), &temp_mode)
   && CONST_SCALAR_INT_P (SET_SRC (temp_expr))
   && GET_CODE (PATTERN (i3)) == SET
   && CONST_SCALAR_INT_P (SET_SRC (PATTERN (i3)))
   && reg_subword_p (SET_DEST (PATTERN (i3)), SET_DEST (temp_expr)))
 {
   rtx dest = SET_DEST (PATTERN (i3));
+  rtx temp_dest = SET_DEST (temp_expr);
   int offset = -1;
   int width = 0;
-  
+
   if (GET_CODE (dest) == ZERO_EXTRACT)
{
  if (CONST_INT_P (XEXP (dest, 1))
- && CONST_INT_P (XEXP (dest, 2)))
+ && CONST_INT_P (XEXP (dest, 2))
+ && is_a <scalar_int_mode> (GET_MODE (XEXP (dest, 0)),
+ &dest_mode))
{
  width = INTVAL (XEXP (dest, 1));
  offset = INTVAL (XEXP (dest, 2));
  dest = XEXP (dest, 0);
  if (BITS_BIG_ENDIAN)
-   offset = GET_MODE_PRECISION (GET_MODE (dest)) - width - offset;
+   offset = GET_MODE_PRECISION (dest_mode) - width - offset;
}
}
   else
{
  if (GET_CODE (dest) == STRICT_LOW_PART)
dest = XEXP (dest, 0);
- width = GET_MODE_PRECISION (GET_MODE (dest));
- offset = 0;
+ if (is_a <scalar_int_mode> (GET_MODE (dest), &dest_mode))
+   {
+ width = GET_MODE_PRECISION (dest_mode);
+ offset = 0;
+   }
}
 
   if (offset >= 0)
@@ -2882,9 +2890,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
  if (subreg_lowpart_p (dest))
;
  /* Handle the case where inner is twice the size of outer.  */
- else if (GET_MODE_PRECISION (GET_MODE (SET_DEST (temp_expr)))
-  == 2 * GET_MODE_PRECISION (GET_MODE (dest)))
-   offset += GET_MODE_PRECISION (GET_MODE (dest));
+ else if (GET_MODE_PRECISION (temp_mode)
+  == 2 * GET_MODE_PRECISION (dest_mode))
+   offset += GET_MODE_PRECISION (dest_mode);
  /* Otherwise give up for now.  */
  else
offset = -1;
@@ -2895,23 +2903,22 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
  rtx inner = SET_SRC (PATTERN (i3));
  rtx outer = SET_SRC (temp_expr);
 
- wide_int o
-   = wi::insert (rtx_mode_t (outer, GET_MODE (SET_DEST (temp_expr))),
- rtx_mode_t (inner, GET_MODE (dest)),
- offset, width);
+ wide_int o = wi::insert (rtx_mode_t (outer, temp_mode),
+  rtx_mode_t (inner, dest_mode),
+  offset, width);
 
  combine_merges++;
  subst_insn = i3;
  subst_low_luid = DF_INSN_LUID (i2);
  added_sets_2 = added_sets_1 = added_sets_0 = 0;
- i2dest = SET_DEST (temp_expr);
+ i2dest = temp_dest;
  i2dest_killed = dead_or_set_p (i2, i2dest);
 
  /* Replace the source in I2 with the new constant and make the
 resulting insn the new pattern for I3.  Then skip to where we
 validate the pattern.  Everything was set up above.  */
  SUBST (SET_SRC (temp_expr),
-immed_wide_int_const (o, GET_MODE (SET_DEST (temp_expr;
+immed_wide_int_const (o, temp_mode));
 
  newpat = PATTERN (i2);
 


[56/77] Use the more specific type when two modes are known to be equal

2017-07-13 Thread Richard Sandiford
This patch adjusts a couple of cases in which we had established
that two modes were equal and happened to be using the one with the
more general type instead of the one with the more specific type.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* expr.c (expand_expr_real_2): Use word_mode instead of innermode
when the two are known to be equal.

Index: gcc/expr.c
===
--- gcc/expr.c  2017-07-13 09:18:47.609081780 +0100
+++ gcc/expr.c  2017-07-13 09:18:48.752992795 +0100
@@ -8671,7 +8671,7 @@ #define REDUCE_BIT_FIELD(expr)(reduce_b
  rtx htem, hipart;
  op0 = expand_normal (treeop0);
  if (TREE_CODE (treeop1) == INTEGER_CST)
-   op1 = convert_modes (innermode, mode,
+   op1 = convert_modes (word_mode, mode,
 expand_normal (treeop1),
 TYPE_UNSIGNED (TREE_TYPE (treeop1)));
  else
@@ -8682,8 +8682,8 @@ #define REDUCE_BIT_FIELD(expr)(reduce_b
goto widen_mult_const;
  temp = expand_binop (mode, other_optab, op0, op1, target,
   unsignedp, OPTAB_LIB_WIDEN);
- hipart = gen_highpart (innermode, temp);
- htem = expand_mult_highpart_adjust (innermode, hipart,
+ hipart = gen_highpart (word_mode, temp);
+ htem = expand_mult_highpart_adjust (word_mode, hipart,
  op0, op1, hipart,
  zextend_p);
  if (htem != hipart)


[57/77] Use scalar_int_mode in expand_expr_addr_expr

2017-07-13 Thread Richard Sandiford
This patch rewrites the condition:

  if (tmode != address_mode && tmode != pointer_mode)
tmode = address_mode;

to the equivalent:

  tmode == pointer_mode ? pointer_mode : address_mode

The latter has the advantage that the result is naturally
a scalar_int_mode; a later mechanical patch makes it one.

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* expr.c (expand_expr_addr_expr): Add a new_tmode local variable
that is always either address_mode or pointer_mode.

Index: gcc/expr.c
===
--- gcc/expr.c  2017-07-13 09:18:48.752992795 +0100
+++ gcc/expr.c  2017-07-13 09:18:49.132963429 +0100
@@ -7902,20 +7902,21 @@ expand_expr_addr_expr (tree exp, rtx tar
   /* We can get called with some Weird Things if the user does silliness
  like "(short) ".  In that case, convert_memory_address won't do
  the right thing, so ignore the given target mode.  */
-  if (tmode != address_mode && tmode != pointer_mode)
-tmode = address_mode;
+  machine_mode new_tmode = (tmode == pointer_mode
+   ? pointer_mode
+   : address_mode);
 
   result = expand_expr_addr_expr_1 (TREE_OPERAND (exp, 0), target,
-   tmode, modifier, as);
+   new_tmode, modifier, as);
 
   /* Despite expand_expr claims concerning ignoring TMODE when not
  strictly convenient, stuff breaks if we don't honor it.  Note
  that combined with the above, we only do this for pointer modes.  */
   rmode = GET_MODE (result);
   if (rmode == VOIDmode)
-rmode = tmode;
-  if (rmode != tmode)
-result = convert_memory_address_addr_space (tmode, result, as);
+rmode = new_tmode;
+  if (rmode != new_tmode)
+result = convert_memory_address_addr_space (new_tmode, result, as);
 
   return result;
 }


[55/77] Use scalar_int_mode in simplify_const_unary_operation

2017-07-13 Thread Richard Sandiford
The main scalar integer block in simplify_const_unary_operation
had the condition:

  if (CONST_SCALAR_INT_P (op) && width > 0)

where "width > 0" was a roundabout way of testing != VOIDmode.
This patch replaces it with a check for a scalar_int_mode instead.
It also uses the number of bits in the input rather than the output
mode to determine the result of a "count ... bits in zero" operation.
(At the moment these modes have to be the same, but it still seems
conceptually wrong to use the number of bits in the output mode.)

The handling of float->integer ops also checked "width > 0",
but this was redundant with the earlier check for MODE_INT.
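
The reason "width > 0" worked as a VOIDmode test is that
GET_MODE_PRECISION (VOIDmode) is 0; the new shape makes the intent
explicit (a sketch of the structure below):

  scalar_int_mode result_mode;
  if (CONST_SCALAR_INT_P (op)
      && is_a <scalar_int_mode> (mode, &result_mode))
    {
      /* WIDTH is necessarily nonzero now that VOIDmode is excluded.  */
      unsigned int width = GET_MODE_PRECISION (result_mode);
    }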

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* simplify-rtx.c (simplify_const_unary_operation): Use
is_a <scalar_int_mode> instead of checking for a nonzero
precision.  Forcibly convert op_mode to a scalar_int_mode
in that case.  More clearly differentiate the operand and
result modes and use the former when deciding what the value
of a count-bits operation should be.  Use is_int_mode instead
of checking for a MODE_INT.  Remove redundant check for whether
this mode has a zero precision.

Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c  2017-07-13 09:18:39.593733475 +0100
+++ gcc/simplify-rtx.c  2017-07-13 09:18:48.356023575 +0100
@@ -1685,7 +1685,7 @@ simplify_unary_operation_1 (enum rtx_cod
 simplify_const_unary_operation (enum rtx_code code, machine_mode mode,
rtx op, machine_mode op_mode)
 {
-  unsigned int width = GET_MODE_PRECISION (mode);
+  scalar_int_mode result_mode;
 
   if (code == VEC_DUPLICATE)
 {
@@ -1800,10 +1800,13 @@ simplify_const_unary_operation (enum rtx
   return const_double_from_real_value (d, mode);
 }
 
-  if (CONST_SCALAR_INT_P (op) && width > 0)
+  if (CONST_SCALAR_INT_P (op) && is_a <scalar_int_mode> (mode, &result_mode))
 {
+  unsigned int width = GET_MODE_PRECISION (result_mode);
   wide_int result;
-  machine_mode imode = op_mode == VOIDmode ? mode : op_mode;
+  scalar_int_mode imode = (op_mode == VOIDmode
+  ? result_mode
+  : as_a <scalar_int_mode> (op_mode));
   rtx_mode_t op0 = rtx_mode_t (op, imode);
   int int_value;
 
@@ -1832,35 +1835,35 @@ simplify_const_unary_operation (enum rtx
  break;
 
case FFS:
- result = wi::shwi (wi::ffs (op0), mode);
+ result = wi::shwi (wi::ffs (op0), result_mode);
  break;
 
case CLZ:
  if (wi::ne_p (op0, 0))
int_value = wi::clz (op0);
- else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, int_value))
-   int_value = GET_MODE_PRECISION (mode);
- result = wi::shwi (int_value, mode);
+ else if (! CLZ_DEFINED_VALUE_AT_ZERO (imode, int_value))
+   int_value = GET_MODE_PRECISION (imode);
+ result = wi::shwi (int_value, result_mode);
  break;
 
case CLRSB:
- result = wi::shwi (wi::clrsb (op0), mode);
+ result = wi::shwi (wi::clrsb (op0), result_mode);
  break;
 
case CTZ:
  if (wi::ne_p (op0, 0))
int_value = wi::ctz (op0);
- else if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, int_value))
-   int_value = GET_MODE_PRECISION (mode);
- result = wi::shwi (int_value, mode);
+ else if (! CTZ_DEFINED_VALUE_AT_ZERO (imode, int_value))
+   int_value = GET_MODE_PRECISION (imode);
+ result = wi::shwi (int_value, result_mode);
  break;
 
case POPCOUNT:
- result = wi::shwi (wi::popcount (op0), mode);
+ result = wi::shwi (wi::popcount (op0), result_mode);
  break;
 
case PARITY:
- result = wi::shwi (wi::parity (op0), mode);
+ result = wi::shwi (wi::parity (op0), result_mode);
  break;
 
case BSWAP:
@@ -1881,7 +1884,7 @@ simplify_const_unary_operation (enum rtx
  return 0;
}
 
-  return immed_wide_int_const (result, mode);
+  return immed_wide_int_const (result, result_mode);
 }
 
   else if (CONST_DOUBLE_AS_FLOAT_P (op) 
@@ -1941,9 +1944,9 @@ simplify_const_unary_operation (enum rtx
 }
   else if (CONST_DOUBLE_AS_FLOAT_P (op)
   && SCALAR_FLOAT_MODE_P (GET_MODE (op))
-  && GET_MODE_CLASS (mode) == MODE_INT
-  && width > 0)
+  && is_int_mode (mode, &result_mode))
 {
+  unsigned int width = GET_MODE_PRECISION (result_mode);
   /* Although the overflow semantics of RTL's FIX and UNSIGNED_FIX
 operators are intentionally left unspecified (to ease implementation
 by target backends), for consistency, this routine implements the


[54/77] Add explicit int checks for alternative optab implementations

2017-07-13 Thread Richard Sandiford
expand_unop can expand narrow clz, clrsb, ctz, bswap, parity and
ffs operations using optabs for wider modes.  These expansions
apply only to scalar integer modes (and not for example to vectors),
so the patch adds explicit checks for that.
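
As a worked example of what widen_leading computes, take a QImode CLZ
implemented via an HImode instruction, with x = 0x40:

  clz_hi (0x0040) = 9	/* 9 leading zeros in 16 bits */
  9 - (16 - 8)    = 1	/* subtract the extra width */
  clz_qi (0x40)   = 1	/* the correct 8-bit answer */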

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* optabs.c (widen_leading): Change the type of the mode argument
to scalar_int_mode.  Use opt_scalar_int_mode for the mode iterator.
(widen_bswap): Likewise.
(expand_parity): Likewise.
(expand_ctz): Change the type of the mode argument to scalar_int_mode.
(expand_ffs): Likewise.
(expand_unop): Check for scalar integer modes before calling the
above routines.

Index: gcc/optabs.c
===
--- gcc/optabs.c2017-07-13 09:18:46.703152916 +0100
+++ gcc/optabs.c2017-07-13 09:18:48.024049315 +0100
@@ -2128,39 +2128,36 @@ expand_simple_unop (machine_mode mode, e
A similar operation can be used for clrsb.  UNOPTAB says which operation
we are trying to expand.  */
 static rtx
-widen_leading (machine_mode mode, rtx op0, rtx target, optab unoptab)
+widen_leading (scalar_int_mode mode, rtx op0, rtx target, optab unoptab)
 {
-  enum mode_class mclass = GET_MODE_CLASS (mode);
-  if (CLASS_HAS_WIDER_MODES_P (mclass))
+  opt_scalar_int_mode wider_mode_iter;
+  FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
 {
-  machine_mode wider_mode;
-  FOR_EACH_WIDER_MODE (wider_mode, mode)
+  scalar_int_mode wider_mode = *wider_mode_iter;
+  if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
{
- if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
-   {
- rtx xop0, temp;
- rtx_insn *last;
+ rtx xop0, temp;
+ rtx_insn *last;
 
- last = get_last_insn ();
+ last = get_last_insn ();
 
- if (target == 0)
-   target = gen_reg_rtx (mode);
- xop0 = widen_operand (op0, wider_mode, mode,
-   unoptab != clrsb_optab, false);
- temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX,
- unoptab != clrsb_optab);
- if (temp != 0)
-   temp = expand_binop
- (wider_mode, sub_optab, temp,
-  gen_int_mode (GET_MODE_PRECISION (wider_mode)
-- GET_MODE_PRECISION (mode),
-wider_mode),
-  target, true, OPTAB_DIRECT);
- if (temp == 0)
-   delete_insns_since (last);
+ if (target == 0)
+   target = gen_reg_rtx (mode);
+ xop0 = widen_operand (op0, wider_mode, mode,
+   unoptab != clrsb_optab, false);
+ temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX,
+ unoptab != clrsb_optab);
+ if (temp != 0)
+   temp = expand_binop
+ (wider_mode, sub_optab, temp,
+  gen_int_mode (GET_MODE_PRECISION (wider_mode)
+- GET_MODE_PRECISION (mode),
+wider_mode),
+  target, true, OPTAB_DIRECT);
+ if (temp == 0)
+   delete_insns_since (last);
 
- return temp;
-   }
+ return temp;
}
 }
   return 0;
@@ -2294,22 +2291,20 @@ expand_doubleword_parity (machine_mode m
as
(lshiftrt:wide (bswap:wide x) ((width wide) - (width narrow))).  */
 static rtx
-widen_bswap (machine_mode mode, rtx op0, rtx target)
+widen_bswap (scalar_int_mode mode, rtx op0, rtx target)
 {
-  enum mode_class mclass = GET_MODE_CLASS (mode);
-  machine_mode wider_mode;
   rtx x;
   rtx_insn *last;
+  opt_scalar_int_mode wider_mode_iter;
 
-  if (!CLASS_HAS_WIDER_MODES_P (mclass))
-return NULL_RTX;
+  FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
+if (optab_handler (bswap_optab, *wider_mode_iter) != CODE_FOR_nothing)
+  break;
 
-  FOR_EACH_WIDER_MODE (wider_mode, mode)
-if (optab_handler (bswap_optab, wider_mode) != CODE_FOR_nothing)
-  goto found;
-  return NULL_RTX;
+  if (!wider_mode_iter.exists ())
+return NULL_RTX;
 
- found:
+  scalar_int_mode wider_mode = *wider_mode_iter;
   last = get_last_insn ();
 
   x = widen_operand (op0, wider_mode, mode, true, true);
@@ -2360,42 +2355,40 @@ expand_doubleword_bswap (machine_mode mo
 /* Try calculating (parity x) as (and (popcount x) 1), where
popcount can also be done in a wider mode.  */
 static rtx
-expand_parity (machine_mode mode, rtx op0, rtx target)
+expand_parity (scalar_int_mode mode, rtx op0, rtx target)
 {
   enum mode_class mclass = GET_MODE_CLASS (mode);
-  if (CLASS_HAS_WIDER_MODES_P (mclass))
+  opt_scalar_int_mode wider_mode_iter;
+  

[53/77] Pass a mode to const_scalar_mask_from_tree

2017-07-13 Thread Richard Sandiford
The caller of const_scalar_mask_from_tree has proven that
the mode is a MODE_INT, so this patch passes it down as a
scalar_int_mode.  It also expands the comment a little.
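
The shape of the refactoring is "prove the mode class once at the
caller, then pass the proven type down".  A minimal sketch of that
pattern with hypothetical stand-in types (not the real GCC classes):

#include <cassert>

/* Stand-ins for the GCC mode types, for illustration only.  */
struct scalar_int_mode { unsigned precision; };
struct machine_mode { bool is_int; unsigned precision; };

/* Caller-side check in the style of is_int_mode: narrow the general
   mode to a scalar integer mode exactly once.  */
static bool
is_int_mode (machine_mode m, scalar_int_mode *out)
{
  if (!m.is_int)
    return false;
  out->precision = m.precision;
  return true;
}

/* The callee takes the proven scalar_int_mode instead of re-deriving
   it, mirroring the new const_scalar_mask_from_tree signature.  */
static unsigned
mask_precision (scalar_int_mode mode)
{
  return mode.precision;
}

int
main ()
{
  machine_mode m = { true, 32 };
  scalar_int_mode int_mode;
  if (is_int_mode (m, &int_mode))
    assert (mask_precision (int_mode) == 32);
  return 0;
}
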

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* expr.c (const_scalar_mask_from_tree): Add a mode argument.
Expand commentary.
(expand_expr_real_1): Update call accordingly.

Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2017-07-13 09:18:46.702152995 +0100
+++ gcc/expr.c	2017-07-13 09:18:47.609081780 +0100
@@ -99,7 +99,7 @@ static void emit_single_push_insn (machi
 static void do_tablejump (rtx, machine_mode, rtx, rtx, rtx,
  profile_probability);
 static rtx const_vector_from_tree (tree);
-static rtx const_scalar_mask_from_tree (tree);
+static rtx const_scalar_mask_from_tree (scalar_int_mode, tree);
 static tree tree_expr_size (const_tree);
 static HOST_WIDE_INT int_expr_size (tree);
 
@@ -9962,7 +9962,7 @@ expand_expr_real_1 (tree exp, rtx target
if (is_int_mode (mode, &int_mode))
  {
if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (exp)))
- return const_scalar_mask_from_tree (exp);
+ return const_scalar_mask_from_tree (int_mode, exp);
else
  {
tree type_for_mode
@@ -11717,12 +11717,12 @@ const_vector_mask_from_tree (tree exp)
   return gen_rtx_CONST_VECTOR (mode, v);
 }
 
-/* Return a CONST_INT rtx representing vector mask for
-   a VECTOR_CST of booleans.  */
+/* EXP is a VECTOR_CST in which each element is either all-zeros or all-ones.
+   Return a constant scalar rtx of mode MODE in which bit X is set if element
+   X of EXP is nonzero.  */
 static rtx
-const_scalar_mask_from_tree (tree exp)
+const_scalar_mask_from_tree (scalar_int_mode mode, tree exp)
 {
-  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
   wide_int res = wi::zero (GET_MODE_PRECISION (mode));
   tree elt;
   unsigned i;
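
(To make the expanded comment concrete: bit X of the result is set iff
element X of the boolean vector is nonzero.  A standalone model of
that behaviour, with plain ints standing in for the VECTOR_CST
elements — an illustration only, not code from the patch:

#include <cassert>
#include <cstdint>

static uint32_t
mask_from_bool_elements (const int *elts, unsigned nelts)
{
  uint32_t mask = 0;
  for (unsigned i = 0; i < nelts; i++)
    if (elts[i] != 0)                   /* element is all-ones */
      mask |= uint32_t (1) << i;        /* set bit I of the scalar mask */
  return mask;
}

int
main ()
{
  const int v[4] = { -1, 0, -1, -1 };
  assert (mask_from_bool_elements (v, 4) == 0xd);   /* 0b1101 */
  return 0;
}
)
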


[52/77] Use scalar_int_mode in extract/store_bit_field

2017-07-13 Thread Richard Sandiford
After a certain point, extract_bit_field and store_bit_field
ensure that they're dealing with integer modes or BLKmode MEMs.
This patch uses scalar_int_mode and opt_scalar_int_mode for
those parts.
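
For context, the opt_scalar_int_mode idiom the patch leans on is an
optional: it either holds a scalar integer mode or is empty, the empty
case here being a BLKmode MEM.  A minimal sketch with hypothetical
stand-in types, not the real GCC classes:

#include <cassert>

struct scalar_int_mode { unsigned precision; };

struct opt_scalar_int_mode
{
  bool set;
  scalar_int_mode value;
  bool exists () const { return set; }
  scalar_int_mode require () const { assert (set); return value; }
};

/* A callee in the style of the new store_fixed_bit_field: OP0_MODE is
   defined for integer-mode operands, empty for BLKmode MEMs.  */
static unsigned
container_precision (opt_scalar_int_mode op0_mode, unsigned mem_fallback)
{
  if (op0_mode.exists ())
    return op0_mode.require ().precision;   /* integer-mode operand */
  return mem_fallback;                      /* BLKmode MEM: pick later */
}

int
main ()
{
  opt_scalar_int_mode simode = { true, { 32 } };
  opt_scalar_int_mode blk = { false, { 0 } };
  assert (container_precision (simode, 8) == 32);
  assert (container_precision (blk, 8) == 8);
  return 0;
}

Passing the optional alongside the rtx makes the BLKmode case explicit
in the signature instead of being re-tested via GET_MODE (op0).
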

2017-07-13  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* expmed.c (store_bit_field_using_insv): Add op0_mode and
value_mode arguments.  Use scalar_int_mode internally.
(store_bit_field_1): Rename the new integer mode from imode
to op0_mode and use it instead of GET_MODE (op0).  Update calls
to store_split_bit_field, store_bit_field_using_insv and
store_fixed_bit_field.
(store_fixed_bit_field): Add op0_mode and value_mode arguments.
Use scalar_int_mode internally.  Use a bit count rather than a mode
when calculating the largest bit size for get_best_mode.
Update calls to store_split_bit_field and store_fixed_bit_field_1.
(store_fixed_bit_field_1): Add mode and value_mode arguments.
Remove assertion that OP0 has a scalar integer mode.
(store_split_bit_field): Add op0_mode and value_mode arguments.
Update calls to extract_fixed_bit_field.
(extract_bit_field_using_extv): Add an op0_mode argument.
Use scalar_int_mode internally.
(extract_bit_field_1): Rename the new integer mode from imode to
op0_mode and use it instead of GET_MODE (op0).  Update calls to
extract_split_bit_field, extract_bit_field_using_extv and
extract_fixed_bit_field.
(extract_fixed_bit_field): Add an op0_mode argument.  Update calls
to extract_split_bit_field and extract_fixed_bit_field_1.
(extract_fixed_bit_field_1): Add a mode argument.  Remove assertion
that OP0 has a scalar integer mode.  Use as_a <scalar_int_mode>
on the target mode.
(extract_split_bit_field): Add an op0_mode argument.  Update call
to extract_fixed_bit_field.

Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	2017-07-13 09:18:46.700153153 +0100
+++ gcc/expmed.c	2017-07-13 09:18:47.197114027 +0100
@@ -45,27 +45,31 @@ struct target_expmed default_target_expm
struct target_expmed *this_target_expmed = &default_target_expmed;
 #endif
 
-static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
+static void store_fixed_bit_field (rtx, opt_scalar_int_mode,
   unsigned HOST_WIDE_INT,
   unsigned HOST_WIDE_INT,
   unsigned HOST_WIDE_INT,
-  rtx, bool);
-static void store_fixed_bit_field_1 (rtx, unsigned HOST_WIDE_INT,
+  unsigned HOST_WIDE_INT,
+  rtx, scalar_int_mode, bool);
+static void store_fixed_bit_field_1 (rtx, scalar_int_mode,
+unsigned HOST_WIDE_INT,
 unsigned HOST_WIDE_INT,
-rtx, bool);
-static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
+rtx, scalar_int_mode, bool);
+static void store_split_bit_field (rtx, opt_scalar_int_mode,
   unsigned HOST_WIDE_INT,
   unsigned HOST_WIDE_INT,
   unsigned HOST_WIDE_INT,
-  rtx, bool);
-static rtx extract_fixed_bit_field (machine_mode, rtx,
+  unsigned HOST_WIDE_INT,
+  rtx, scalar_int_mode, bool);
+static rtx extract_fixed_bit_field (machine_mode, rtx, opt_scalar_int_mode,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, rtx, int, bool);
-static rtx extract_fixed_bit_field_1 (machine_mode, rtx,
+static rtx extract_fixed_bit_field_1 (machine_mode, rtx, scalar_int_mode,
  unsigned HOST_WIDE_INT,
  unsigned HOST_WIDE_INT, rtx, int, bool);
 static rtx lshift_value (machine_mode, unsigned HOST_WIDE_INT, int);
-static rtx extract_split_bit_field (rtx, unsigned HOST_WIDE_INT,
+static rtx extract_split_bit_field (rtx, opt_scalar_int_mode,
+   unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool);
 static void do_cmp_and_jump (rtx, rtx, enum rtx_code, machine_mode, 
rtx_code_label *);
 static rtx expand_smod_pow2 (machine_mode, rtx, HOST_WIDE_INT);
@@ -568,13 +572,16 @@ simple_mem_bitfield_p (rtx op0, unsigned
 }
 
 /* Try to use instruction INSV to store VALUE into a field of OP0.
-   BITSIZE and BITNUM are as for store_bit_field.  */
+   If OP0_MODE is defined, it is the mode of OP0, otherwise OP0 is a
+   BLKmode MEM.  
