date:20161202

Re: [PR middle-end/78548] fix latent double free in tree-ssa-uninit.c

2016-12-02 Thread Aldy Hernandez


On 12/01/2016 07:32 AM, Richard Biener wrote:

On Thu, Dec 1, 2016 at 12:03 PM, Aldy Hernandez  wrote:

This looks like a latent problem in the -Wmaybe-uninitialized code unrelated
to my work.

The problem here is a sequence of simplify_preds() followed by
normalize_preds() that I added, but is based on prior art all over the file:

+  simplify_preds (&uninit_preds, NULL, false);
+  uninit_preds = normalize_preds (uninit_preds, NULL, false);

The problem is that in this particular testcase we trigger simplify_preds_4
which makes a copy of a chain, frees the chain, and then tries to use the
chain later (in normalize_preds).  The normalize_preds() function tries to
free again the chain and we blow up:

This is the main problem in simplify_preds_4:

  /* Now clean up the chain.  */
  if (simplified)
{
  for (i = 0; i < n; i++)
{
  if ((*preds)[i].is_empty ())
continue;
  s_preds.safe_push ((*preds)[i]);
// 
// Makes a copy of the pred_chain.
}

  destroy_predicate_vecs (preds);
// ^^^
// free() all the pred_chain's.

  (*preds) = s_preds;
// ^^
// Wait a minute, we still keep a copy of the pred_chains.
  s_preds = vNULL;
}

I have no idea how this worked even before my patch.  Perhaps we never had a
simplify_preds() followed by a normalize_preds() where the simplification
involved a call to simplify_preds_4.

Interestingly enough, simplify_preds_2() has the exact same code, but with
the fix I am proposing. So this seems like an oversight.  Also, the fact
that the simplification in simplify_preds_2 is more common than the
simplification performed in simplify_preds_4 would suggest that
simplify_preds_4 was uncommon enough and probably wasn't been used in a
simplify_preds/normalize_preds combo.

Anyways... I've made some other cleanups to the code, but the main gist of
the entire patch is:

-  destroy_predicate_vecs (preds);
+  preds->release ();

That is, release preds, but don't free the associated memory with the
pred_chain's therein.

This patch is on top of my pending patch here:

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02900.html

Tested on x86-64 Linux.

OK for trunk?


Ok.


Thank you Richard.

Since this is dependent on the other patch I will wait until said patch 
is approved to commit.


Aldy

Re: [PATCH] correct handling of non-constant width and precision (pr 78521)

2016-12-02 Thread Rainer Orth

Hi Martin,

> PR 78521 notes that the gimple-ssa-sprintf pass doesn't do the right
> thing (i.e., the -Wformat-length and -fprintf-return-value options
> behave incorrectly) when a conversion specification includes a width
> or precision with a non-constant value.  The code treats such cases
> as if they were not provided which is incorrect and results in
> the wrong bytes counts in warning messages and in the wrong ranges
> being generated for such calls (or in the case sprintf(0, 0, ...)
> for some such calls being eliminated).
>
> The attached patch corrects the handling of these cases, plus a couple
> of other edge cases in the same area: it adjusts the parser to accept
> precision in the form of just a period with no asterisk or decimal
> digits after it (this sets the precision to zero), and corrects the
> handling of zero precision and zero argument in integer directives
> to produce no bytes on output.
>
> Finally, the patch also tightens up the constraint on the upper bound
> of bounded functions like snprintf to be INT_MAX.  The functions cannot
> produce output in excess of INT_MAX + 1 bytes and some implementations
> (e.g., Solaris) fail with EINVAL when the bound is INT_MAX or more.
> This is the subject of PR 78520.

this patch broke Solaris bootstrap:

/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c: In function 'void 
{anonymous}::get_width_and_precision(const {anonymous}::conversion_spec&, long 
long int*, long long int*)':
/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:777:45: error: call of 
overloaded 'abs(long long int)' is ambiguous
  width = abs (tree_to_shwi (spec.star_width));
 ^
/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:777:45: note: candidates 
are:
In file included from /usr/include/stdlib.h:12:0,
 from /vol/gcc/src/hg/trunk/local/gcc/system.h:258,
 from /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:49:
/usr/include/iso/stdlib_iso.h:205:16: note: long int std::abs(long int)
  inline long   abs(long _l) { return labs(_l); }
^
/usr/include/iso/stdlib_iso.h:160:12: note: int std::abs(int)
 extern int abs(int);
^

The following patch fixed this for me, but I've no idea if it's right.
It bootstrapped successfully on sparc-sun-solaris2.12,
i386-pc-solaris2.12, and x86_64-pc-linux-gnu.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2016-12-02  Rainer Orth  

* gimple-ssa-sprintf.c (get_width_and_precision): Use std::abs.

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -774,7 +774,7 @@ get_width_and_precision (const conversio
   if (spec.star_width)
 {
   if (TREE_CODE (spec.star_width) == INTEGER_CST)
-	width = abs (tree_to_shwi (spec.star_width));
+	width = std::abs (tree_to_shwi (spec.star_width));
   else
 	width = HOST_WIDE_INT_MIN;
 }

Re: [tree-tailcall] Check if function returns it's argument

2016-12-02 Thread Richard Biener

On Thu, 1 Dec 2016, Jeff Law wrote:

> On 12/01/2016 06:22 AM, Richard Biener wrote:
> > > Well after removing DECL_BY_REFERENCE, verify_gimple still fails but
> > > differently:
> > > 
> > > tail-padding1.C:13:1: error: RESULT_DECL should be read only when
> > > DECL_BY_REFERENCE is set
> > >  }
> > >  ^
> > > while verifying SSA_NAME nrvo_4 in statement
> > > # .MEM_3 = VDEF <.MEM_1(D)>
> > > nrvo_4 = __builtin_memset (nrvo_2(D), 0, 8);
> > > tail-padding1.C:13:1: internal compiler error: verify_ssa failed
> > 
> > Hmm, ok.  Not sure why we enforce this.
> I don't know either.  But I would start by looking at tree-nrv.c since it
> looks (based on the variable names) that the named-value-return optimization
> kicked in.
> 
> > 
> > Note that in the end this patch looks fishy -- iff we really need
> > the LHS on the assignment for correctness if we have the tailcall
> > flag set then what guarantees that later passes do not remove it
> > again?  So anybody removing a LHS would need to unset the tailcall flag?
> > 
> > Saying again that I don't know enough about the RTL part of tailcall
> > expansion.
> The LHS on the assignment makes it easier to identify when a tail call is
> possible.  It's not needed for correctness.  Not having the LHS on the
> assignment just means we won't get an optimized tail call.
> 
> Under what circumstances would the LHS possibly be removed?  We know the
> return statement references the LHS, so it's not going to be something that
> DCE will do.

Well, I thought Prathamesh added the optimization to copy-propagate
the lhs from the returned argument.  So we'd have both transforms here.

Of course as always the user could have written the code in this way.

If the LHS is not required for correctness then I don't think we need
to put it there - Pratamesh verified the call is tail-called already
if marked by the tailcall pass, even if the LHS is not present.

Richard.

Re: [PATCH] S/390: Fix litpool-r3-1.c.

2016-12-02 Thread Andreas Krebbel

On Wed, Nov 23, 2016 at 12:27:30PM +0100, Dominik Vogt wrote:
> gcc/ChangeLog-lp1
> 
>   * gcc.target/s390/litpool-r3-1.c: Fix label number test.

Applied. Thanks!

-Andreas-

Re: [PATCH v3] Do not simplify "(and (reg) (const bit))" to if_then_else.

2016-12-02 Thread Andreas Krebbel

On Thu, Dec 01, 2016 at 04:30:08PM +0100, Dominik Vogt wrote:
> gcc/ChangeLog
> 
>   * combine.c (combine_simplify_rtx):  Suppress replacement of
>   "(and (reg) (const_int bit))" with "if_then_else".

Applied. Thanks!

-Andreas-

Re: [PATCH 2/2 v3] PR77822

2016-12-02 Thread Andreas Krebbel

On Fri, Nov 25, 2016 at 09:25:59AM +0100, Dominik Vogt wrote:
> gcc/ChangeLogb
> 
>   PR target/77822
>   * config/s390/s390.md ("extzv")
>   ("*extzv")
>   ("*extzvdi_lshiftrt")
>   ("*_ior_and_sr_ze")
>   ("*extract1bitdi")
>   ("*insv", "*insv_rnsbg_noshift")
>   ("*insv_rnsbg_srl", "*insv_mem_reg")
>   ("*insvdi_mem_reghigh", "*insvdi_reg_imm"): Use EXTRACT_ARGS_IN_RANGE
>   to validate the arguments of zero_extract and sign_extract.
> gcc/testsuite/ChangeLogb
> 
>   PR target/77822
>   * gcc.target/s390/s390.exp: Support .cxx tests.
>   * gcc.target/s390/pr77822-2.c: New test.
>   * gcc.target/s390/pr77822-1.cxx: New test.

Applied with testcase .cxx renamed to .C.

-Andreas-

Re: [PATCH 1/2 v3] PR77822

2016-12-02 Thread Andreas Krebbel

On Fri, Nov 25, 2016 at 09:29:46AM +0100, Dominik Vogt wrote:
> gcc/ChangeLog
> 
>   PR target/77822
>   * rtl.h (EXTRACT_ARGS_IN_RANGE): New.

Applied.  Thanks!

-Andreas-

Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c

2016-12-02 Thread Richard Biener

On Thu, Dec 1, 2016 at 6:28 PM, Jeff Law  wrote:
> On 12/01/2016 05:04 AM, Segher Boessenkool wrote:
>>
>> On Thu, Dec 01, 2016 at 10:19:42AM +0100, Richard Biener wrote:
>
> Thinking about this again maybe targets w/o insn-length should simply
> always use the 'simple' algorithm instead of the STV one?  At least
> that
> might be what your change effectively does in some way?


 From reading the comments I don't think STC will collapse down into the
 simple algorithm if block copying is disabled.  But Segher would know
 for
 sure.

 WRT the choice of simple vs STC, I doubt it matters much for the
 processors
 in question.
>>>
>>>
>>> I guess STC doesn't make much sense if we can't say anything about BB
>>> sizes.
>>
>>
>> STC tries to make as large as possible consecutive "traces", mainly to
>> help with instruction cache utilization and hit rate etc.  It cannot do
>> a very good job if it isn't allowed to copy blocks.
>>
>> "simple" tries to (dynamically) have as many fall-throughs as possible,
>> i.e. as few jumps as possible.

I hope it special-cases backedges, that is, not make those fallthru.

>>  It never copies code; if that means it
>> has to jump every second insn, so be it.  It provably is within a factor
>> three of optimal (optimal is NP-hard), under a really weak assumption
>> within a factor two, and it does better than that in practice.
>>
>> STC without block copying makes longer traces which is not a good idea
>> for most architectures, only for those that have a short jump that is
>> much shorter than longer jumps (and thus, cannot cover many jump
>> targets).
>>
>> I do not know how STC behaves when it does not know the insn lengths.
>
> mn103 &  m68k are definitely sensitive to jump distances, the former more so
> than the latter.  Some of the others probably are as well.
>
> I think we've probably discussed this more than is really necessary.  We
> just need to pick an alternative and go with it, I think either alternative
> is reasonable (avoid copying when STC has no length information or fall back
> to simple when there is no length information).

The description of both algorithms doesn't make it an obvious pick.  So lets
leave the decision of the algorithm to the target and make STC beave sane
with no length information (whether that means disallowing any copying or
substituting a "default" length is another question -- but obviously it's the
targets fault to not provide lenght information).

Richard.

>
>
> jeff

RE: [PATCH] PR fortran/77505 -- Treat negative character length as LEN=0

2016-12-02 Thread Punnoose, Elizebeth

Thank you Steve.

Thanks,
Elizebeth

-Original Message-
From: Steve Kargl [mailto:s...@troutmask.apl.washington.edu] 
Sent: 02 December 2016 04:54
To: Punnoose, Elizebeth 
Cc: fort...@gcc.gnu.org; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] PR fortran/77505 -- Treat negative character length as 
LEN=0

On Wed, Nov 30, 2016 at 05:13:28AM +, Punnoose, Elizebeth wrote:
> Please excuse the messy formatting in my initial mail.  Resending with 
> proper formatting.
> 
> This patch checks for negative character length in the array 
> constructor, and treats it as LEN=0.
> 
> A warning message is also printed if bounds checking is enabled.
> 
> Bootstrapped and regression tested the patch on x86_64-linux-gnu and 
> aarch64-linux-gnu.
> 

Thanks.  After regression testing on x86_64-*-freebsd, I committed the attached 
patch.  Not sure if the whitespace got messed up by my email agent, but I 
needed to reformat your testcases.  I took the opportunity to rename and 
improve the testcases.  The improvements check that in fact len=0 and that a 
warning is issued. 

Hopefully, you're inclined to submit additional patches in the future.
A few recommendations are to include the text of your ChangeLog entry in body 
of the email, for example,

2016-12-01  Elizebeth Punnoose  

PR fortran/77505
* trans-array.c (trans_array_constructor): Treat negative character
length as LEN = 0.

2016-12-01  Elizebeth Punnoose  

PR fortran/77505
* gfortran.dg/char_length_20.f90: New test.
* gfortran.dg/char_length_21.f90: Ditto.

(Note, 2 spaces before and after your name.)  Then attach the patch to the 
email.  This hopefully will prevent formatting issues with various email 
clients/servers. 

--
Steve

[PATCH, alpha]: Fix invalid RTX sharing

2016-12-02 Thread Uros Bizjak

2016-12-02  Uros Bizjak  

* config/alpha/alpha.md (exception_receiver): Copy
alpha_gp_ave_rtx return value.

Bootstrapped and regression tested on alphaev68-linux-gnu, will commit soon.

Uros.
diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
index 3e4594b..0ed29de 100644
--- a/gcc/config/alpha/alpha.md
+++ b/gcc/config/alpha/alpha.md
@@ -5142,7 +5142,7 @@
   "TARGET_ABI_OSF"
 {
   if (flag_reorder_blocks_and_partition)
-operands[0] = alpha_gp_save_rtx ();
+operands[0] = copy_rtx (alpha_gp_save_rtx ());
   else
 operands[0] = const0_rtx;
 })

Re: [PATCH] have __builtin_object_size handle POINTER_PLUS with non-const offset (pr 77608)

2016-12-02 Thread Richard Biener

On Thu, Dec 1, 2016 at 6:58 PM, Martin Sebor  wrote:
>> Sure - but then you maybe instead want to check for op being in
>> range [0, max-of-signed-type-of-op] instead?  So similar to
>> expr_not_equal_to add a expr_in_range helper?
>>
>> Your function returns true for sizetype vars even if it might be
>> effectively signed, like for
>>
>>  sizetype i_1 = -4;
>>  i_2 = i_1 + 1;
>>
>> operand_unsigned_p (i) returns true.  I suppose you may have

(*)

>> meant
>>
>> +static bool
>> +operand_unsigned_p (tree op)
>> +{
>> +  if (TREE_CODE (op) == SSA_NAME)
>> +{
>> +  gimple *def = SSA_NAME_DEF_STMT (op);
>> +  if (is_gimple_assign (def))
>> +   {
>> + tree_code code = gimple_assign_rhs_code (def);
>> + if (code == NOP_EXPR
>>&& TYPE_PRECISION (TREE_TYPE (op)) > TYPE_PRECISION
>> (TREE_TYPE (gimple_assign_rhs1 (def
>>   return tree_expr_nonnegative_p (gimple_assign_rhs1 (def)));
>> +   }
>> +}
>> +
>> +  return false;
>> +}
>>
>> ?  because only if you do see a cast and that cast is widening from an
>> nonnegative number
>> the final one will be unsigned (if interpreted as signed number).
>
>
> I don't think this is what I want.  Here's a test case that works
> with my function but not with the suggested modification:
>
>char d[4];
>long f (unsigned long i)
>{
>  return __builtin_object_size (d + i + 1, 0);
>}
>
> Here, the size I'm looking for is (at most) 3 no matter what value
> i has.  Am I missing a case where my function will do the wrong
> thing?

See above (*)

I know what you are trying to do (retro-actively make value-ranges have
a split variable / constant part).  But I don't think that doing it the way you
do it is a reasonable maintainable way.  Others may disagree.

>> You might want to use offset_ints here (see mem_ref_offset for example)
>
>
> Okay, I'll see if I can switch to offset_int.
>
>>

 + gimple *def = SSA_NAME_DEF_STMT (off);
 + if (is_gimple_assign (def))
 +   {
 + tree_code code = gimple_assign_rhs_code (def);
 + if (code == PLUS_EXPR)
 +   {
 + /* Handle offset in the form VAR + CST where VAR's
 type
 +is unsigned so the offset must be the greater of
 +OFFRANGE[0] and CST.  This assumes the PLUS_EXPR
 +is in a canonical form with CST second.  */
 + tree rhs2 = gimple_assign_rhs2 (def);

 err, what?  What about overflow?  Aren't you just trying to decompose
 'off' into a variable and a constant part here and somehow extracting a
 range for the variable part?  So why not just do that?
>>>
>>>
>>>
>>> Sorry, what about overflow?
>>>
>>> The purpose of this code is to handle cases of the form
>>>
>>>& PTR [range (MIN, MAX)] + CST
>>
>>
>> what if MAX + CST overflows?
>
>
> The code doesn't look at MAX, only MIN is considered.  It extracts
> both but only actually uses MAX to see if it's dealing with a range
> or a constant.  Does that resolve your concern?

But if MAX overflows then it might be smaller than MIN and thus you
cannot conclude that [min, max] + CST is [min+CST, UNKNWON]
but rather it's [UNKNOWN, UNKNOWN] (if you disregard the actual
value of MAX).

>>>char d[7];
>>>
>>>#define bos(p, t) __builtin_object_size (p, t)
>>>
>>>long f (unsigned i)
>>>{
>>>  if (2 < i) i = 2;
>>>
>>>  char *p = &d[i] + 3;
>>>
>>>  return bos (p, 0);
>>>}
>>
>>
>> I'm sure that doesn't work as you match for PLUS_EXPR.
>
>
> Sorry, I'm not sure what you mean.

I mean that p = &d[i] + 3; is not represented as a PLUS_EXPR
but as a POINTER_PLUS_EXPR.  All pointer arithmetic is using
POINTER_PLUS_EXPR.

>  The above evaluates to 4 with
> the patch because i cannot be less than zero (otherwise &d[i] would
> be invalid/undefined) so the type-0 size we want (the maximum) is
> &d[7] - (&d[0] + 3) or 4.
>
>> Maybe simply ignore VR_ANTI_RANGEs for now then?
>
>
> Yes, that makes sense.
>
>>> The code above is based on the observation that an anti-range is
>>> often used to represent the full subrange of a narrower signed type
>>> like signed char (as ~[128, -129]).  I haven't been able to create
>>> an anti-range like ~[5, 9]. When/how would a range like that come
>>> about (so I can test it and implement the above correctly)?
>>
>>
>> if (a < 4)
>>   if (a > 8)
>> b = a;
>>
>> then b should have ~[5, 9]
>
>
> Right :)  I have figured out by know by know how to create an anti
> range in general.  What I meant is that I haven't had luck creating
> them in a way that the tree-object-size pass could see (I'm guessing
> because EVRP doesn't understand relational expressions).  So given
> this modified example from above:
>
> char d[9];
>
> #define bos(p, t) __builtin_object_size (p, t)
>
> long f (unsigned a)
> {
>unsigned b = 0;
>
>if (a < 4)
>  if (a > 8

[avr,committed]: Fix coding-style glitches in avr.c

2016-12-02 Thread Georg-Johann Lay


Committed rectifications for bunch of coding rule nits as obvious.

Johann

* config/avr/avr.c: Fix coding rule glitches.
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 243104)
+++ config/avr/avr.c	(working copy)
@@ -388,7 +388,7 @@ avr_parallel_insn_from_insns (rtx_insn *
If this is the case, fill in the insns from casesi to INSNS[1..5] and
the SImode extension to INSNS[0].  Moreover, extract the operands of
pattern casesi__sequence forged from the sequence to recog_data.  */
-   
+
 static bool
 avr_is_casesi_sequence (basic_block bb, rtx_insn *insn, rtx_insn *insns[6])
 {
@@ -702,7 +702,7 @@ avr_set_core_architecture (void)
   break;
 }
   else if (0 == strcmp (mcu->name, avr_mmcu)
-   // Is this a proper architecture ? 
+   // Is this a proper architecture ?
&& NULL == mcu->macro)
 {
   avr_arch = &avr_arch_types[mcu->arch_id];
@@ -1078,7 +1078,7 @@ avr_set_current_function (tree decl)
 
   if (!STR_PREFIX_P (name, "__vector"))
 warning_at (loc, OPT_Wmisspelled_isr, "%qs appears to be a misspelled "
-   "%s handler, missing __vector prefix", name, isr);
+"%s handler, missing __vector prefix", name, isr);
 }
 
   /* Don't print the above diagnostics more than once.  */
@@ -1163,7 +1163,7 @@ avr_regs_to_save (HARD_REG_SET *set)
   /* Don't record frame pointer registers here.  They are treated
  indivitually in prologue.  */
   && !(frame_pointer_needed
-   && (reg == REG_Y || reg == (REG_Y+1)
+   && (reg == REG_Y || reg == REG_Y + 1
 {
   if (set)
 SET_HARD_REG_BIT (*set, reg);
@@ -1374,7 +1374,7 @@ sequent_regs_live (void)
   else
 cur_seq = 0;
 
-  if (df_regs_ever_live_p (REG_Y+1))
+  if (df_regs_ever_live_p (REG_Y + 1))
 {
   ++live_seq;
   ++cur_seq;
@@ -1807,7 +1807,8 @@ avr_expand_prologue (void)
   avr_prologue_setup_frame (size, set);
 
   if (flag_stack_usage_info)
-current_function_static_stack_size = cfun->machine->stack_usage + INCOMING_FRAME_SP_OFFSET;
+current_function_static_stack_size
+  = cfun->machine->stack_usage + INCOMING_FRAME_SP_OFFSET;
 }
 
 
@@ -1840,9 +1841,9 @@ avr_asm_function_end_prologue (FILE *fil
  avr_outgoing_args_size());
 
   fprintf (file, "/* frame size = " HOST_WIDE_INT_PRINT_DEC " */\n",
- get_frame_size());
+   get_frame_size());
   fprintf (file, "/* stack size = %d */\n",
- cfun->machine->stack_usage);
+   cfun->machine->stack_usage);
   /* Create symbol stack offset here so all functions have it. Add 1 to stack
  usage for offset so that SP + .L__stack_offset = return address.  */
   fprintf (file, ".L__stack_usage = %d\n", cfun->machine->stack_usage);
@@ -2522,7 +2523,7 @@ avr_print_operand_address (FILE *file, m
   rtx x = addr;
   if (GET_CODE (x) == CONST)
 x = XEXP (x, 0);
-  if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x,1)))
+  if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1)))
 {
   /* Assembler gs() will implant word address.  Make offset
  a byte offset inside gs() for assembler.  This is
@@ -2532,14 +2533,14 @@ avr_print_operand_address (FILE *file, m
  from symbol which may not be what the user really wanted.  */
 
   fprintf (file, "gs(");
-  output_addr_const (file, XEXP (x,0));
+  output_addr_const (file, XEXP (x, 0));
   fprintf (file, "+" HOST_WIDE_INT_PRINT_DEC ")",
2 * INTVAL (XEXP (x, 1)));
   if (AVR_3_BYTE_PC)
 if (warning (0, "pointer offset from symbol maybe incorrect"))
   {
 output_addr_const (stderr, addr);
-fprintf(stderr,"\n");
+fprintf (stderr, "\n");
   }
 }
   else
@@ -2617,12 +2618,12 @@ avr_print_operand (FILE *file, rtx x, in
 }
   else if (code == 'E' || code == 'F')
 {
-  rtx op = XEXP(x, 0);
+  rtx op = XEXP (x, 0);
   fprintf (file, "%s", reg_names[REGNO (op) + ef]);
 }
   else if (code == 'I' || code == 'J')
 {
-  rtx op = XEXP(XEXP(x, 0), 0);
+  rtx op = XEXP (XEXP (x, 0), 0);
   fprintf (file, "%s", reg_names[REGNO (op) + ij]);
 }
   else if (REG_P (x))
@@ -2714,12 +2715,12 @@ avr_print_operand (FILE *file, rtx x, in
 }
   else if (GET_CODE (addr) == PLUS)
 {
-  avr_print_operand_address (file, VOIDmode, XEXP (addr,0));
+  avr_print_operand_address (file, VOIDmode, XEXP (addr, 0));
   if (REGNO (XEXP (addr, 0)) == REG_X)
 fatal_insn ("internal compiler error.  Bad

Re: [patch,testsuite,avr]: Filter-out -mmcu= from options for tests that set -mmcu=

2016-12-02 Thread Georg-Johann Lay


On 01.12.2016 17:40, Mike Stump wrote:

On Dec 1, 2016, at 3:54 AM, Georg-Johann Lay  wrote:


This patch moves the compile tests that have a hard coded -mmcu=MCU in their 
dg-options to a new folder.

The exp driver filters out -mmcu= from the command line options that are 
provided by, say, board description files or --tool-opts.

This is needed because otherwise conflicting -mmcu= will FAIL respective test cases 
because of "specified option '-mmcu' more than once" errors from avr-gcc.

Ok for trunk?


So, it would be nice if different ports can use roughly similar schemes to 
handle the same problems.  I think arm is one of the more complex ports at this 
point in this regard with a lot of people and a lot of years time to 
contemplate and implement solutions to the problem.  They in particular don't 
have to move test cases around to handle the difference like this, I think it 
would be best to avoid that requirement if possible.

Glancing around, two starting points for how the arm achieves what it does:

  lappend dg_runtest_extra_prunes "warning: switch -m(cpu|arch)=.* conflicts with 
-m(cpu|arch)=.* switch"

in arm.exp, and they use something like:

/* { dg-require-effective-target arm_crypto_ok } */ 
|crypto-vsha256hq_u32.c:2:/* { dg-require-effective-target 
arm_crypto_ok } */
/* { dg-add-options arm_crypto } */ 
|crypto-vsha256su0q_u32.c:2:/* { 
dg-require-effective-target arm_crypto_ok } */

to validate the requirements of the test case, and to ensure that
optional things are selected.  Nice, simple, extensible, handles
multilibs, dejagnu arguments and different cpu defaults as I recall.

You won't need all the hair the arm folks have, but if you stub in
support in that direction, you then have simple, easy expansion room
to match all complexities that the arm folks have already hit and
solved.


I tried this approach, but it does not work as expected.

Notice that avr-gcc throws an error if conflicting -mmcu options are 
supplied.  Pruning the output will make some tests "PASS", but the 
compiler didn't actually do anything but producing an error message...


And one test FAILs because it should produce some specific diagnostic, 
but again the compiler just throws a different error, the output is 
pruned and the expected message is missing, hence fail.


Also one test case is for ATmega8, but you won't run the whole test 
suite against ATmega8 because that device is too restricted to give 
reasonable results...  Hence for some tests -mmcu=atmega8 is added by hand.


Johann

Re: [PATCH] remove invalid "tail +16c"

2016-12-02 Thread ma . jiang

On 11/22/2016 11:22 PM, ma.ji...@zte.com.cn wrote:
> Hi all,
>   In "config/acx.m4", there are still some "tail +16c"  which are 
invalid
> on POSIX systems.
>   In my opinion， all "tail +16c" should be changed to "tail -c +16"
> directly, as most systems has accept the latter.
>   And, to skip first 16 bytes, we should use "tail -c +17" instead of
> "tail -c +16".
>
>  * config/acx.m4:Change "tail +16c" to "tail -c +17".
>  * configure: Regenerate.
>Thanks.  I've installed this on the trunk after bootstrap & regression 
>testing on x86_64-linux-gnu.
>
>jeff
>
>Thanks
Hi Jeff,
  Thanks for your attention. I can now close the bug in 
binuitls---https://sourceware.org/bugzilla/show_bug.cgi?id=20823.

[Committed] S/390: Fix RTL sharing when generating reg note.

2016-12-02 Thread Andreas Krebbel

gcc/ChangeLog:

2016-12-02  Andreas Krebbel  

* config/s390/s390.c (s390_save_gprs_to_fprs): Fix RTL sharing
problem.
---
 gcc/ChangeLog  | 5 +
 gcc/config/s390/s390.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a0cefa7..03387cf 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2016-12-02  Andreas Krebbel  
+
+   * config/s390/s390.c (s390_save_gprs_to_fprs): Fix RTL sharing
+   problem.
+
 2016-12-02  Georg-Johann Lay  
 
* config/avr/avr-arch.h (avr_mcu_t) [n_flash]: Remove field.
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 767666e..030e10d 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -10666,7 +10666,7 @@ s390_save_gprs_to_fprs (void)
  /* This prevents dwarf2cfi from interpreting the set.  Doing
 so it might emit def_cfa_register infos setting an FPR as
 new CFA.  */
- add_reg_note (insn, REG_CFA_REGISTER, PATTERN (insn));
+ add_reg_note (insn, REG_CFA_REGISTER, copy_rtx (PATTERN (insn)));
}
 }
 }
-- 
2.9.1

Re: [PATCH GCC]Simplify (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2)

2016-12-02 Thread Bin.Cheng

On Wed, Nov 30, 2016 at 3:10 PM, Richard Biener
 wrote:
> On Fri, Nov 18, 2016 at 5:53 PM, Bin Cheng  wrote:
>> Hi,
>> This is a rework of https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02007.html.
>> Though review comments suggested it could be merged with last kind 
>> simplification
>> of fold_cond_expr_with_comparison, it's not really applicable.  As a matter 
>> of fact,
>> the suggestion stands for patch 
>> @https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02005.html.
>> I had previous patch 
>> (https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01898.html)
>> moving fold_cond_expr_with_comparison to match.pd pattern and incorporated
>> that patch into it.  For this one, the rework is trivial, just renames 
>> several variable
>> tags as suggested.  Bootstrap and test on x86_64 and AArch64, is it OK?
>
> + A) Operand x is a unsigned to signed type conversion and c1 is
> +   integer zero.  In this case,
> + (signed type)x  < 0  <=>  x  > MAX_VAL(signed type)
> + (signed type)x >= 0  <=>  x <= MAX_VAL(signed type)
>
> for (singed type)x < 0 -> x > signed-type-max we probably do a reverse
> "canonicalization" transform?  Yeah,
>
> /* Non-equality compare simplifications from fold_binary  */
> (for cmp (lt gt le ge)
> ...
>  (if (wi::eq_p (@1, signed_max)
>   && TYPE_UNSIGNED (arg1_type)
>   /* We will flip the signedness of the comparison operator
>  associated with the mode of @1, so the sign bit is
>  specified by this mode.  Check that @1 is the signed
>  max associated with this sign bit.  */
>   && prec == GET_MODE_PRECISION (TYPE_MODE (arg1_type))
>   /* signed_type does not work on pointer types.  */
>   && INTEGRAL_TYPE_P (arg1_type))
>   /* The following case also applies to X < signed_max+1
>  and X >= signed_max+1 because previous transformations.  */
>   (if (cmp == LE_EXPR || cmp == GT_EXPR)
>(with { tree st = signed_type_for (arg1_type); }
> (if (cmp == LE_EXPR)
>  (ge (convert:st @0) { build_zero_cst (st); })
>  (lt (convert:st @0) { build_zero_cst (st); }))
>
> +   if (cmp_code == GE_EXPR)
> + cmp_code = LE_EXPR;
> +   c1 = wide_int_to_tree (op_type, wi::max_value (to_type));
> + }
> ...
> +   if (op == PLUS_EXPR)
> + real_c1 = wide_int_to_tree (op_type,
> + wi::sub (c3, c2, sgn, &overflow));
> +   else
> + real_c1 = wide_int_to_tree (op_type,
> + wi::add (c3, c2, sgn, &overflow));
>
> can you avoid the tree building here and just continue using wide-ints please?
> Simply do the wide_int_to_tree in the result patterns.
Hi,
I updated patch wrto your comments, also deleted two useless
variables.  Bootstrap and test, is it OK?

Thanks,
bin

2016-12-01  Bin Cheng  

* match.pd: Add new pattern:
(cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2).

gcc/testsuite/ChangeLog
2016-12-01  Bin Cheng  

* gcc.dg/fold-bopcond-1.c: New test.
* gcc.dg/fold-bopcond-2.c: New test.

>
> Otherwise looks ok to me.
>
> Thanks,
> Richard.
>
>
>> Thanks,
>> bin
>>
>> 2016-11-17  Bin Cheng  
>>
>> * match.pd: Add new pattern:
>> (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2).
>>
>> gcc/testsuite/ChangeLog
>> 2016-11-17  Bin Cheng  
>>
>> * gcc.dg/fold-bopcond-1.c: New test.
>> * gcc.dg/fold-bopcond-2.c: New test.
diff --git a/gcc/match.pd b/gcc/match.pd
index bc8a5e7..dbb9103 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2038,6 +2038,106 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (convert (cond (eq @1 (convert @3))
 (convert:from_type @3) (convert:from_type @2)
 
+/* (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2) if:
+
+ 1) OP is PLUS or MINUS.
+ 2) CMP is LT, LE, GT or GE.
+ 3) C3 == (C1 op C2), and computation doesn't have undefined behavior.
+
+   This pattern also handles special cases like:
+
+ A) Operand x is a unsigned to signed type conversion and c1 is
+   integer zero.  In this case,
+ (signed type)x  < 0  <=>  x  > MAX_VAL(signed type)
+ (signed type)x >= 0  <=>  x <= MAX_VAL(signed type)
+ B) Const c1 may not equal to (C3 op' C2).  In this case we also
+   check equality for (c1+1) and (c1-1) by adjusting comparison
+   code.
+
+   TODO: Though signed type is handled by this pattern, it cannot be
+   simplified at the moment because C standard requires additional
+   type promotion.  In order to match&simplify it here, the IR needs
+   to be cleaned up by other optimizers, i.e, VRP.  */
+(for op (plus minus)
+ (for cmp (lt le gt ge)
+  (simplify
+   (cond (cmp (convert? @X) INTEGER_CST@1) (op @X INTEGER_CST@2) INTEGER_CST@3)
+   (with { tree from_type = TREE_TYPE (@X), to_type = TREE_TYPE (@1); }
+(if (types_match (from_type, to_type

[PATCH] EVRP edge info refactoring and simple improvement

2016-12-02 Thread Richard Biener


The following refactors range extraction from edges and makes EVRP
able to merge such edge based ranges for the case of multiple 
predecessors.  This allows it to catch anti-ranges from
if (a < 4 || a > 8) if that is not re-written to a single test like
when using gotos.

I don't expect this to catch very much in practice but the refactoring
means we can incorporate the tricks from register_edge_assert_for
more easily (we "simply" have to use extract_ranges_from_edge as the
one-and-true kind-of interface).

Motivated by Martin Sebors work on b_o_s and the appearant inability
of EVRP to do anything useful for him.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-12-02  Richard Biener  

* tree-vrp.c (evrp_dom_walker::extract_ranges_from_edge): New method
split out from evrp_dom_walker::before_dom_children.
(evrp_dom_walker::before_dom_children): Use it and implement merging
of edge range info from multiple predecessors.

* gcc.dg/tree-ssa/evrp7.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/evrp7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/evrp7.c
new file mode 100644
index 000..c28347d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/evrp7.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fgimple -fdump-tree-evrp" } */
+
+void __GIMPLE (startwith("evrp"))
+f (unsigned int a)
+{
+  if (a < 4U)
+goto x;
+  else
+goto bb_3;
+
+bb_3:
+  if (a > 8U)
+goto x;
+  else
+goto bb_4;
+
+x:
+  if (a == 5U)
+goto bb_5;
+  else
+goto bb_4;
+
+bb_5:
+  __builtin_abort ();
+
+bb_4:
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "evrp" 0 "if" } } */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 600634d..592d3b0 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -10700,6 +10700,7 @@ public:
   void push_value_range (tree var, value_range *vr);
   value_range *pop_value_range (tree var);
   value_range *try_find_new_range (tree op, tree_code code, tree limit);
+  void extract_ranges_from_edge (edge, vec > &);
 
   /* Cond_stack holds the old VR.  */
   auto_vec > stack;
@@ -10736,13 +10737,69 @@ evrp_dom_walker::try_find_new_range (tree op, 
tree_code code, tree limit)
   return NULL;
 }
 
+/* Extract ranges on E and store them into V.  */
+
+void
+evrp_dom_walker::extract_ranges_from_edge (edge e,
+  vec > & v)
+{
+  gimple *stmt = last_stmt (e->src);
+  tree op0;
+  if (stmt
+  && gimple_code (stmt) == GIMPLE_COND
+  && (op0 = gimple_cond_lhs (stmt))
+  && TREE_CODE (op0) == SSA_NAME
+  && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)))
+ || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "Visiting controlling predicate ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+   }
+  /* Entering a new scope.  Try to see if we can find a VR
+here.  */
+  tree op1 = gimple_cond_rhs (stmt);
+  tree_code code = gimple_cond_code (stmt);
+
+  if (TREE_OVERFLOW_P (op1))
+   op1 = drop_tree_overflow (op1);
+
+  /* If condition is false, invert the cond.  */
+  if (e->flags & EDGE_FALSE_VALUE)
+   code = invert_tree_comparison (gimple_cond_code (stmt),
+  HONOR_NANS (op0));
+  /* Add VR when (OP0 CODE OP1) condition is true.  */
+  value_range *op0_range = try_find_new_range (op0, code, op1);
+
+  /* Register ranges for y in x < y where
+y might have ranges that are useful.  */
+  tree limit;
+  tree_code new_code;
+  if (TREE_CODE (op1) == SSA_NAME
+ && extract_code_and_val_from_cond_with_ops (op1, code,
+ op0, op1,
+ false,
+ &new_code, &limit))
+   {
+ /* Add VR when (OP1 NEW_CODE LIMIT) condition is true.  */
+ value_range *op1_range = try_find_new_range (op1, new_code, limit);
+ if (op1_range)
+   v.safe_push (std::make_pair (op1, op1_range));
+   }
+
+  if (op0_range)
+   v.safe_push (std::make_pair (op0, op0_range));
+}
+}
+
 /* See if there is any new scope is entered with new VR and set that VR to
ssa_name before visiting the statements in the scope.  */
 
 edge
 evrp_dom_walker::before_dom_children (basic_block bb)
 {
-  tree op0 = NULL_TREE;
   edge_iterator ei;
   edge e;
 
@@ -10768,53 +10825,10 @@ evrp_dom_walker::before_dom_children (basic_block bb)
 }
   if (pred_e)
 {
-  gimple *stmt = last_stmt (pred_e->src);
-  if (stmt
- && gimple_code (stmt) == GIMPLE_COND
- && (op0 = gimple_cond_lhs (stmt))
- && TREE_CODE (op0) == SSA_NAME
- && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt)))
- || POINTER_TYPE_P (TREE_TYPE (gimple_cond_l

[PR middle-end/78328] [committed] wrong wording for unbounded alloca case

2016-12-02 Thread Aldy Hernandez

The problem here is that as we follow the cast from an unsigned int to 
__SIZE_TYPE__, we ignore the VR_ANTI_RANGE of 7 exhibited by the test in 
the PR:


+void g (unsigned int n)
+{
+  if (n == 7)
+n = 11;
+  f (__builtin_alloca (n));
+}

Since we can't get any meaningful information from VR_ANTI_RANGE as we 
drill down to a cast, the appropriate thing is to drop it without 
assuming it has a range.  This was as oversight in not handling 
VR_ANTI_RANGE gracefully, and I'll go on a limb here and say that 
attached patch is obvious.


Committed to trunk.
commit 59256dfdee704d08bcd20f0576abd3353f5767cc
Author: Aldy Hernandez 
Date:   Fri Dec 2 04:42:26 2016 -0500

PR middle-end/78328
* gimple-ssa-warn-alloca.c (alloca_call_type): Handle
VR_ANTI_RANGE.

diff --git a/gcc/gimple-ssa-warn-alloca.c b/gcc/gimple-ssa-warn-alloca.c
index e75f2fa..ae379f9 100644
--- a/gcc/gimple-ssa-warn-alloca.c
+++ b/gcc/gimple-ssa-warn-alloca.c
@@ -339,6 +339,8 @@ alloca_call_type (gimple *stmt, bool is_vla, tree 
*invalid_casted_type)
{
  // Fall through.
}
+ else if (range_type == VR_ANTI_RANGE)
+   return alloca_type_and_limit (ALLOCA_UNBOUNDED);
  else if (range_type != VR_VARYING)
return
  alloca_type_and_limit (ALLOCA_BOUND_MAYBE_LARGE, max);
diff --git a/gcc/testsuite/gcc.dg/Walloca-12.c 
b/gcc/testsuite/gcc.dg/Walloca-12.c
new file mode 100644
index 000..5d71cda
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Walloca-12.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-Walloca-larger-than=128 -O2" } */
+
+void f (void*);
+
+void g (unsigned int n)
+{
+  if (n == 7)
+n = 11;
+  f (__builtin_alloca (n)); /* { dg-warning "unbounded use of 'alloca'" } */
+}

[PATCH, Fortran, v1] Fix deallocation of nested derived typed components

2016-12-02 Thread Andre Vehreschild

Hi all,

attached patch fixes on ICE, when freeing a scalar allocatable component in a
derived typed coarray.

Furthermore does it fix freeing of nested derived typed allocatable components.
A simple code explains the bug that is solved by the patch:

type inner
  integer, allocatable :: i
end type
type outer
  type(inner), allocatable :: link
end type

type(outer), allocatable :: obj

allocate(obj)
allocate(obj%link)
allocate(obj%link%i)

deallocate(obj%link)
deallocate(obj) ! <- this will generate pseudo-pseudo-code of the kind:

if (obj.link.i != 0)  // But link is already NULL, i.e. a crash occurs.
  free(obj.link.i)

The patch fixes this by moving the code for freeing link.i into the check if
link is allocated, i.e.:

if (obj.link != 0) {
  if (obj.link.i != 0)  {
free (obj.link.i);
obj.link.i = 0;
  }
  free (obj.link);
  obj.link = 0;
}

Furthermore does the patch ensure that the handle of an allocatable component is
set to 0.

Bootstraped and regtested ok on x86_64-linux/F23. Ok for trunk?

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/fortran/ChangeLog:

2016-12-02  Andre Vehreschild  

* trans-array.c (structure_alloc_comps): Restructure deallocation of
(nested) allocatable components.  Insert dealloc of sub-component into
the block guarded by the if != NULL for the component.
* trans.c (gfc_deallocate_with_status): Allow deallocation of scalar
and arrays as well as coarrays.
(gfc_deallocate_scalar_with_status): Get the data member for coarrays
only when freeing an array with descriptor.
* trans.h: Change prototype of gfc_deallocate_with_status to allow
adding statements into the block guarded by the if (pointer != 0) and
supply a coarray handle.

gcc/testsuite/ChangeLog:

2016-12-02  Andre Vehreschild  

* gfortran.dg/coarray_alloc_comp_3.f08: New test.
* gfortran.dg/finalize_18.f90: Add count for additional guard against
accessing null-pointer.

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index ac90a4b..a5deb42 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -8157,8 +8157,11 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
   tree null_cond = NULL_TREE;
   tree add_when_allocated;
   tree dealloc_fndecl;
-  bool called_dealloc_with_status;
+  tree caf_token;
   gfc_symbol *vtab;
+  int caf_dereg_mode;
+  symbol_attribute *attr;
+  bool deallocate_called;
 
   gfc_init_block (&fnblock);
 
@@ -8265,7 +8268,8 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
   bool cmp_has_alloc_comps = (c->ts.type == BT_DERIVED
   || c->ts.type == BT_CLASS)
 && c->ts.u.derived->attr.alloc_comp;
-  bool same_type = c->ts.type == BT_DERIVED && der_type == c->ts.u.derived;
+  bool same_type = (c->ts.type == BT_DERIVED && der_type == c->ts.u.derived)
+	|| (c->ts.type == BT_CLASS && der_type == CLASS_DATA (c)->ts.u.derived);
 
   cdecl = c->backend_decl;
   ctype = TREE_TYPE (cdecl);
@@ -8274,112 +8278,120 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
 	{
 	case DEALLOCATE_ALLOC_COMP:
 
-	  /* gfc_deallocate_scalar_with_status calls gfc_deallocate_alloc_comp
-	 (i.e. this function) so generate all the calls and suppress the
-	 recursion from here, if necessary.  */
-	  called_dealloc_with_status = false;
 	  gfc_init_block (&tmpblock);
 
-	  if ((c->ts.type == BT_DERIVED && !c->attr.pointer)
-	  || (c->ts.type == BT_CLASS && !CLASS_DATA (c)->attr.class_pointer))
-	{
-	  comp = fold_build3_loc (input_location, COMPONENT_REF, ctype,
-  decl, cdecl, NULL_TREE);
+	  comp = fold_build3_loc (input_location, COMPONENT_REF, ctype,
+  decl, cdecl, NULL_TREE);
 
-	  /* The finalizer frees allocatable components.  */
-	  called_dealloc_with_status
-		= gfc_add_comp_finalizer_call (&tmpblock, comp, c,
-	   purpose == DEALLOCATE_ALLOC_COMP
-	   && caf_enabled (caf_mode));
-	}
+	  /* Shortcut to get the attributes of the component.  */
+	  if (c->ts.type == BT_CLASS)
+	attr = &CLASS_DATA (c)->attr;
 	  else
-	comp = NULL_TREE;
+	attr = &c->attr;
 
-	  if (c->attr.allocatable && !c->attr.proc_pointer && !same_type
-	  && (c->attr.dimension
-		  || (caf_enabled (caf_mode)
-		  && (caf_in_coarray (caf_mode) || c->attr.codimension
-	{
-	  /* Allocatable arrays or coarray'ed components (scalar or
-		 array).  */
-	  int caf_dereg_mode
-		  = (caf_in_coarray (caf_mode) || c->attr.codimension)
-		  ? (gfc_caf_is_dealloc_only (caf_mode)
-		 ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
-		 : GFC_CAF_COARRAY_DEREGISTER)
-		  : GFC_CAF_COARRAY_NOCOARRAY;
-	  if (comp == NULL_TREE)
-		comp = fold_build3_loc (input_location, COMPONENT_REF, ctype,
-	decl, cdecl, NULL_TREE);
+	  if ((c->ts.type == BT_DERIVED && !c->attr.pointer)
+	 || (c->ts.type == BT_CLASS && !CLASS_DATA (c)->attr.cla

Re: [RFC][PATCH] Speed-up use-after-scope (re-writing to SSA)

2016-12-02 Thread Richard Biener

On Thu, Dec 1, 2016 at 5:30 PM, Martin Liška  wrote:
> On 11/23/2016 03:13 PM, Jakub Jelinek wrote:
>> On Wed, Nov 23, 2016 at 02:57:07PM +0100, Martin Liška wrote:
>>> I started review process in libsanitizer: https://reviews.llvm.org/D26965
>>> And I have a question that was asked in the review: can we distinguish 
>>> between load and store
>>> in case of having usage of ASAN_POISON?
>>
>> I think with ASAN_POISON it is indeed just loads from after scope that can
>> be caught, a store overwrites the variable with a new value and when turning
>> the store after we make the var no longer addressable into SSA form, we
>> loose information about the out of scope store.  Furthermore, if there is
>> first a store and then a read, like:
>>   if (argc != 12312)
>> {
>>   char my_char;
>>   ptr = &my_char;
>> }
>>   *ptr = i + 26;
>>   return *ptr;
>> we don't notice even the read.  Not sure what could be done against that
>> though.  I think we'd need to hook into the into-ssa framework, there it
>> should know the current value of the variable at the point of the store is
>> result of ASAN_POISON and be able to instead of turning that
>>   my_char = _23;
>> into
>>   my_char_35 = _23;
>> turn it into:
>>   my_char_35 = ASAN_POISON (_23);
>> which would represent after scope store into my_char.
>>
>> Not really familiar with into-ssa though to know where to do it.
>>
>>   Jakub
>>
>
> Richi, may I ask you for help with this question?

Probably where we handle the CLOBBER case (rewrite_stmt, maybe_register_def),
we do this for -Wuninitialized.

Richard.

> Thanks,
> Martin
>

Re: [PATCH] S/390: Fix setmem-long test.

2016-12-02 Thread Andreas Krebbel

On Thu, Dec 01, 2016 at 03:47:22PM +0100, Dominik Vogt wrote:
> gcc/testsuite/ChangeLog-setmem-long-test
> 
>   * gcc.target/s390/md/setmem_long-1.c: Fix test.

Applied. Thanks!

-Andreas-

Re: [PING] (v2) Add a "compact" mode to print_rtx_function

2016-12-02 Thread Andreas Krebbel

On Thu, Dec 01, 2016 at 01:27:55PM +0100, Bernd Schmidt wrote:
> On 12/01/2016 11:12 AM, Dominik Vogt wrote:
...
> >I'd like to get our test failure fixed, either by changing
> >print-rtl.c or our test case.  Is the above patch good for trunk?
> >It does fix the s390 test failure.
> 
> I still don't see a strong reason not to print the quotes, so I'd
> suggest changing the testcase.

Ok. I've just committed
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00084.html

Bye,

-Andreas-

[PATCH] formatting nit

2016-12-02 Thread Nathan Sidwell


Another formatting nit I noticed.  Committed.

nathan
--
Nathan Sidwell
2016-12-02  Nathan Sidwell  

	* diagnostic.c (diagnostic_report_diagnostic): Remove extraneous
	braces.

Index: diagnostic.c
===
--- diagnostic.c	(revision 243176)
+++ diagnostic.c	(working copy)
@@ -834,9 +834,7 @@ diagnostic_report_diagnostic (diagnostic
  -Wno-error=*.  */
   if (context->warning_as_error_requested
   && diagnostic->kind == DK_WARNING)
-{
-  diagnostic->kind = DK_ERROR;
-}
+diagnostic->kind = DK_ERROR;
 
   if (diagnostic->option_index
   && diagnostic->option_index != permissive_error_option (context))

[PATCH] fix -fmax-errors & notes, take 2

2016-12-02 Thread Nathan Sidwell


Hi,
this respin of my notes patch from October 
(https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00706.html) addresses the 
fortran problems encountered.


I introduced a new function and call it from the fortran error machinery 
at an appropriate point.


ok?

nathan
--
Nathan Sidwell
2016-12-02  Nathan Sidwell  

	gcc/
	* diagnostic.c (diagnostic_check_max_errors): New, broken out of ...
	(diagnostic_action_after_output): ... here.
	(diagnostic_report_diagnostic): Call it for non-notes.
	* diagnostic.h (diagnostic_check_max_errors): Declare.

	gcc/fortran/
	* error.c (gfc_warning_check): Call diagnostic_check_max_errors.
	(gfc_error_check): Likewise.

	gcc/testsuite/
	* c-c++-common/fmax_errors.c: Check notes after last error are
	emitted.

Index: gcc/diagnostic.c
===
--- gcc/diagnostic.c	(revision 243115)
+++ gcc/diagnostic.c	(working copy)
@@ -446,6 +446,29 @@ bt_err_callback (void *data ATTRIBUTE_UN
 	   errnum == 0 ? "" : xstrerror (errnum));
 }
 
+/* Check if we've met the maximum error limit.  */
+
+void
+diagnostic_check_max_errors (diagnostic_context *context, bool flush)
+{
+  if (!context->max_errors)
+return;
+
+  int count = (diagnostic_kind_count (context, DK_ERROR)
+	   + diagnostic_kind_count (context, DK_SORRY)
+	   + diagnostic_kind_count (context, DK_WERROR));
+
+  if (count >= (int) context->max_errors)
+{
+  fnotice (stderr,
+	   "compilation terminated due to -fmax-errors=%u.\n",
+	   context->max_errors);
+  if (flush)
+	diagnostic_finish (context);
+  exit (FATAL_EXIT_CODE);
+}
+}
+
 /* Take any action which is expected to happen after the diagnostic
is written out.  This function does not always return.  */
 void
@@ -470,18 +493,6 @@ diagnostic_action_after_output (diagnost
 	  diagnostic_finish (context);
 	  exit (FATAL_EXIT_CODE);
 	}
-  if (context->max_errors != 0
-	  && ((unsigned) (diagnostic_kind_count (context, DK_ERROR)
-			  + diagnostic_kind_count (context, DK_SORRY)
-			  + diagnostic_kind_count (context, DK_WERROR))
-	  >= context->max_errors))
-	{
-	  fnotice (stderr,
-		   "compilation terminated due to -fmax-errors=%u.\n",
-		   context->max_errors);
-	  diagnostic_finish (context);
-	  exit (FATAL_EXIT_CODE);
-	}
   break;
 
 case DK_ICE:
@@ -892,6 +901,9 @@ diagnostic_report_diagnostic (diagnostic
 	return false;
 }
 
+  if (diagnostic->kind != DK_NOTE)
+diagnostic_check_max_errors (context);
+
   context->lock++;
 
   if (diagnostic->kind == DK_ICE || diagnostic->kind == DK_ICE_NOBT)
Index: gcc/diagnostic.h
===
--- gcc/diagnostic.h	(revision 243115)
+++ gcc/diagnostic.h	(working copy)
@@ -320,6 +320,7 @@ void default_diagnostic_start_span_fn (d
 void default_diagnostic_finalizer (diagnostic_context *, diagnostic_info *);
 void diagnostic_set_caret_max_width (diagnostic_context *context, int value);
 void diagnostic_action_after_output (diagnostic_context *, diagnostic_t);
+void diagnostic_check_max_errors (diagnostic_context *, bool flush = false);
 
 void diagnostic_file_cache_fini (void);
 
Index: gcc/fortran/error.c
===
--- gcc/fortran/error.c	(revision 243115)
+++ gcc/fortran/error.c	(working copy)
@@ -1226,6 +1226,7 @@ gfc_warning_check (void)
   diagnostic_action_after_output (global_dc,
   warningcount_buffered
   ? DK_WARNING : DK_ERROR);
+  diagnostic_check_max_errors (global_dc, true);
 }
 }
 
@@ -1370,6 +1371,7 @@ gfc_error_check (void)
   gcc_assert (gfc_output_buffer_empty_p (pp_error_buffer));
   pp->buffer = tmp_buffer;
   diagnostic_action_after_output (global_dc, DK_ERROR);
+  diagnostic_check_max_errors (global_dc, true);
   return true;
 }
 
Index: gcc/testsuite/c-c++-common/fmax-errors.c
===
--- gcc/testsuite/c-c++-common/fmax-errors.c	(revision 243115)
+++ gcc/testsuite/c-c++-common/fmax-errors.c	(working copy)
@@ -1,11 +1,21 @@
 /* PR c/44782 */
 /* { dg-do compile } */
-/* { dg-options "-fmax-errors=3" } */
+/* { dg-options "-fmax-errors=3 -Wall" } */
 
 void foo (unsigned int i, unsigned int j)
 {
   (i) ();			/* { dg-error "" } */
   (j) ();			/* { dg-error "" } */
-  (i+j) ();			/* { dg-error "" } */
+
+  i + j; /* { dg-warning "" }  */
+
+  (k) ();			/* { dg-error "" } */
+  /* Make sure we see the notes related to the final error we emit.  */
+  /* { dg-message "identifier" "" { target c } 12 } */
+
+  /* Warnings after the final error should not appear.  */
+  i + j; /* no warning.  */
+
   (i*j) ();			/* no error here due to -fmax-errors */
+
 } /* { dg-prune-output "compilation terminated" } */

Re: [PATCH] EVRP edge info refactoring and simple improvement

2016-12-02 Thread Richard Biener

On Fri, 2 Dec 2016, Richard Biener wrote:

> 
> The following refactors range extraction from edges and makes EVRP
> able to merge such edge based ranges for the case of multiple 
> predecessors.  This allows it to catch anti-ranges from
> if (a < 4 || a > 8) if that is not re-written to a single test like
> when using gotos.
> 
> I don't expect this to catch very much in practice but the refactoring
> means we can incorporate the tricks from register_edge_assert_for
> more easily (we "simply" have to use extract_ranges_from_edge as the
> one-and-true kind-of interface).

Like the following, preliminary testing shows

FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate 
minv_[0-9]* == 5 to 0"
FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate 
minv_[0-9]* == 6 to 0"
FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate 
maxv_[0-9]* == 5 to 0"
FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate 
maxv_[0-9]* == 6 to 0"
FAIL: gcc.dg/tree-ssa/vrp35.c scan-tree-dump vrp1 "Removing dead stmt 
[^\r\n]* = j_.* == 10"
FAIL: gcc.dg/tree-ssa/vrp36.c scan-tree-dump vrp1 "Removing dead stmt 
[^\r\n]* = i_.* == 1"

so more things are done by EVRP.  More testing next week because
I expected more VRP testcase fallout.  But as expected this allows
for the single-test if (a < 4 || a > 8) variant.

Richard.

2016-12-02  Richard Biener  

* tree-vrp.c (assert_info): New struct.
(add_assert_info): New helper.
(register_edge_assert_for_2): Refactor to add asserts to a vector
of assert_info.
(register_edge_assert_for_1): Likewise.
(register_edge_assert_for): Likewise.
(finish_register_edge_assert_for): New helper actually registering
asserts where live on edge.
(find_conditional_asserts): Adjust.
(find_switch_asserts): Likewise.
(evrp_dom_walker::try_find_new_range): Generalize.
(evrp_dom_walker::extract_ranges_from_edge): Use
register_edge_assert_for.
(evrp_dom_walker::before_dom_children): Adjust.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 592d3b0..62d0e9d 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -89,6 +89,21 @@ static tree vrp_evaluate_conditional_warnv_with_ops (enum 
tree_code,
 tree, tree, bool, bool *,
 bool *);
 
+struct assert_info
+{
+  /* Predicate code for the ASSERT_EXPR.  Must be COMPARISON_CLASS_P.  */
+  enum tree_code comp_code;
+
+  /* Name to register the assert for.  */
+  tree name;
+
+  /* Value being compared against.  */
+  tree val;
+
+  /* Expression to compare.  */
+  tree expr;
+};
+
 /* Location information for ASSERT_EXPRs.  Each instance of this
structure describes an ASSERT_EXPR for an SSA name.  Since a single
SSA name may have more than one assertion associated with it, these
@@ -4956,6 +4971,19 @@ debug_all_asserts (void)
   dump_all_asserts (stderr);
 }
 
+/* Push the assert info for NAME, EXPR, COMP_CODE and VAL to ASSERTS.  */
+
+static void
+add_assert_info (vec &asserts,
+tree name, tree expr, enum tree_code comp_code, tree val)
+{
+  assert_info info;
+  info.comp_code = comp_code;
+  info.name = name;
+  info.val = val;
+  info.expr = expr;
+  asserts.safe_push (info);
+}
 
 /* If NAME doesn't have an ASSERT_EXPR registered for asserting
'EXPR COMP_CODE VAL' at a location that dominates block BB or
@@ -5172,9 +5200,10 @@ masked_increment (const wide_int &val_in, const wide_int 
&mask,
Invert the condition COND if INVERT is true.  */
 
 static void
-register_edge_assert_for_2 (tree name, edge e, gimple_stmt_iterator bsi,
+register_edge_assert_for_2 (tree name, edge e,
enum tree_code cond_code,
-   tree cond_op0, tree cond_op1, bool invert)
+   tree cond_op0, tree cond_op1, bool invert,
+   vec &asserts)
 {
   tree val;
   enum tree_code comp_code;
@@ -5185,10 +5214,8 @@ register_edge_assert_for_2 (tree name, edge e, 
gimple_stmt_iterator bsi,
invert, &comp_code, &val))
 return;
 
-  /* Only register an ASSERT_EXPR if NAME was found in the sub-graph
- reachable from E.  */
-  if (live_on_edge (e, name))
-register_new_assert_for (name, name, comp_code, val, NULL, e, bsi);
+  /* Queue the assert.   */
+  add_assert_info (asserts, name, name, comp_code, val);
 
   /* In the case of NAME <= CST and NAME being defined as
  NAME = (unsigned) NAME2 + CST2 we can assert NAME2 >= -CST2
@@ -5228,8 +5255,7 @@ register_edge_assert_for_2 (tree name, edge e, 
gimple_stmt_iterator bsi,
  && TREE_CODE (name3) == SSA_NAME
  && (cst2 == NULL_TREE
  || TREE_CODE (cst2) == INTEGER_CST)
- && INTEGRAL_TYPE_P (TREE_TYPE (name3))
- && live_on_edge (e, name3))
+ &&

Re: [PATCH GCC]Simplify (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2)

2016-12-02 Thread Richard Biener

On Fri, Dec 2, 2016 at 12:58 PM, Bin.Cheng  wrote:
> On Wed, Nov 30, 2016 at 3:10 PM, Richard Biener
>  wrote:
>> On Fri, Nov 18, 2016 at 5:53 PM, Bin Cheng  wrote:
>>> Hi,
>>> This is a rework of 
>>> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02007.html.
>>> Though review comments suggested it could be merged with last kind 
>>> simplification
>>> of fold_cond_expr_with_comparison, it's not really applicable.  As a matter 
>>> of fact,
>>> the suggestion stands for patch 
>>> @https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02005.html.
>>> I had previous patch 
>>> (https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01898.html)
>>> moving fold_cond_expr_with_comparison to match.pd pattern and incorporated
>>> that patch into it.  For this one, the rework is trivial, just renames 
>>> several variable
>>> tags as suggested.  Bootstrap and test on x86_64 and AArch64, is it OK?
>>
>> + A) Operand x is a unsigned to signed type conversion and c1 is
>> +   integer zero.  In this case,
>> + (signed type)x  < 0  <=>  x  > MAX_VAL(signed type)
>> + (signed type)x >= 0  <=>  x <= MAX_VAL(signed type)
>>
>> for (singed type)x < 0 -> x > signed-type-max we probably do a reverse
>> "canonicalization" transform?  Yeah,
>>
>> /* Non-equality compare simplifications from fold_binary  */
>> (for cmp (lt gt le ge)
>> ...
>>  (if (wi::eq_p (@1, signed_max)
>>   && TYPE_UNSIGNED (arg1_type)
>>   /* We will flip the signedness of the comparison operator
>>  associated with the mode of @1, so the sign bit is
>>  specified by this mode.  Check that @1 is the signed
>>  max associated with this sign bit.  */
>>   && prec == GET_MODE_PRECISION (TYPE_MODE (arg1_type))
>>   /* signed_type does not work on pointer types.  */
>>   && INTEGRAL_TYPE_P (arg1_type))
>>   /* The following case also applies to X < signed_max+1
>>  and X >= signed_max+1 because previous transformations.  */
>>   (if (cmp == LE_EXPR || cmp == GT_EXPR)
>>(with { tree st = signed_type_for (arg1_type); }
>> (if (cmp == LE_EXPR)
>>  (ge (convert:st @0) { build_zero_cst (st); })
>>  (lt (convert:st @0) { build_zero_cst (st); }))
>>
>> +   if (cmp_code == GE_EXPR)
>> + cmp_code = LE_EXPR;
>> +   c1 = wide_int_to_tree (op_type, wi::max_value (to_type));
>> + }
>> ...
>> +   if (op == PLUS_EXPR)
>> + real_c1 = wide_int_to_tree (op_type,
>> + wi::sub (c3, c2, sgn, &overflow));
>> +   else
>> + real_c1 = wide_int_to_tree (op_type,
>> + wi::add (c3, c2, sgn, &overflow));
>>
>> can you avoid the tree building here and just continue using wide-ints 
>> please?
>> Simply do the wide_int_to_tree in the result patterns.
> Hi,
> I updated patch wrto your comments, also deleted two useless
> variables.  Bootstrap and test, is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-12-01  Bin Cheng  
>
> * match.pd: Add new pattern:
> (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2).
>
> gcc/testsuite/ChangeLog
> 2016-12-01  Bin Cheng  
>
> * gcc.dg/fold-bopcond-1.c: New test.
> * gcc.dg/fold-bopcond-2.c: New test.
>
>>
>> Otherwise looks ok to me.
>>
>> Thanks,
>> Richard.
>>
>>
>>> Thanks,
>>> bin
>>>
>>> 2016-11-17  Bin Cheng  
>>>
>>> * match.pd: Add new pattern:
>>> (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2).
>>>
>>> gcc/testsuite/ChangeLog
>>> 2016-11-17  Bin Cheng  
>>>
>>> * gcc.dg/fold-bopcond-1.c: New test.
>>> * gcc.dg/fold-bopcond-2.c: New test.

Re: [PATCHv2 6/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call

2016-12-02 Thread Andre Vieira (lists)

On 23/11/16 11:53, Andre Vieira (lists) wrote:
> On 11/11/16 16:19, Kyrill Tkachov wrote:
>> And CC'ing Ramana and Richard this time...
>>
> 
> Hi,
> 
> After some extra testing I found that the sibcall optimization was not
> disabled for calls to function pointers with the cmse_nonsecure_call
> attribute, causing the clearing and call to the function wrapper to be
> skipped. This would result in an illegal branch into secure memory and
> would HardFault.
> 
> Added a test.
> 
> Is this OK?
> 
> Cheers,
> Andre
> 
> *** gcc/ChangeLog ***
> 2016-11-xx  Andre Vieira
> Thomas Preud'homme  
> 
> * config/arm/arm.c (detect_cmse_nonsecure_call): New.
> (cmse_nonsecure_call_clear_caller_saved): New.
> (arm_reorg): Use cmse_nonsecure_call_clear_caller_saved.
> (arm_function_ok_for_sibcall): Disable sibcalls for
> cmse_nonsecure_call.
> * config/arm/arm-protos.h (detect_cmse_nonsecure_call): New.
> * config/arm/arm.md (call): Handle cmse_nonsecure_entry.
> (call_value): Likewise.
> (nonsecure_call_internal): New.
> (nonsecure_call_value_internal): New.
> * config/arm/thumb1.md (*nonsecure_call_reg_thumb1_v5): New.
> (*nonsecure_call_value_reg_thumb1_v5): New.
> * config/arm/thumb2.md (*nonsecure_call_reg_thumb2): New.
> (*nonsecure_call_value_reg_thumb2): New.
> * config/arm/unspecs.md (UNSPEC_NONSECURE_MEM): New.
> 
> *** libgcc/ChangeLog ***
> 2016-11-xx  Andre Vieira
> Thomas Preud'homme  
> 
> * config/arm/cmse_nonsecure_call.S: New.
>   * config/arm/t-arm: Compile cmse_nonsecure_call.S
> 
> 
> *** gcc/testsuite/ChangeLog ***
> 2016-11-xx  Andre Vieira
> Thomas Preud'homme  
> 
> * gcc.target/arm/cmse/cmse.exp: Run tests in mainline dir.
> * gcc.target/arm/cmse/cmse-9.c: Added some extra tests.
> * gcc.target/arm/cmse/cmse-14.c: New.
> * gcc.target/arm/cmse/baseline/bitfield-4.c: New.
> * gcc.target/arm/cmse/baseline/bitfield-5.c: New.
> * gcc.target/arm/cmse/baseline/bitfield-6.c: New.
> * gcc.target/arm/cmse/baseline/bitfield-7.c: New.
> * gcc.target/arm/cmse/baseline/bitfield-8.c: New.
> * gcc.target/arm/cmse/baseline/bitfield-9.c: New.
> * gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: New.
> * gcc.target/arm/cmse/baseline/cmse-11.c: New.
>   * gcc.target/arm/cmse/baseline/cmse-13.c: New.
>   * gcc.target/arm/cmse/baseline/cmse-6.c: New.
> * gcc/testsuite/gcc.target/arm/cmse/baseline/union-1.c: New.
> * gcc/testsuite/gcc.target/arm/cmse/baseline/union-2.c: New.
>   * gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: New.
>   * gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: New.
>   * gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: New.
>   * gcc.target/arm/cmse/mainline/hard/cmse-13.c: New.
>   * gcc.target/arm/cmse/mainline/hard/cmse-7.c: New.
>   * gcc.target/arm/cmse/mainline/hard/cmse-8.c: New.
>   * gcc.target/arm/cmse/mainline/soft/cmse-13.c: New.
>   * gcc.target/arm/cmse/mainline/soft/cmse-7.c: New.
>   * gcc.target/arm/cmse/mainline/soft/cmse-8.c: New.
>   * gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: New.
>   * gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: New.
>   * gcc.target/arm/cmse/mainline/softfp/cmse-13.c: New.
>   * gcc.target/arm/cmse/mainline/softfp/cmse-7.c: New.
>   * gcc.target/arm/cmse/mainline/softfp/cmse-8.c: New.
> 
Hi,

To make the clearing of registers consistent between single and double
precision I decided to clear all FP registers with 0. The callee-saved
registers, saved, cleared and restored in the library wrapper we can do
this without much penalty to performance. The caller-saved registers are
compiler generated and currently generate a 'vldr' instruction, per
cleared (sp or dp) register. This is far from optimal, but it works and
it is "safer". I have some ideas to improve this, for instance using
r0-r1 to clear the FP registers, since they will either contain the
address of the callback function or an argument value, either way they
will never contain secret information. I will address this at a later time.

Changed the tests to reflect these changes. No changes to the ChangeLog.

Is this OK?

Cheers,
Andre

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 
634a5de05472d562972d9ad84b5858691715714c..05d73ab2271d28ce8d6d49199341d4365f0a2bcb
 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -137,6 +137,7 @@ extern int arm_const_double_inline_cost (rtx);
 extern bool arm_const_double_by_parts (rtx);
 extern bool arm_const_double_by_immediates (rtx);
 extern void arm_emit_call_insn (rtx, rtx, bool);
+bool detect_cmse_nonsecure_call (tree);
 extern const char *output_call (rtx *);
 void arm_emit_movpair (rtx, rtx);
 extern const char *output_mov_long_double_arm_fr

[Patch, Fortran, OOP] PR 43207: [OOP] invalid (pointer) assignment to and from abstract non-polymorphic expressions

2016-12-02 Thread Janus Weil

Hi all,

the attached patch fixes the PR in the subject line by introducing a
new check to reject invalid code. It's a slight update of an old patch
that I posted in the PR quite some time ago, using somewhat tighter
checking to avoid side effects on the testsuite.

Regtests cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus


2016-12-02  Janus Weil  

PR fortran/43207
* primary.c (gfc_match_varspec): Reject nonpolymorphic references to
abstract types.

2016-12-02  Janus Weil  

PR fortran/43207
* gfortran.dg/abstract_type_9.f90: New test case.
Index: gcc/fortran/primary.c
===
--- gcc/fortran/primary.c   (revision 243176)
+++ gcc/fortran/primary.c   (working copy)
@@ -,7 +,15 @@ check_substring:
}
 }
 
-  /* F2008, C727.  */
+  /* F08:C611.  */
+  if (primary->ts.type == BT_DERIVED && primary->ref
+  && primary->ts.u.derived && primary->ts.u.derived->attr.abstract)
+{
+  gfc_error ("Nonpolymorphic reference to abstract type at %C");
+  return MATCH_ERROR;
+}
+
+  /* F08:C727.  */
   if (primary->expr_type == EXPR_PPC && gfc_is_coindexed (primary))
 {
   gfc_error ("Coindexed procedure-pointer component at %C");
! { dg-do compile }
!
! PR 43207: [OOP] invalid (pointer) assignment to and from abstract non-polymorphic expressions
!
! Contributed by Tobias Burnus 

  implicit none
  type, abstract :: parent
integer :: i
  end type
  type, extends(parent) :: child
class(parent), pointer :: comp
  end type

  type(child), target :: c1
  class(child), allocatable :: c2
  class(parent), pointer :: cp

  c1%parent = c1%parent  ! { dg-error "Nonpolymorphic reference to abstract type" }
  c2%parent = c1%parent  ! { dg-error "Nonpolymorphic reference to abstract type" }

  cp => c1%comp
  cp => c1%parent! { dg-error "Nonpolymorphic reference to abstract type" }

  call sub(c1%comp)
  call sub(c1%parent)! { dg-error "Nonpolymorphic reference to abstract type" }

contains

  subroutine sub(arg)
class(parent) :: arg
  end subroutine

end

Re: [PATCHv2 6/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call

2016-12-02 Thread Kyrill Tkachov


Hi Andre,

On 02/12/16 13:36, Andre Vieira (lists) wrote:

On 23/11/16 11:53, Andre Vieira (lists) wrote:

On 11/11/16 16:19, Kyrill Tkachov wrote:

And CC'ing Ramana and Richard this time...


Hi,

After some extra testing I found that the sibcall optimization was not
disabled for calls to function pointers with the cmse_nonsecure_call
attribute, causing the clearing and call to the function wrapper to be
skipped. This would result in an illegal branch into secure memory and
would HardFault.

Added a test.

Is this OK?

Cheers,
Andre

*** gcc/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * config/arm/arm.c (detect_cmse_nonsecure_call): New.
 (cmse_nonsecure_call_clear_caller_saved): New.
 (arm_reorg): Use cmse_nonsecure_call_clear_caller_saved.
 (arm_function_ok_for_sibcall): Disable sibcalls for
cmse_nonsecure_call.
 * config/arm/arm-protos.h (detect_cmse_nonsecure_call): New.
 * config/arm/arm.md (call): Handle cmse_nonsecure_entry.
 (call_value): Likewise.
 (nonsecure_call_internal): New.
 (nonsecure_call_value_internal): New.
 * config/arm/thumb1.md (*nonsecure_call_reg_thumb1_v5): New.
 (*nonsecure_call_value_reg_thumb1_v5): New.
 * config/arm/thumb2.md (*nonsecure_call_reg_thumb2): New.
 (*nonsecure_call_value_reg_thumb2): New.
 * config/arm/unspecs.md (UNSPEC_NONSECURE_MEM): New.

*** libgcc/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * config/arm/cmse_nonsecure_call.S: New.
* config/arm/t-arm: Compile cmse_nonsecure_call.S


*** gcc/testsuite/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse.exp: Run tests in mainline dir.
 * gcc.target/arm/cmse/cmse-9.c: Added some extra tests.
 * gcc.target/arm/cmse/cmse-14.c: New.
 * gcc.target/arm/cmse/baseline/bitfield-4.c: New.
 * gcc.target/arm/cmse/baseline/bitfield-5.c: New.
 * gcc.target/arm/cmse/baseline/bitfield-6.c: New.
 * gcc.target/arm/cmse/baseline/bitfield-7.c: New.
 * gcc.target/arm/cmse/baseline/bitfield-8.c: New.
 * gcc.target/arm/cmse/baseline/bitfield-9.c: New.
 * gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: New.
 * gcc.target/arm/cmse/baseline/cmse-11.c: New.
* gcc.target/arm/cmse/baseline/cmse-13.c: New.
* gcc.target/arm/cmse/baseline/cmse-6.c: New.
 * gcc/testsuite/gcc.target/arm/cmse/baseline/union-1.c: New.
 * gcc/testsuite/gcc.target/arm/cmse/baseline/union-2.c: New.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: New.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: New.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: New.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: New.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: New.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: New.
* gcc.target/arm/cmse/mainline/soft/cmse-13.c: New.
* gcc.target/arm/cmse/mainline/soft/cmse-7.c: New.
* gcc.target/arm/cmse/mainline/soft/cmse-8.c: New.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: New.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: New.
* gcc.target/arm/cmse/mainline/softfp/cmse-13.c: New.
* gcc.target/arm/cmse/mainline/softfp/cmse-7.c: New.
* gcc.target/arm/cmse/mainline/softfp/cmse-8.c: New.


Hi,

To make the clearing of registers consistent between single and double
precision I decided to clear all FP registers with 0. The callee-saved
registers, saved, cleared and restored in the library wrapper we can do
this without much penalty to performance. The caller-saved registers are
compiler generated and currently generate a 'vldr' instruction, per
cleared (sp or dp) register. This is far from optimal, but it works and
it is "safer". I have some ideas to improve this, for instance using
r0-r1 to clear the FP registers, since they will either contain the
address of the callback function or an argument value, either way they
will never contain secret information. I will address this at a later time.

Changed the tests to reflect these changes. No changes to the ChangeLog.

Is this OK?


Thanks, I much prefer the consistency.
This is ok.
I believe all patches in this series have been approved now, so you can go 
ahead and commit them.
Please keep an eye out for fallout over the next week.

Kyrill


Cheers,
Andre

Re: [PATCH] EVRP edge info refactoring and simple improvement

2016-12-02 Thread Richard Biener

On Fri, 2 Dec 2016, Richard Biener wrote:

> On Fri, 2 Dec 2016, Richard Biener wrote:
> 
> > 
> > The following refactors range extraction from edges and makes EVRP
> > able to merge such edge based ranges for the case of multiple 
> > predecessors.  This allows it to catch anti-ranges from
> > if (a < 4 || a > 8) if that is not re-written to a single test like
> > when using gotos.
> > 
> > I don't expect this to catch very much in practice but the refactoring
> > means we can incorporate the tricks from register_edge_assert_for
> > more easily (we "simply" have to use extract_ranges_from_edge as the
> > one-and-true kind-of interface).
> 
> Like the following, preliminary testing shows
> 
> FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate 
> minv_[0-9]* == 5 to 0"
> FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate 
> minv_[0-9]* == 6 to 0"
> FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate 
> maxv_[0-9]* == 5 to 0"
> FAIL: gcc.dg/tree-ssa/pr49039.c scan-tree-dump vrp1 "Folding predicate 
> maxv_[0-9]* == 6 to 0"
> FAIL: gcc.dg/tree-ssa/vrp35.c scan-tree-dump vrp1 "Removing dead stmt 
> [^\r\n]* = j_.* == 10"
> FAIL: gcc.dg/tree-ssa/vrp36.c scan-tree-dump vrp1 "Removing dead stmt 
> [^\r\n]* = i_.* == 1"
> 
> so more things are done by EVRP.  More testing next week because
> I expected more VRP testcase fallout.  But as expected this allows
> for the single-test if (a < 4 || a > 8) variant.

It also allows to optimize

void f (unsigned a)
{
  if (a < 4 || a > 8)
goto x;
  if (a > 5 && a < 9)
goto x;
  return;

x:
  if (a == 5)
__builtin_abort ();
}

which VRP does not handle, only reassoc can do this (implementing
folds simplifications, so writing (a < 4 || a > 8 || (a > 5 && a < 9))
simplifies this in fold already.

Richard.

> Richard.
> 
> 2016-12-02  Richard Biener  
> 
>   * tree-vrp.c (assert_info): New struct.
>   (add_assert_info): New helper.
>   (register_edge_assert_for_2): Refactor to add asserts to a vector
>   of assert_info.
>   (register_edge_assert_for_1): Likewise.
>   (register_edge_assert_for): Likewise.
>   (finish_register_edge_assert_for): New helper actually registering
>   asserts where live on edge.
>   (find_conditional_asserts): Adjust.
>   (find_switch_asserts): Likewise.
>   (evrp_dom_walker::try_find_new_range): Generalize.
>   (evrp_dom_walker::extract_ranges_from_edge): Use
>   register_edge_assert_for.
>   (evrp_dom_walker::before_dom_children): Adjust.
> 
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index 592d3b0..62d0e9d 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -89,6 +89,21 @@ static tree vrp_evaluate_conditional_warnv_with_ops (enum 
> tree_code,
>tree, tree, bool, bool *,
>bool *);
>  
> +struct assert_info
> +{
> +  /* Predicate code for the ASSERT_EXPR.  Must be COMPARISON_CLASS_P.  */
> +  enum tree_code comp_code;
> +
> +  /* Name to register the assert for.  */
> +  tree name;
> +
> +  /* Value being compared against.  */
> +  tree val;
> +
> +  /* Expression to compare.  */
> +  tree expr;
> +};
> +
>  /* Location information for ASSERT_EXPRs.  Each instance of this
> structure describes an ASSERT_EXPR for an SSA name.  Since a single
> SSA name may have more than one assertion associated with it, these
> @@ -4956,6 +4971,19 @@ debug_all_asserts (void)
>dump_all_asserts (stderr);
>  }
>  
> +/* Push the assert info for NAME, EXPR, COMP_CODE and VAL to ASSERTS.  */
> +
> +static void
> +add_assert_info (vec &asserts,
> +  tree name, tree expr, enum tree_code comp_code, tree val)
> +{
> +  assert_info info;
> +  info.comp_code = comp_code;
> +  info.name = name;
> +  info.val = val;
> +  info.expr = expr;
> +  asserts.safe_push (info);
> +}
>  
>  /* If NAME doesn't have an ASSERT_EXPR registered for asserting
> 'EXPR COMP_CODE VAL' at a location that dominates block BB or
> @@ -5172,9 +5200,10 @@ masked_increment (const wide_int &val_in, const 
> wide_int &mask,
> Invert the condition COND if INVERT is true.  */
>  
>  static void
> -register_edge_assert_for_2 (tree name, edge e, gimple_stmt_iterator bsi,
> +register_edge_assert_for_2 (tree name, edge e,
>   enum tree_code cond_code,
> - tree cond_op0, tree cond_op1, bool invert)
> + tree cond_op0, tree cond_op1, bool invert,
> + vec &asserts)
>  {
>tree val;
>enum tree_code comp_code;
> @@ -5185,10 +5214,8 @@ register_edge_assert_for_2 (tree name, edge e, 
> gimple_stmt_iterator bsi,
>   invert, &comp_code, &val))
>  return;
>  
> -  /* Only register an ASSERT_EXPR if NAME was found in the sub-graph
> - reachable from E.  */
> -  if (live_on_edge (e, name))
> -register

[hsa] Exclude parallel outlines from hsa_callable_functions_p

2016-12-02 Thread Martin Jambor

Hi,

after the merge of nvidia OpenMP implementation, the normal parallel
outline functions were also marked as "omp declare target" which lead
to them being cloned and compiled to HSA which is not only unnecessary
but often leads to a lot of useless HSA warning noise.  The following
patch deal with this issue by making sure they are not considered
callable from HSA.

Bootstrapped and tested on x86_64-linux.  I will commit it to trunk in
a few moments (it is already part of my most recent merge from trunk
to the hsa branch.

Thanks,

Martin

Exclude parallel outlines from hsa_callable_functions_p

2016-11-29  Martin Jambor  

* hsa.c (hsa_callable_function_p): Return false for artificial
  functions.
---
 gcc/hsa.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/hsa.c b/gcc/hsa.c
index f881e78..31e3252 100644
--- a/gcc/hsa.c
+++ b/gcc/hsa.c
@@ -90,7 +90,10 @@ bool
 hsa_callable_function_p (tree fndecl)
 {
   return (lookup_attribute ("omp declare target", DECL_ATTRIBUTES (fndecl))
- && !lookup_attribute ("oacc function", DECL_ATTRIBUTES (fndecl)));
+ && !lookup_attribute ("oacc function", DECL_ATTRIBUTES (fndecl))
+ /* At this point, this is enough to identify clones for
+parallel, which for HSA would need to be kernels anyway.  */
+ && !DECL_ARTIFICIAL (fndecl));
 }
 
 /* Allocate HSA structures that are are used when dealing with different
-- 
2.10.2

C++ PATCH to stop inheriting copy constructors

2016-12-02 Thread Jason Merrill

In C++14 copy constructors were excluded from the inherited
constructor set; Ville noticed that this was changed in the new
inherited constructor rules, and lobbied for us to retain the old
behavior in that case.  This patch implements the proposed resolution.

Tested x86_64-pc-linux-gnu, applied to trunk.
2016-12-01  Jason Merrill  

* call.c (add_function_candidate): Exclude inherited copy/move
ctors.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 97003e5..b7aa97c 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -2042,6 +2042,24 @@ add_function_candidate (struct z_candidate **candidates,
   reason = arity_rejection (first_arg, i + remaining, len);
 }
 
+  /* An inherited constructor (12.6.3 [class.inhctor.init]) that has a first
+ parameter of type "reference to cv C" (including such a constructor
+ instantiated from a template) is excluded from the set of candidate
+ functions when used to construct an object of type D with an argument list
+ containing a single argument if C is reference-related to D.  */
+  if (viable && len == 1 && parmlist && DECL_CONSTRUCTOR_P (fn)
+  && flag_new_inheriting_ctors
+  && DECL_INHERITED_CTOR (fn))
+{
+  tree ptype = non_reference (TREE_VALUE (parmlist));
+  tree dtype = DECL_CONTEXT (fn);
+  if (reference_related_p (ptype, dtype))
+   {
+ viable = false;
+ reason = inherited_ctor_rejection ();
+   }
+}
+
   /* Second, for a function to be viable, its constraints must be
  satisfied. */
   if (flag_concepts && viable
@@ -2142,18 +2160,6 @@ add_function_candidate (struct z_candidate **candidates,
}
}
 
- /* Don't consider inherited constructors for initialization from an
-expression of the same or derived type.  */
- /* FIXME extend to operator=.  */
- if (i == 0 && len == 1
- && DECL_INHERITED_CTOR (fn)
- && reference_related_p (ctype, argtype))
-   {
- viable = 0;
- reason = inherited_ctor_rejection ();
- goto out;
-   }
-
  /* Core issue 899: When [copy-]initializing a temporary to be bound
 to the first parameter of a copy constructor (12.8) called with
 a single argument in the context of direct-initialization,

Re: [PATCH] avoid calling alloca(0)

2016-12-02 Thread Bernd Edlinger

On 12/01/16 19:39, Jeff Law wrote:
> On 11/30/2016 09:09 PM, Martin Sebor wrote:
>>> What I think this tells us is that we're not at a place where we're
>>> clean.  But we can incrementally get there.  The warning is only
>>> catching a fairly small subset of the cases AFAICT.  That's not unusual
>>> and analyzing why it didn't trigger on those cases might be useful as
>>> well.
>>
>> The warning has no smarts.  It relies on constant propagation and
>> won't find a call unless it sees it's being made with a constant
>> zero.  Looking at the top two on the list the calls are in extern
>> functions not called from the same source file, so it probably just
>> doesn't see that the functions are being called from another file
>> with a zero.  Building GCC with LTO might perhaps help.
> Right.  This is consistent with the limitations of other similar
> warnings such as null pointer dereferences.
>
>>
>>> So where does this leave us for gcc-7?  I'm wondering if we drop the
>>> warning in, but not enable it by default anywhere.  We fix the cases we
>>> can (such as reg-stack,c tree-ssa-threadedge.c, maybe others) before
>>> stage3 closes, and shoot for the rest in gcc-8, including improvign the
>>> warning (if there's something we can clearly improve), and enabling the
>>> warning in -Wall or -Wextra.
>>
>> I'm fine with deferring the GCC fixes and working on the cleanup
>> over time but I don't think that needs to gate enabling the option
>> with -Wextra.  The warnings can be suppressed or prevented from
>> causing errors during a GCC build either via a command line option
>> or by pragma in the code.  AFAICT, from the other warnings I see
>> go by, this is what has been done for -Wno-implicit-fallthrough
>> while those warnings are being cleaned up.  Why not take the same
>> approach here?
> The difference vs implicit fallthrough is that new instances of implicit
> fallthrus aren't likely to be exposed by changes in IL that occur due to
> transformations in the optimizer pipeline.
>
> Given the number of runtime triggers vs warnings, we know there's many
> instances of passing 0 to the allocators that we're not diagnosing. I
> can easily see differences in the early IL (such as those due to
> BRANCH_COST differing for targets) exposing/hiding cases where 0 flows
> into the allocator argument.  Similarly for changes in inlining
> decisions, jump threading, etc for profiled bootstraps.  I'd like to
> avoid playing wack-a-mole right now.
>
> So I'm being a bit more conservative here.  Maybe it'd be appropriate in
> Wextra since that's not enabled by default for GCC builds.
>

actually it is enabled, by -W -Wall ...

>
>>
>> As much as I would like to improve the warning itself I'm also not
>> sure I see much of an opportunity for it.  It's not prone to high
>> rates of false positives (hardly any in fact) and the cases it
>> misses are those where it simply doesn't see the argument value
>> because it's not been made available by constant propagation.
> There's always ways :-)   For example, I wouldn't be at all surprised if
> you found PHIs that feed the allocation where one or more of the PHI
> arguments are 0.
>
>
>>
>> That said, I consider the -Walloc-size-larger-than warning to be
>> the more important part of the patch by far.  I'd hate a lack of
>> consensus on how to deal with GCC's handful of instances of
>> alloca(0) to stall the rest of the patch.
> Agreed on not wanting alloca(0) handling to stall the rest of the patch.

I'd say negative arguments on alloca should always diagnosed,
as that does increment the stack in reverse direction.
And that is always very dangerous.

I think the default of -Walloc-size-larger-than should be changed as
Martin suggested to SIZE_T_MAX/2, I would make that the default.
And keep alloca and malloc size limits separate warnings.


Bernd.

Re: [PATCH] fix -fmax-errors & notes, take 2

2016-12-02 Thread Bernd Schmidt


On 12/02/2016 02:25 PM, Nathan Sidwell wrote:


+/* Check if we've met the maximum error limit.  */


Arguments should be documented.


+void
+diagnostic_check_max_errors (diagnostic_context *context, bool flush)
+{
+  if (!context->max_errors)
+return;


I prefer to spell that as != 0 since it's not a boolean, but you can 
also leave it.



+  int count = (diagnostic_kind_count (context, DK_ERROR)
+  + diagnostic_kind_count (context, DK_SORRY)
+  + diagnostic_kind_count (context, DK_WERROR));
+
+  if (count >= (int) context->max_errors)


Looks like there are some unnecessary type mismatches leading to this 
cast. Maybe declare max_errors as int and remove the cast?


Otherwise ok.


Bernd

Re: [PATCH] Unset used bit in simplify_replace_* on newly copied rtxs (PR target/78614)

2016-12-02 Thread Jakub Jelinek

On Fri, Dec 02, 2016 at 12:34:13AM +0100, Jakub Jelinek wrote:
> On Thu, Dec 01, 2016 at 11:48:03PM +0100, Bernd Schmidt wrote:
> > On 12/01/2016 11:43 PM, Jakub Jelinek wrote:
> > >
> > >so we'd need to slow shallow_copy_rtx down by adding switch (code)
> > >in there or something similar.
> > 
> > Is there reason to assume it's time-critical? IMO eliminating the potential
> > for error by not having callers need to do this outweighs that concern.
> 
> Well, it isn't an error not to clear it, just missed optimization if some 
> caller
> cares, and most of the shallow_copy_rtx users really don't care, e.g.
> config/i386/i386.md and config/i386/i386.c callers.  cfgexpand.c doesn't
> either, at that point the used bit isn't set, the calls in cselib.c don't care
> either.  In emit-rtl.c one place explicitly sets used flag afterwards, another
> clears it (that one could be removed if the logic is moved down).  The 
> rs6000.c
> use indeed could get rid of the explicit used bit clearing.  Most of the
> other shallow_copy_rtx callers in the middle-end just don't care.
> I think it is really just simplify_replace_fn_rtx where (at least in the
> rs6000 case) it is useful to clear it, similar trick is not used in many
> places after initial unsharing during expansion before which the bit should
> be clear on everything shareable, I saw it in ifcvt.c (set_used_flags
> + some copying + unshare_*_chain afterwards).
> After expansion when everything is unshared, gradually the used bits are no
> longer relevant, they are set on stuff that has been originally unshared and
> clear on newly added rtxes.  Then the reload/postreload
> unshare_all_rtx_again clears it and unshares the shared stuff.
> So it is really just about spots that do the trick of set_used_flags,
> call some functions where clearing of used bit on new stuff is important,
> and finally call copy_rtx_if_shared.

Anyway, here is a patch that changes shallow_copy_rtx,
bootstrapped/regtested on x86_64-linux and i686-linux.  I believe only the
callers in the patch of the function actually care about used being 0 though
(including the simplify-rtx.c hunks).

2016-11-30  Jakub Jelinek  

PR target/78614
* rtl.c (copy_rtx): Don't clear used flag here.
(shallow_copy_rtx_stat): Clear used flag here unless code the rtx
is shareable.
* simplify-rtx.c (simplify_replace_fn_rtx): When copying rtx with
'E' in format, copy all vectors.
* emit-rtl.c (copy_insn_1): Don't clear used flag here.
* valtrack.c (cleanup_auto_inc_dec): Likewise.
* config/rs6000/rs6000.c (rs6000_frame_related): Likewise.

--- gcc/rtl.c.jj2016-10-31 13:28:12.0 +0100
+++ gcc/rtl.c   2016-12-02 11:01:12.557553040 +0100
@@ -318,10 +318,6 @@ copy_rtx (rtx orig)
  us to explicitly document why we are *not* copying a flag.  */
   copy = shallow_copy_rtx (orig);
 
-  /* We do not copy the USED flag, which is used as a mark bit during
- walks over the RTL.  */
-  RTX_FLAG (copy, used) = 0;
-
   format_ptr = GET_RTX_FORMAT (GET_CODE (copy));
 
   for (i = 0; i < GET_RTX_LENGTH (GET_CODE (copy)); i++)
@@ -367,7 +363,29 @@ shallow_copy_rtx_stat (const_rtx orig ME
 {
   const unsigned int size = rtx_size (orig);
   rtx const copy = ggc_alloc_rtx_def_stat (size PASS_MEM_STAT);
-  return (rtx) memcpy (copy, orig, size);
+  memcpy (copy, orig, size);
+  switch (GET_CODE (orig))
+{
+  /* RTX codes copy_rtx_if_shared_1 considers are shareable,
+the used flag is often used for other purposes.  */
+case REG:
+case DEBUG_EXPR:
+case VALUE:
+CASE_CONST_ANY:
+case SYMBOL_REF:
+case CODE_LABEL:
+case PC:
+case CC0:
+case RETURN:
+case SIMPLE_RETURN:
+case SCRATCH:
+  break;
+default:
+  /* For all other RTXes clear the used flag on the copy.  */
+  RTX_FLAG (copy, used) = 0;
+  break;
+}
+  return copy;
 }
 
 /* Nonzero when we are generating CONCATs.  */
--- gcc/simplify-rtx.c.jj   2016-12-02 00:15:09.200779256 +0100
+++ gcc/simplify-rtx.c  2016-12-02 10:55:24.283989673 +0100
@@ -547,13 +547,19 @@ simplify_replace_fn_rtx (rtx x, const_rt
  old_rtx, fn, data);
if (op != RTVEC_ELT (vec, j))
  {
-   if (newvec == vec)
+   if (x == newx)
  {
-   newvec = shallow_copy_rtvec (vec);
-   if (x == newx)
- newx = shallow_copy_rtx (x);
-   XVEC (newx, i) = newvec;
+   newx = shallow_copy_rtx (x);
+   /* If we copy X, we need to copy also all
+  vectors in it, rather than copy only
+  a subset of them and share the rest.  */
+   for (int k = 0; fmt[k]; k++)
+ if (fmt[k] == 'E')
+   XVEC (newx, k) = shallow_copy_rtvec (XVEC (x, k));
+   newvec

Re: C++ PATCH to stop inheriting copy constructors

2016-12-02 Thread Ville Voutilainen

On 2 December 2016 at 16:00, Jason Merrill  wrote:
> In C++14 copy constructors were excluded from the inherited
> constructor set; Ville noticed that this was changed in the new
> inherited constructor rules, and lobbied for us to retain the old
> behavior in that case.  This patch implements the proposed resolution.


s/lobbied/convinced with reasonable rationale/ :) Smashing, thanks for
the quick fix.

[C++ Patch] PR 78637 ("ICE on invalid C++ code (internal compiler error: in pop_namespace, at cp/name-lookup.c:3826)")

2016-12-02 Thread Paolo Carlini


Hi,

this regression is caused by my fix for c++/60385, where I changed 
push_namespace to early return (a bool) when pushdecl fails. In the 
present testcase, a different push_namespace call in 
cp_parser_namespace_definition, for nested namespace definitions, fails, 
thus returns early, and that is inconsistent with the final loop 
(pop_namespace ICEs):


  /* Finish the nested namespace definitions.  */
  while (nested_definition_count--)
pop_namespace ();

To fix this problem, I think we can simply increment 
nested_definition_count only when push_namespace succeeds. Tested 
x86_64-linux.


Thanks, Paolo.

//

/cp
2016-12-02  Paolo Carlini  

PR c++/78637
* parser.c (cp_parser_namespace_definition): Increment
nested_definition_count only if push_namespace succeeds.

/testsuite
2016-12-02  Paolo Carlini  

PR c++/78637
* g++.dg/parse/namespace14.C: New.
Index: cp/parser.c
===
--- cp/parser.c (revision 243171)
+++ cp/parser.c (working copy)
@@ -18184,8 +18184,8 @@ cp_parser_namespace_definition (cp_parser* parser)
   cp_parser_error (parser, "nested identifier required");
   break;
 }
-  ++nested_definition_count;
-  push_namespace (identifier);
+ if (push_namespace (identifier))
+   ++nested_definition_count;
 }
 }
 
Index: testsuite/g++.dg/parse/namespace14.C
===
--- testsuite/g++.dg/parse/namespace14.C(revision 0)
+++ testsuite/g++.dg/parse/namespace14.C(working copy)
@@ -0,0 +1,6 @@
+// PR c++/78637
+
+namespace X {
+class Y;
+}
+namespace X::Y z;  // { dg-error "namespace|expected|type" }

[PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)

2016-12-02 Thread Jakub Jelinek

Hi!

While the https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01087.html
added *andndi3_doubleword, I don't understand how it can actually work.
The problem is that this pattern is for combine, and needs combining
of DImode NOT setter with DImode AND user.  But there is no DImode
pattern for one_cmpldi2, so it is lowered early into two one_cmplsi2
patterns and I can't see how combine would be able to figure stuff out from
that.

This patch:
1) adds one_cmpldi2 pattern for stv purposes (which splits into two
   one_cmplsi2 after reload)
2) teaches the 32-bit stv pass to handle NOT (as xor all-ones)
3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it
   is for -mbmi only, and adds another *andndi3_doubleword pattern that is
   meant to live just from combine till the stv pass, or worse case till
   following split1 pass when it is split back into not followed by and;
   this change makes it possible to use pandn in stv pass, even without
   -mbmi

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-02  Jakub Jelinek  

PR target/70322
* config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle
NOT.
(dimode_scalar_chain::compute_convert_gain): Likewise.
(dimode_scalar_chain::convert_insn): Likewise.
* config/i386/i386.md (*andndi3_doubleword): New define_insn_and_split.
Rename the old define_insn_and_split to...
(*andndi3_doubleword_bmi): ... this.
(one_cmpl2): Use SWIM1248x iterator instead of SWIM.
(*one_cmpldi2_doubleword): New define_insn_and_split.

* gcc.target/i386/pr70322-1.c: New test.
* gcc.target/i386/pr70322-2.c: New test.
* gcc.target/i386/pr70322-3.c: New test.

--- gcc/config/i386/i386.c.jj   2016-12-02 11:17:40.702995111 +0100
+++ gcc/config/i386/i386.c  2016-12-02 12:01:31.656469089 +0100
@@ -2826,6 +2826,9 @@ dimode_scalar_to_vector_candidate_p (rtx
return false;
   break;
 
+case NOT:
+  break;
+
 case REG:
   return true;
 
@@ -2848,7 +2851,8 @@ dimode_scalar_to_vector_candidate_p (rtx
 
   if ((GET_MODE (XEXP (src, 0)) != DImode
&& !CONST_INT_P (XEXP (src, 0)))
-  || (GET_MODE (XEXP (src, 1)) != DImode
+  || (GET_CODE (src) != NOT
+ && GET_MODE (XEXP (src, 1)) != DImode
  && !CONST_INT_P (XEXP (src, 1
 return false;
 
@@ -3415,6 +3419,8 @@ dimode_scalar_chain::compute_convert_gai
  if (CONST_INT_P (XEXP (src, 1)))
gain -= vector_const_cost (XEXP (src, 1));
}
+  else if (GET_CODE (src) == NOT)
+   gain += ix86_cost->add - COSTS_N_INSNS (1);
   else if (GET_CODE (src) == COMPARE)
{
  /* Assume comparison cost is the same.  */
@@ -3770,6 +3776,14 @@ dimode_scalar_chain::convert_insn (rtx_i
   PUT_MODE (src, V2DImode);
   break;
 
+case NOT:
+  src = XEXP (src, 0);
+  convert_op (&src, insn);
+  subreg = gen_reg_rtx (V2DImode);
+  emit_insn_before (gen_move_insn (subreg, CONSTM1_RTX (V2DImode)), insn);
+  src = gen_rtx_XOR (V2DImode, src, subreg);
+  break;
+
 case MEM:
   if (!REG_P (dst))
convert_op (&src, insn);
--- gcc/config/i386/i386.md.jj  2016-12-01 23:24:51.663157486 +0100
+++ gcc/config/i386/i386.md 2016-12-02 12:50:27.616829191 +0100
@@ -8534,7 +8534,7 @@ (define_split
   operands[2] = gen_lowpart (QImode, operands[2]);
 })
 
-(define_insn_and_split "*andndi3_doubleword"
+(define_insn_and_split "*andndi3_doubleword_bmi"
   [(set (match_operand:DI 0 "register_operand" "=r")
(and:DI
  (not:DI (match_operand:DI 1 "register_operand" "r"))
@@ -8551,6 +8551,27 @@ (define_insn_and_split "*andndi3_doublew
  (clobber (reg:CC FLAGS_REG))])]
   "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
 
+;; This insn is there only so that the stv pass can use pandn even when
+;; BMI is not available.  It should be matched during combine and split
+;; again in split1, unless the stv pass converts it.
+(define_insn_and_split "*andndi3_doubleword"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (and:DI
+ (not:DI (match_operand:DI 1 "register_operand" "r"))
+ (match_operand:DI 2 "nonimmediate_operand" "rm")))
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 3)
+   (not:DI (match_dup 1)))
+   (parallel [(set (match_dup 0)
+  (and:DI (match_dup 3)
+  (match_dup 2)))
+ (clobber (reg:CC FLAGS_REG))])]
+  "operands[3] = gen_reg_rtx (DImode);")
+
 (define_insn "*andn_1"
   [(set (match_operand:SWI48 0 "register_operand" "=r,r")
(and:SWI48
@@ -9313,8 +9334,8 @@ (define_split
 ;; One complement instructions
 
 (define_expand "one_cmpl2"
-  [(set (match_operand:SWIM 0 "nonimmediate_operand")
-   (not:SWIM (match_oper

[PATCH] OpenACC executable directives

2016-12-02 Thread Cesar Philippidis

This patch teaches the C and C++ FEs to expect ACC ENTER/EXIT DATA, ACC
UPDATE and ACC WAIT executable directives to be used inside compound
statements. This is to prevent situations such as

  if (needs_wait)
#pragma acc wait

  // do something else here

from generating unexpected code when the program is built without
-fopenacc. The C and C++ FEs already guard against such usages for
OpenMP executable directives.

This patch also included a tweak for the diagnostics generated by
oacc_enter_exit_data parser. The error message now more accurately
reports if an error is detected in an enter or exit data directive.

Is this patch OK for trunk?

Cesar

Re: [PATCH] OpenACC executable directives

2016-12-02 Thread Cesar Philippidis

On 12/02/2016 06:37 AM, Cesar Philippidis wrote:
> This patch teaches the C and C++ FEs to expect ACC ENTER/EXIT DATA, ACC
> UPDATE and ACC WAIT executable directives to be used inside compound
> statements. This is to prevent situations such as
> 
>   if (needs_wait)
> #pragma acc wait
> 
>   // do something else here
> 
> from generating unexpected code when the program is built without
> -fopenacc. The C and C++ FEs already guard against such usages for
> OpenMP executable directives.
> 
> This patch also included a tweak for the diagnostics generated by
> oacc_enter_exit_data parser. The error message now more accurately
> reports if an error is detected in an enter or exit data directive.
> 
> Is this patch OK for trunk?

And here's the patch.

Cesar

2016-12-02  Cesar Philippidis  
	James Norris  

	gcc/c/
	* c-parser.c (c_parser_pragma): Error when PRAGMA_OACC_{ENTER_DATA,
	EXIT_DATA,WAIT} are not used in compound statements.
	(c_parser_oacc_enter_exit_data): Update diagnostics.

	gcc/cp/
	* parser.c (cp_parser_oacc_enter_exit_data): Update diagnostics.
	(cp_parser_pragma): Error when PRAGMA_OACC_{ENTER_DATA,
	EXIT_DATA,WAIT} are not used in compound statements.

	gcc/testsuite/
	* c-c++-common/goacc/data-2.c: Adjust test.
	* c-c++-common/goacc/executeables-1.c: New test.
	* g++.dg/goacc/data-1.C: Adjust test.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 00fe731..f7bf9c4 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -10145,10 +10145,24 @@ c_parser_pragma (c_parser *parser, enum pragma_context context, bool *if_p)
   return false;
 
 case PRAGMA_OACC_ENTER_DATA:
+  if (context != pragma_compound)
+	{
+	  if (context == pragma_stmt)
+	c_parser_error (parser, "%<#pragma acc enter data%> may only be "
+			"used in compound statements");
+	  goto bad_stmt;
+	}
   c_parser_oacc_enter_exit_data (parser, true);
   return false;
 
 case PRAGMA_OACC_EXIT_DATA:
+  if (context != pragma_compound)
+	{
+	  if (context == pragma_stmt)
+	c_parser_error (parser, "%<#pragma acc exit data%> may only be "
+			"used in compound statements");
+	  goto bad_stmt;
+	}
   c_parser_oacc_enter_exit_data (parser, false);
   return false;
 
@@ -10305,6 +10319,16 @@ c_parser_pragma (c_parser *parser, enum pragma_context context, bool *if_p)
   c_parser_cilk_grainsize (parser, if_p);
   return false;
 
+case PRAGMA_OACC_WAIT:
+  if (context != pragma_compound)
+	{
+	  if (context == pragma_stmt)
+	c_parser_error (parser, "%<#pragma acc enter data%> may only be "
+			"used in compound statements");
+	  goto bad_stmt;
+	}
+	/* FALL THROUGH.  */
+
 default:
   if (id < PRAGMA_FIRST_EXTERNAL)
 	{
@@ -13871,28 +13895,26 @@ c_parser_oacc_enter_exit_data (c_parser *parser, bool enter)
 {
   location_t loc = c_parser_peek_token (parser)->location;
   tree clauses, stmt;
+  const char *p = "";
 
   c_parser_consume_pragma (parser);
 
-  if (!c_parser_next_token_is (parser, CPP_NAME))
+  if (c_parser_next_token_is (parser, CPP_NAME))
 {
-  c_parser_error (parser, enter
-		  ? "expected % in %<#pragma acc enter data%>"
-		  : "expected % in %<#pragma acc exit data%>");
-  c_parser_skip_to_pragma_eol (parser);
-  return;
+  p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value);
+  c_parser_consume_token (parser);
 }
 
-  const char *p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value);
   if (strcmp (p, "data") != 0)
 {
-  c_parser_error (parser, "invalid pragma");
+  error_at (loc, enter
+		? "expected % after %<#pragma acc enter%>"
+		: "expected % after %<#pragma acc exit%>");
+  parser->error = true;
   c_parser_skip_to_pragma_eol (parser);
   return;
 }
 
-  c_parser_consume_token (parser);
-
   if (enter)
 clauses = c_parser_oacc_all_clauses (parser, OACC_ENTER_DATA_CLAUSE_MASK,
 	 "#pragma acc enter data");
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 843cbe2..08f5f9e 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -36175,23 +36175,18 @@ static tree
 cp_parser_oacc_enter_exit_data (cp_parser *parser, cp_token *pragma_tok,
 bool enter)
 {
+  location_t loc = pragma_tok->location;
   tree stmt, clauses;
+  const char *p = "";
 
-  if (cp_lexer_next_token_is (parser->lexer, CPP_PRAGMA_EOL)
- || cp_lexer_next_token_is_not (parser->lexer, CPP_NAME))
-{
-  cp_parser_error (parser, enter
-		   ? "expected % in %<#pragma acc enter data%>"
-		   : "expected % in %<#pragma acc exit data%>");
-  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
-  return NULL_TREE;
-}
+  if (cp_lexer_next_token_is (parser->lexer, CPP_NAME))
+p = IDENTIFIER_POINTER (cp_lexer_peek_token (parser->lexer)->u.value);
 
-  const char *p =
-IDENTIFIER_POINTER (cp_lexer_peek_token (parser->lexer)->u.value);
   if (strcmp (p, "data") != 0)
 {
-  cp_parser_error (parser, "invalid pragma");
+  error_at (loc,

Re: [Patch 0/2 PR78561] Recalculate constant pool size before emitting it

2016-12-02 Thread James Greenhalgh

On Thu, Dec 01, 2016 at 12:55:21PM -0500, David Edelsohn wrote:
> > James Greenhalgh writes:
> 
> > The patch set has been bootstrapped and tested on aarch64-none-linux-gnu and
> > x86-64-none-linux-gnu without any issues. I've also cross-tested it for
> > aarch64-none-elf and build-tested it for rs6000 (though I couldn't run the
> > testsuite as I don't have a test environment).
> 
> There are PPC64 Linux and AIX systems in the GNU Compile Farm.  All
> have DejaGNU installed.

I bootstrapped and tested a compiler with these patches applied on
gcc112, there were no issues.

I've applied the patches based on Jeff's OK as revisions 243182, 243183.

Thanks,
James

Re: [PATCH 8a/9] Introduce class function_reader (v6)

2016-12-02 Thread Bernd Schmidt


On 12/02/2016 03:00 AM, David Malcolm wrote:

Changed in v6:
- split out dump-reading selftests into followup patches
  (target-independent, and target-specific)
- removes unneeded headers from read-rtl-function.c
- numerous other cleanups identified in review


Ok, starting to look very close to done.

It appears I accidentally made you change element->directive. It seems 
like a good change but all I wanted to point out was a plural where I 
would have expected a singular.




+  /* Lookup param by name.  */
+  tree t_param = parse_mem_expr (name);
+  // TODO: what if not found?


Error out?


+/* Implementation of rtx_reader::handle_unknown_directive.
+
+   Require a top-level "function" directive, as emitted by
+   print_rtx_function, and parse it.  */


Should document START_LOC and NAME.


+  /* Do we need this to force cgraphunit.c to output the function? */


Well, what happens if you don't do this? Seems reasonable anyway, so 
maybe lose the comment.



+/* Parse rtx instructions by calling read_rtx_code, calling
+   set_first_insn and set_last_insn as appropriate, and
+   adding the insn to the insn chain.
+   Consume the trailing ')'.  */
+
+rtx_insn *
+function_reader::parse_insn (file_location start_loc, const char *name)


Once again, please document args - make another pass over all the 
functions to make sure.



+
+/* Parse operand IDX of X, of code 'i' or 'n'.


"... as specified by FORMAT_CHAR", presumably.


+  if (0 == strncmp (desc, "orig:", 5))
+   {
+ expect_original_regno = true;
+ desc_start += 5;
+ /* Skip to any whitespace following the integer.  */
+ const char *space = strchr (desc_start, ' ');
+ if (space)
+   desc_start = space + 1;
+   }
+  /* Any remaining text may be the REG_EXPR.  Alternatively we have
+no REG_ATTRS, and instead we have ORIGINAL_REGNO.  */
+  if (ISDIGIT (*desc_start))
+   {
+ /* Assume we have ORIGINAL_REGNO.  */
+ ORIGINAL_REGNO (x) = atoi (desc_start);
+   }


This looks like an issue. You'll want to have ORIGINAL_REGNOs printed 
and parsed like other regnos, i.e. with %number for pseudos. Can be done 
as a followup if it's involved.



+  /* Verify lookup of hard registers.  */
+#ifdef GCC_AARCH64_H
+  ASSERT_EQ (0, lookup_reg_by_dump_name ("x0"));
+  ASSERT_EQ (1, lookup_reg_by_dump_name ("x1"));
+#endif
+#ifdef I386_OPTS_H
+  ASSERT_EQ (0, lookup_reg_by_dump_name ("ax"));
+  ASSERT_EQ (1, lookup_reg_by_dump_name ("dx"));
+#endif


Please just remove this for now; not important enough to break the rule 
about target-specific bits in generic code.



@@ -1103,13 +1244,39 @@ rtx_reader::read_rtx_code (const char *code_name)
   rtx value;   /* Value of this node.  */
 };

+  one_time_initialization ();


I still don't understand why this isn't in the rtx_reader constructor? 
Seems pointless to call this each time we want to parse an rtx.



+
+  /* Use the format_ptr to parse the various operands of this rtx.
+ read_rtx_operand is a vfunc, allowing the parser to vary between
+ parsing .md files and parsing .rtl dumps; the function_reader subclass
+ overrides the defaults when loading RTL dumps, to handle the
+ various extra attributes emitted by print_rtx.  */
   for (int idx = 0; format_ptr[idx] != 0; idx++)
-read_rtx_operand (return_rtx, idx);
+return_rtx = read_rtx_operand (return_rtx, idx);
+
+  /* Call a vfunc to handle the various additional information that
+ print-rtl.c can write after the regular fields; does nothing when
+ parsing .md files.  */
+  handle_any_trailing_information (return_rtx);


I still think just drop the bulk of these comments. Virtual functions 
aren't that arcane.



+   /* "stringbuf" was allocated within string_obstack and thus has
+  the its lifetime restricted to that of the rtx_reader.  This is
+  OK for the generator programs, but for non-generator programs,
+  XSTR and XTMPL fields are meant to be allocated in the GC-managed
+  heap.  Hence we need to allocate a copy in the GC-managed heap
+  for the non-generator case.  */
+   const char *string_ptr = finalize_string (stringbuf);


Put the comment near the implementation that makes the copy (and maybe 
one explaining why the no-op in the other implementation), not here.



Bernd

Re: [PATCH] OpenACC executable directives

2016-12-02 Thread Jakub Jelinek

On Fri, Dec 02, 2016 at 06:38:25AM -0800, Cesar Philippidis wrote:
> 2016-12-02  Cesar Philippidis  
>   James Norris  
> 
>   gcc/c/
>   * c-parser.c (c_parser_pragma): Error when PRAGMA_OACC_{ENTER_DATA,
>   EXIT_DATA,WAIT} are not used in compound statements.
>   (c_parser_oacc_enter_exit_data): Update diagnostics.
> 
>   gcc/cp/
>   * parser.c (cp_parser_oacc_enter_exit_data): Update diagnostics.
>   (cp_parser_pragma): Error when PRAGMA_OACC_{ENTER_DATA,
>   EXIT_DATA,WAIT} are not used in compound statements.
> 
>   gcc/testsuite/
>   * c-c++-common/goacc/data-2.c: Adjust test.
>   * c-c++-common/goacc/executeables-1.c: New test.
>   * g++.dg/goacc/data-1.C: Adjust test.

Ok, thanks.

Jakub

Re: [PATCH] Unset used bit in simplify_replace_* on newly copied rtxs (PR target/78614)

2016-12-02 Thread Bernd Schmidt


On 12/02/2016 03:12 PM, Jakub Jelinek wrote:

--- gcc/rtl.c.jj2016-10-31 13:28:12.0 +0100
+++ gcc/rtl.c   2016-12-02 11:01:12.557553040 +0100
@@ -318,10 +318,6 @@ copy_rtx (rtx orig)
  us to explicitly document why we are *not* copying a flag.  */
   copy = shallow_copy_rtx (orig);

-  /* We do not copy the USED flag, which is used as a mark bit during
- walks over the RTL.  */
-  RTX_FLAG (copy, used) = 0;
-
   format_ptr = GET_RTX_FORMAT (GET_CODE (copy));

   for (i = 0; i < GET_RTX_LENGTH (GET_CODE (copy)); i++)
@@ -367,7 +363,29 @@ shallow_copy_rtx_stat (const_rtx orig ME
 {
   const unsigned int size = rtx_size (orig);
   rtx const copy = ggc_alloc_rtx_def_stat (size PASS_MEM_STAT);
-  return (rtx) memcpy (copy, orig, size);
+  memcpy (copy, orig, size);
+  switch (GET_CODE (orig))
+{
+  /* RTX codes copy_rtx_if_shared_1 considers are shareable,
+the used flag is often used for other purposes.  */
+case REG:
+case DEBUG_EXPR:
+case VALUE:
+CASE_CONST_ANY:
+case SYMBOL_REF:
+case CODE_LABEL:
+case PC:
+case CC0:
+case RETURN:
+case SIMPLE_RETURN:
+case SCRATCH:
+  break;
+default:
+  /* For all other RTXes clear the used flag on the copy.  */
+  RTX_FLAG (copy, used) = 0;
+  break;
+}
+  return copy;
 }


I like this a lot better. Of course now that it's spelled out it seems 
like several of these (PC, CC0, RETURN, maybe SCRATCH) should never be 
passed to shallow_copy_rtx and maybe a checking_assert to that effect 
might be in order. This part is OK.



 /* Nonzero when we are generating CONCATs.  */
--- gcc/simplify-rtx.c.jj   2016-12-02 00:15:09.200779256 +0100
+++ gcc/simplify-rtx.c  2016-12-02 10:55:24.283989673 +0100
@@ -547,13 +547,19 @@ simplify_replace_fn_rtx (rtx x, const_rt
  old_rtx, fn, data);
if (op != RTVEC_ELT (vec, j))
  {
-   if (newvec == vec)
+   if (x == newx)
  {
-   newvec = shallow_copy_rtvec (vec);
-   if (x == newx)
- newx = shallow_copy_rtx (x);
-   XVEC (newx, i) = newvec;
+   newx = shallow_copy_rtx (x);
+   /* If we copy X, we need to copy also all
+  vectors in it, rather than copy only
+  a subset of them and share the rest.  */
+   for (int k = 0; fmt[k]; k++)
+ if (fmt[k] == 'E')
+   XVEC (newx, k) = shallow_copy_rtvec (XVEC (x, k));
+   newvec = XVEC (newx, i);
  }
+   else
+ gcc_checking_assert (vec != newvec);
RTVEC_ELT (newvec, j) = op;
  }
  }
@@ -566,7 +572,15 @@ simplify_replace_fn_rtx (rtx x, const_rt
if (op != XEXP (x, i))
  {
if (x == newx)
- newx = shallow_copy_rtx (x);
+ {
+   newx = shallow_copy_rtx (x);
+   /* If we copy X, we need to copy also all
+  vectors in it, rather than copy only
+  a subset of them and share the rest.  */
+   for (int k = 0; fmt[k]; k++)
+ if (fmt[k] == 'E')
+   XVEC (newx, k) = shallow_copy_rtvec (XVEC (x, k));
+ }
XEXP (newx, i) = op;
  }
  }


After looking at it more, I feel that here as well it seems strange for 
simplify_replace_fn_rtx to have knowledge about these issues. Doesn't 
this belong in shallow_copy_rtx as well?



--- gcc/emit-rtl.c.jj   2016-12-02 09:43:19.0 +0100
+++ gcc/emit-rtl.c  2016-12-02 10:56:37.001061044 +0100
@@ -5552,10 +5552,6 @@ copy_insn_1 (rtx orig)
  us to explicitly document why we are *not* copying a flag.  */
   copy = shallow_copy_rtx (orig);

-  /* We do not copy the USED flag, which is used as a mark bit during
- walks over the RTL.  */
-  RTX_FLAG (copy, used) = 0;
-
   /* We do not copy JUMP, CALL, or FRAME_RELATED for INSNs.  */
   if (INSN_P (orig))
 {
--- gcc/valtrack.c.jj   2016-10-31 13:28:06.0 +0100
+++ gcc/valtrack.c  2016-12-02 11:01:44.690144705 +0100
@@ -119,10 +119,6 @@ cleanup_auto_inc_dec (rtx src, machine_m
  us to explicitly document why we are *not* copying a flag.  */
   x = shallow_copy_rtx (x);

-  /* We do not copy the USED flag, which is used as a mark bit during
- walks over the RTL.  */
-  RTX_FLAG (x, used) = 0;
-
   /* We do not copy FRAME_RELATED for INSNs.  */
   if (INSN_P (x))
 RTX_FLAG (x, frame_related) = 0;
--- gcc/config/rs6000/rs6000.c.jj   2016-12-01 08:56:27.137105707 +0100
+++ gcc/config/rs6000/rs6000.c  2016-12-02 11:02:14.796762115 +0100
@@ -27186,7 +27186,6 @@ rs6000_frame_related (rtx_insn *insn, rt
 {
   pat = shallow_copy_rtx (pat);

Re: [PATCH 8b/9] Add target-independent selftests of RTL function reader

2016-12-02 Thread Bernd Schmidt

Conceptually, the concept of "target-independent RTL" seems weird to me. 
But I guess you expect these to work because the backend never gets 
involved trying to recognize insns or addressing modes? I think a 
comment to that effect somewhere near the C code to load these tests 
would be good.


I think this potentially suggests a missing feature: the dump should 
indicate whether the reader should leave it alone or ask the target to 
verify it by recognizing the insns.


Also,


+  (edge-from entry (flags "FALLTHRU"))
+  (cinsn 1 (set (reg:DI %2)
+(lshiftrt:DI (reg:DI %0)
+(const_int 32)))
+"../../src/gcc/testsuite/gcc.dg/asr_div1.c":14


Weren't we going to make file/line information optional?

Otherwise this is probably going to be OK since I got past my "this is 
all wrong" initial reaction. You'll want to Cc target maintainers for 
the other two patches.



Bernd

[Patch, Fortran, OOP] PR 42188: F03:C612. The leftmost part-name shall be the name of a data object

2016-12-02 Thread Janus Weil

Hi all,

another simple fix for a rather old PR. This one adds a new check, in
order to provide better error messages than just "Unclassifiable
statement".

Regtests cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus



2016-12-02  Janus Weil  

PR fortran/42188
* primary.c (gfc_match_rvalue): Add a new check that gives better error
messages.

2016-12-02  Janus Weil  

PR fortran/42188
* gfortran.dg/derived_result_2.f90.f90: New test case.
Index: gcc/fortran/primary.c
===
--- gcc/fortran/primary.c   (revision 243176)
+++ gcc/fortran/primary.c   (working copy)
@@ -3298,6 +3298,15 @@ gfc_match_rvalue (gfc_expr **result)
   if (sym->result == NULL)
sym->result = sym;
 
+  gfc_gobble_whitespace ();
+  /* F08:C612.  */
+  if (gfc_peek_ascii_char() == '%')
+   {
+ gfc_error ("The leftmost part-ref in a data-ref can not be a "
+"function reference at %C");
+ m = MATCH_ERROR;
+   }
+
   m = MATCH_YES;
   break;
 
! { dg-do compile }
!
! PR 42188: [OOP] F03:C612. The leftmost part-name shall be the name of a data object
!
! Contributed by Janus Weil 

module grid_module
 implicit none
 type grid
 contains
   procedure :: new_grid
   procedure :: new_int
 end type
contains
 subroutine new_grid(this)
   class(grid) :: this
 end subroutine
 integer function new_int(this)
   class(grid) :: this
   new_int = 42
 end function
end module

module field_module
 use grid_module
 implicit none

 type field
   type(grid) :: mesh
 end type

contains

 type(field) function new_field()
 end function

 subroutine test
   integer :: i
   type(grid) :: g
   g = new_field()%mesh  ! { dg-error "can not be a function reference" }
   call new_field()%mesh%new_grid()  ! { dg-error "Syntax error" }
   i = new_field() % mesh%new_int()  ! { dg-error "can not be a function reference" }
 end subroutine

end module

Re: [PATCH 8/9] Introduce class function_reader (v4)

2016-12-02 Thread Bernd Schmidt


On 12/01/2016 10:43 PM, David Malcolm wrote:

+  rtx_insn *jump_insn = get_insn_by_uid (1);
+  ASSERT_EQ (JUMP_INSN, GET_CODE (jump_insn));
+  ASSERT_EQ (ret_rtx, JUMP_LABEL (jump_insn));
+  // FIXME: ^^^ use ASSERT_RTX_PTR_EQ here ^^^


Why is this a fixme and not just done that way (several instances)?


ASSERT_RTX_PTR_EQ doesn't exist yet; there was some discussion about it
in earlier versions of the patch, but I haven't written it.  It would
be equivalent to ASSERT_EQ, checking pointer equality, but would dump
the mismatching expected vs actual rtx on failure.

Should I convert this to a TODO, or go ahead and implement
ASSERT_RTX_PTR_EQ?


Missed this question. Just add ASSERT_RTX_PTR_EQ, that shouldn't be 
hard, should it?



Bernd

Re: [PATCH 1/9] print_rtx: implement support for reuse IDs (v2)

2016-12-02 Thread Bernd Schmidt


On 12/02/2016 02:37 AM, David Malcolm wrote:

Here's the current status of the kit:

"[PATCH 1/9] print_rtx: implement support for reuse IDs (v2)"
  https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01188.html
Still needs review


Whoops, I thought this was settled. Ok for the new version.


Bernd

[PATCH, Fortran, pr78505, v1] [F08] Coarray source allocation not synchronizing on oversubscribed cores

2016-12-02 Thread Andre Vehreschild

Hi all,

attached patch adds a call to sync_all after an ALLOCATE allocating a coarray.
This is to adhere to standard wanting:

..., execution of the segment (11.6.2) following the statement is delayed until
all other active images in the current team have executed the same statement the
same number of times in this team.

This, esp. the "statement", means that assigning the SOURCE= is to come before
the sync all, which means we have to do it explicitly after that assignment. Or
the other way around the sync all can be done in library caf_register().

Bootstraps and regtests ok on x86_64-linux/F23. Ok for trunk?

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/fortran/ChangeLog:

2016-12-02  Andre Vehreschild  

PR fortran/78505
* trans-stmt.c (gfc_trans_allocate): Add sync all after the execution
of the whole allocate-statement to adhere to the standard.

gcc/testsuite/ChangeLog:

2016-12-02  Andre Vehreschild  

PR fortran/78505
* gfortran.dg/coarray_alloc_with_implicit_sync_1.f90: New test.


diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 514db28..2219bcc 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -5506,7 +5506,7 @@ gfc_trans_allocate (gfc_code * code)
   stmtblock_t block;
   stmtblock_t post;
   tree nelems;
-  bool upoly_expr, tmp_expr3_len_flag = false, al_len_needs_set;
+  bool upoly_expr, tmp_expr3_len_flag = false, al_len_needs_set, is_coarray ;
   gfc_symtree *newsym = NULL;
 
   if (!code->ext.alloc.list)
@@ -5516,6 +5516,7 @@ gfc_trans_allocate (gfc_code * code)
   expr3 = expr3_vptr = expr3_len = expr3_esize = NULL_TREE;
   label_errmsg = label_finish = errmsg = errlen = NULL_TREE;
   e3_is = E3_UNSET;
+  is_coarray = false;
 
   gfc_init_block (&block);
   gfc_init_block (&post);
@@ -,8 +5556,9 @@ gfc_trans_allocate (gfc_code * code)
  expression.  */
   if (code->expr3)
 {
-  bool vtab_needed = false, temp_var_needed = false,
-	  is_coarray = gfc_is_coarray (code->expr3);
+  bool vtab_needed = false, temp_var_needed = false;
+
+  is_coarray = gfc_is_coarray (code->expr3);
 
   /* Figure whether we need the vtab from expr3.  */
   for (al = code->ext.alloc.list; !vtab_needed && al != NULL;
@@ -6093,6 +6095,9 @@ gfc_trans_allocate (gfc_code * code)
 	  tree caf_decl, token;
 	  gfc_se caf_se;
 
+	  /* Set flag, to add synchronize after the allocate.  */
+	  is_coarray = true;
+
 	  gfc_init_se (&caf_se, NULL);
 
 	  caf_decl = gfc_get_tree_for_caf_expr (expr);
@@ -6114,6 +6119,11 @@ gfc_trans_allocate (gfc_code * code)
 	}
   else
 	{
+	  /* Allocating coarrays needs a sync after the allocate executed.
+	 Set the flag to add the sync after all objects are allocated.  */
+	  is_coarray = is_coarray || (gfc_caf_attr (expr).codimension
+  && flag_coarray == GFC_FCOARRAY_LIB);
+
 	  if (expr->ts.type == BT_CHARACTER && al_len != NULL_TREE
 	  && expr3_len != NULL_TREE)
 	{
@@ -6357,6 +6367,15 @@ gfc_trans_allocate (gfc_code * code)
   gfc_add_modify (&block, se.expr, tmp);
 }
 
+  if (is_coarray && flag_coarray == GFC_FCOARRAY_LIB)
+{
+  /* Add a sync all after the allocation has been executed.  */
+  tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_sync_all,
+ 3, null_pointer_node, null_pointer_node,
+ integer_zero_node);
+  gfc_add_expr_to_block (&post, tmp);
+}
+
   gfc_add_block_to_block (&block, &se.post);
   gfc_add_block_to_block (&block, &post);
 
diff --git a/gcc/testsuite/gfortran.dg/coarray_alloc_with_implicit_sync_1.f90 b/gcc/testsuite/gfortran.dg/coarray_alloc_with_implicit_sync_1.f90
new file mode 100644
index 000..1dbbcb7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_alloc_with_implicit_sync_1.f90
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! { dg-options "-fdump-tree-original -fcoarray=lib" }
+! Check that allocating a coarray adds an implicit sync all.
+ 
+ implicit none
+ integer, allocatable :: f(:)[:]
+ allocate( f(20)[*], source = 1 )
+end
+
+! { dg-final { scan-tree-dump-times "_gfortran_caf_sync_all \\(" 1 "original" } }

Re: [PATCH] PR fortran/78618 -- RANK() should not ICE

2016-12-02 Thread Janus Weil

Hi Steve,

2016-12-02 2:33 GMT+01:00 Steve Kargl :
> The attached patch fixes an ICE, a nearby whitespace issue, and
> removed the ATTRIBUTE_UNUSED tag.  THe change has passed regression
> testing on x86_64-*-freebsd.  Ok to commit?

huh, I don't really understand why the argument of RANK is detected to
be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be
an EXPR_CONSTANT?

Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed
is the symbol "c" (as expected), but that clearly is not a function,
so it seems to me that the actual bug here is that a->expr_type is set
incorrectly ...?

Cheers,
Janus




> 2016-12-01  Steven G. Kargl  
>
> PR fortran/78618
> * check.c (gfc_check_rank): Remove ATTRIBUTE_UNUSED.  Fix whitespace.
> Fix ICE where a NULL pointer is dereferenced.
> * simplify.c (gfc_convert_char_constant):  Do not buffer error.
>
> 2016-12-01  Steven G. Kargl  
>
> PR fortran/78618
> * gfortran.dg/char_conversion.f90: New test.
>
> --
> Steve

Re: [PR 70965] Schedule extra pass_rebuild_cgraph_edges

2016-12-02 Thread Martin Jambor

Hi,

On Fri, Nov 25, 2016 at 02:10:51PM +0100, Jan Hubicka wrote:
> > On Thu, Nov 24, 2016 at 5:44 PM, Martin Jambor  wrote:
> > >
> > > ...
> > >
> > > 2016-11-24  Martin Jambor  
> > >
> > > gcc/
> > > * passes.def (pass_build_ssa_passes): Add 
> > > pass_rebuild_cgraph_edges.
> > >
> > > gcc/testsuite/
> > > * g++.dg/pr70965.C: New test.
> > > ---
> > >  gcc/passes.def |  1 +
> > >  gcc/testsuite/g++.dg/pr70965.C | 21 +
> > >  2 files changed, 22 insertions(+)
> > >  create mode 100644 gcc/testsuite/g++.dg/pr70965.C
> > >
> > > diff --git a/gcc/passes.def b/gcc/passes.def
> > > index 85a5af0..5faf17f 100644
> > > --- a/gcc/passes.def
> > > +++ b/gcc/passes.def
> > > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
> > >NEXT_PASS (pass_build_ssa_passes);
> > >PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
> > >NEXT_PASS (pass_fixup_cfg);
> > > +  NEXT_PASS (pass_rebuild_cgraph_edges);
> > >NEXT_PASS (pass_build_ssa);
> > >NEXT_PASS (pass_warn_nonnull_compare);
> > >NEXT_PASS (pass_ubsan);
> 
> Actually you want to rebuild at the end of pass_build_ssa_passes passes queue.
> This may still trip an ICE if one of passes bellow modify CFG (which 
> pass_nothorw
> does)
> 
> Path is OK with that change.

Well, I have already committed the patch as it was.  But given the
above, I will commit the following to trunk after bootstrapping and
testing.

Thanks,

Martin


2016-12-02  Martin Jambor  

* passes.def: Move pass_rebuild_cgraph_edges to the end of
pass_build_ssa_passes.
---
 gcc/passes.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index 1117b8b..7b12a41 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -56,12 +56,12 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_build_ssa_passes);
   PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
   NEXT_PASS (pass_fixup_cfg);
-  NEXT_PASS (pass_rebuild_cgraph_edges);
   NEXT_PASS (pass_build_ssa);
   NEXT_PASS (pass_warn_nonnull_compare);
   NEXT_PASS (pass_ubsan);
   NEXT_PASS (pass_early_warn_uninitialized);
   NEXT_PASS (pass_nothrow);
+  NEXT_PASS (pass_rebuild_cgraph_edges);
   POP_INSERT_PASSES ()
 
   NEXT_PASS (pass_chkp_instrumentation_passes);
-- 
2.10.2

Re: [PATCH] correct handling of non-constant width and precision (pr 78521)

2016-12-02 Thread Martin Sebor


On 12/02/2016 01:31 AM, Rainer Orth wrote:

Hi Martin,


PR 78521 notes that the gimple-ssa-sprintf pass doesn't do the right
thing (i.e., the -Wformat-length and -fprintf-return-value options
behave incorrectly) when a conversion specification includes a width
or precision with a non-constant value.  The code treats such cases
as if they were not provided which is incorrect and results in
the wrong bytes counts in warning messages and in the wrong ranges
being generated for such calls (or in the case sprintf(0, 0, ...)
for some such calls being eliminated).

The attached patch corrects the handling of these cases, plus a couple
of other edge cases in the same area: it adjusts the parser to accept
precision in the form of just a period with no asterisk or decimal
digits after it (this sets the precision to zero), and corrects the
handling of zero precision and zero argument in integer directives
to produce no bytes on output.

Finally, the patch also tightens up the constraint on the upper bound
of bounded functions like snprintf to be INT_MAX.  The functions cannot
produce output in excess of INT_MAX + 1 bytes and some implementations
(e.g., Solaris) fail with EINVAL when the bound is INT_MAX or more.
This is the subject of PR 78520.


this patch broke Solaris bootstrap:

/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c: In function 'void 
{anonymous}::get_width_and_precision(const {anonymous}::conversion_spec&, long 
long int*, long long int*)':
/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:777:45: error: call of 
overloaded 'abs(long long int)' is ambiguous
  width = abs (tree_to_shwi (spec.star_width));
 ^
/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:777:45: note: candidates 
are:
In file included from /usr/include/stdlib.h:12:0,
 from /vol/gcc/src/hg/trunk/local/gcc/system.h:258,
 from /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:49:
/usr/include/iso/stdlib_iso.h:205:16: note: long int std::abs(long int)
  inline long   abs(long _l) { return labs(_l); }
^
/usr/include/iso/stdlib_iso.h:160:12: note: int std::abs(int)
 extern int abs(int);
^

The following patch fixed this for me, but I've no idea if it's right.
It bootstrapped successfully on sparc-sun-solaris2.12,
i386-pc-solaris2.12, and x86_64-pc-linux-gnu.


Thanks for the heads up!  I just looked at that code yesterday while
analyzing bug 78608, wondering if it was safe.  Now I know it isn't.
I think it might be best to simply hand code the expression instead
of taking a chance on abs.  Let me take care of it today along with 78608.

Martin

PR78634: ifcvt/i386 cost updates

2016-12-02 Thread Bernd Schmidt

With the i386 backend no longer double-counting the cost of a SET, the 
default implementation default_max_noce_ifcvt_seq_cost now provides too 
high a bound for if conversion, allowing very costly substitutions.


The following patch fixes this by making an i386 variant of the hook, 
but first - James, do you recall how you arrived at the COSTS_N_INSNS(3) 
magic number? What happens on arm if you lower that in the default hook?


The change in ifcvt.c seems to have no bearing on the PR, I just noticed 
we were missing a cost check in one place.


Lightly tested so far, will bootstrap & test on x86_64-linux.


Bernd
	PR rtl-optimization/78634
	* config/i386/i386.c (ix86_max_noce_ifcvt_seq_cost): New function.
	(TARGET_MAX_NOCE_IFCVT_SEQ_COST): Define.
	* ifcvt.c (noce_try_cmove): Add missing cost check.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 242958)
+++ gcc/config/i386/i386.c	(working copy)
@@ -50301,6 +50301,28 @@ ix86_spill_class (reg_class_t rclass, ma
   return NO_REGS;
 }
 
+/* Implement TARGET_MAX_NOCE_IFCVT_SEQ_COST.  Like the default implementation,
+   but returns a lower bound.  */
+
+static unsigned int
+ix86_max_noce_ifcvt_seq_cost (edge e)
+{
+  bool predictable_p = predictable_edge_p (e);
+
+  enum compiler_param param
+= (predictable_p
+   ? PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST
+   : PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST);
+
+  /* If we have a parameter set, use that, otherwise take a guess using
+ BRANCH_COST.  */
+  if (global_options_set.x_param_values[param])
+return PARAM_VALUE (param);
+  else
+return BRANCH_COST (true, predictable_p) * COSTS_N_INSNS (2);
+}
+
+
 /* Implement targetm.vectorize.init_cost.  */
 
 static void *
@@ -51615,6 +51637,8 @@ ix86_run_selftests (void)
 #undef TARGET_EXPAND_DIVMOD_LIBFUNC
 #define TARGET_EXPAND_DIVMOD_LIBFUNC ix86_expand_divmod_libfunc
 
+#undef TARGET_MAX_NOCE_IFCVT_SEQ_COST
+#define TARGET_MAX_NOCE_IFCVT_SEQ_COST ix86_max_noce_ifcvt_seq_cost
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests
Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c	(revision 242958)
+++ gcc/ifcvt.c	(working copy)
@@ -1826,7 +1826,7 @@ noce_try_cmove (struct noce_if_info *if_
 	noce_emit_move_insn (if_info->x, target);
 
 	  seq = end_ifcvt_sequence (if_info);
-	  if (!seq)
+	  if (!seq || !noce_conversion_profitable_p (seq, if_info))
 	return FALSE;
 
 	  emit_insn_before_setloc (seq, if_info->jump,

Re: [PATCH] Unset used bit in simplify_replace_* on newly copied rtxs (PR target/78614)

2016-12-02 Thread Jakub Jelinek

On Fri, Dec 02, 2016 at 03:51:52PM +0100, Bernd Schmidt wrote:
> I like this a lot better. Of course now that it's spelled out it seems like
> several of these (PC, CC0, RETURN, maybe SCRATCH) should never be passed to
> shallow_copy_rtx and maybe a checking_assert to that effect might be in
> order. This part is OK.

Ok, I've committed the parts except simplify-rtx.c.

> After looking at it more, I feel that here as well it seems strange for
> simplify_replace_fn_rtx to have knowledge about these issues. Doesn't this
> belong in shallow_copy_rtx as well?

Started working on that, but it seems e.g. copy_insn_1 doesn't want this
behavior, it has a complex system of sharing some of the subvectors
of ASM_OPERANDS.

Jakub

Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)

2016-12-02 Thread Uros Bizjak

On Fri, Dec 2, 2016 at 3:21 PM, Jakub Jelinek  wrote:
> Hi!
>
> While the https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01087.html
> added *andndi3_doubleword, I don't understand how it can actually work.
> The problem is that this pattern is for combine, and needs combining
> of DImode NOT setter with DImode AND user.  But there is no DImode
> pattern for one_cmpldi2, so it is lowered early into two one_cmplsi2
> patterns and I can't see how combine would be able to figure stuff out from
> that.
>
> This patch:
> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two
>one_cmplsi2 after reload)
> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones)
> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it
>is for -mbmi only, and adds another *andndi3_doubleword pattern that is
>meant to live just from combine till the stv pass, or worse case till
>following split1 pass when it is split back into not followed by and;
>this change makes it possible to use pandn in stv pass, even without
>-mbmi

Please use attached (lightly tested) patch to implement point 3)
above. The patch splits insn after reload, as is the case with all STV
patterns.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-12-02  Jakub Jelinek  
>
> PR target/70322
> * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle
> NOT.
> (dimode_scalar_chain::compute_convert_gain): Likewise.
> (dimode_scalar_chain::convert_insn): Likewise.
> * config/i386/i386.md (*andndi3_doubleword): New 
> define_insn_and_split.
> Rename the old define_insn_and_split to...
> (*andndi3_doubleword_bmi): ... this.
> (one_cmpl2): Use SWIM1248x iterator instead of SWIM.
> (*one_cmpldi2_doubleword): New define_insn_and_split.
>
> * gcc.target/i386/pr70322-1.c: New test.
> * gcc.target/i386/pr70322-2.c: New test.
> * gcc.target/i386/pr70322-3.c: New test.

Apart to andn patterns, the patch is OK with the small nit below.

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2016-12-02 11:17:40.702995111 +0100
> +++ gcc/config/i386/i386.c  2016-12-02 12:01:31.656469089 +0100
> @@ -2826,6 +2826,9 @@ dimode_scalar_to_vector_candidate_p (rtx
> return false;
>break;
>
> +case NOT:
> +  break;
> +
>  case REG:
>return true;
>
> @@ -2848,7 +2851,8 @@ dimode_scalar_to_vector_candidate_p (rtx
>
>if ((GET_MODE (XEXP (src, 0)) != DImode
> && !CONST_INT_P (XEXP (src, 0)))
> -  || (GET_MODE (XEXP (src, 1)) != DImode
> +  || (GET_CODE (src) != NOT
> + && GET_MODE (XEXP (src, 1)) != DImode
>   && !CONST_INT_P (XEXP (src, 1
>  return false;
>
> @@ -3415,6 +3419,8 @@ dimode_scalar_chain::compute_convert_gai
>   if (CONST_INT_P (XEXP (src, 1)))
> gain -= vector_const_cost (XEXP (src, 1));
> }
> +  else if (GET_CODE (src) == NOT)
> +   gain += ix86_cost->add - COSTS_N_INSNS (1);
>else if (GET_CODE (src) == COMPARE)
> {
>   /* Assume comparison cost is the same.  */
> @@ -3770,6 +3776,14 @@ dimode_scalar_chain::convert_insn (rtx_i
>PUT_MODE (src, V2DImode);
>break;
>
> +case NOT:
> +  src = XEXP (src, 0);
> +  convert_op (&src, insn);
> +  subreg = gen_reg_rtx (V2DImode);
> +  emit_insn_before (gen_move_insn (subreg, CONSTM1_RTX (V2DImode)), 
> insn);
> +  src = gen_rtx_XOR (V2DImode, src, subreg);
> +  break;
> +
>  case MEM:
>if (!REG_P (dst))
> convert_op (&src, insn);
> --- gcc/config/i386/i386.md.jj  2016-12-01 23:24:51.663157486 +0100
> +++ gcc/config/i386/i386.md 2016-12-02 12:50:27.616829191 +0100
> @@ -8534,7 +8534,7 @@ (define_split
>operands[2] = gen_lowpart (QImode, operands[2]);
>  })
>
> -(define_insn_and_split "*andndi3_doubleword"
> +(define_insn_and_split "*andndi3_doubleword_bmi"
>[(set (match_operand:DI 0 "register_operand" "=r")
> (and:DI
>   (not:DI (match_operand:DI 1 "register_operand" "r"))
> @@ -8551,6 +8551,27 @@ (define_insn_and_split "*andndi3_doublew
>   (clobber (reg:CC FLAGS_REG))])]
>"split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
>
> +;; This insn is there only so that the stv pass can use pandn even when
> +;; BMI is not available.  It should be matched during combine and split
> +;; again in split1, unless the stv pass converts it.
> +(define_insn_and_split "*andndi3_doubleword"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +   (and:DI
> + (not:DI (match_operand:DI 1 "register_operand" "r"))
> + (match_operand:DI 2 "nonimmediate_operand" "rm")))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "!TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2
> +   && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 3)
> +   (not:DI (match_dup

Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)

2016-12-02 Thread Uros Bizjak

On Fri, Dec 2, 2016 at 5:11 PM, Uros Bizjak  wrote:
> On Fri, Dec 2, 2016 at 3:21 PM, Jakub Jelinek  wrote:
>> Hi!
>>
>> While the https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01087.html
>> added *andndi3_doubleword, I don't understand how it can actually work.
>> The problem is that this pattern is for combine, and needs combining
>> of DImode NOT setter with DImode AND user.  But there is no DImode
>> pattern for one_cmpldi2, so it is lowered early into two one_cmplsi2
>> patterns and I can't see how combine would be able to figure stuff out from
>> that.
>>
>> This patch:
>> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two
>>one_cmplsi2 after reload)
>> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones)
>> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it
>>is for -mbmi only, and adds another *andndi3_doubleword pattern that is
>>meant to live just from combine till the stv pass, or worse case till
>>following split1 pass when it is split back into not followed by and;
>>this change makes it possible to use pandn in stv pass, even without
>>-mbmi
>
> Please use attached (lightly tested) patch to implement point 3)
> above. The patch splits insn after reload, as is the case with all STV
> patterns.

Now attached for real.

Uros.
Index: i386.md
===
--- i386.md (revision 243185)
+++ i386.md (working copy)
@@ -8534,15 +8534,24 @@
   operands[2] = gen_lowpart (QImode, operands[2]);
 })
 
-(define_insn_and_split "*andndi3_doubleword"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+(define_insn "*andndi3_doubleword"
+  [(set (match_operand:DI 0 "register_operand" "=r,&r")
(and:DI
- (not:DI (match_operand:DI 1 "register_operand" "r"))
- (match_operand:DI 2 "nonimmediate_operand" "rm")))
+ (not:DI (match_operand:DI 1 "register_operand" "r,0"))
+ (match_operand:DI 2 "nonimmediate_operand" "rm,rm")))
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2"
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
   "#"
-  "&& reload_completed"
+  [(set_attr "isa" "bmi,*")])
+
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+   (and:DI
+ (not:DI (match_operand:DI 1 "register_operand"))
+ (match_operand:DI 2 "nonimmediate_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_64BIT && TARGET_BMI && TARGET_STV && TARGET_SSE2
+   && reload_completed"
   [(parallel [(set (match_dup 0)
   (and:SI (not:SI (match_dup 1)) (match_dup 2)))
  (clobber (reg:CC FLAGS_REG))])
@@ -8551,6 +8560,24 @@
  (clobber (reg:CC FLAGS_REG))])]
   "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
 
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+   (and:DI
+ (not:DI (match_dup 0))
+ (match_operand:DI 1 "nonimmediate_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2
+   && reload_completed"
+  [(set (match_dup 0) (not:SI (match_dup 0)))
+   (parallel [(set (match_dup 0)
+  (and:SI (match_dup 0) (match_dup 1)))
+ (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 2) (not:SI (match_dup 2)))
+   (parallel [(set (match_dup 2)
+  (and:SI (match_dup 2) (match_dup 3)))
+ (clobber (reg:CC FLAGS_REG))])]
+  "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);")
+
 (define_insn "*andn_1"
   [(set (match_operand:SWI48 0 "register_operand" "=r,r")
(and:SWI48

Re: [PR 70965] Schedule extra pass_rebuild_cgraph_edges

2016-12-02 Thread Jan Hubicka

> Hi,
> 
> On Fri, Nov 25, 2016 at 02:10:51PM +0100, Jan Hubicka wrote:
> > > On Thu, Nov 24, 2016 at 5:44 PM, Martin Jambor  wrote:
> > > >
> > > > ...
> > > >
> > > > 2016-11-24  Martin Jambor  
> > > >
> > > > gcc/
> > > > * passes.def (pass_build_ssa_passes): Add 
> > > > pass_rebuild_cgraph_edges.
> > > >
> > > > gcc/testsuite/
> > > > * g++.dg/pr70965.C: New test.
> > > > ---
> > > >  gcc/passes.def |  1 +
> > > >  gcc/testsuite/g++.dg/pr70965.C | 21 +
> > > >  2 files changed, 22 insertions(+)
> > > >  create mode 100644 gcc/testsuite/g++.dg/pr70965.C
> > > >
> > > > diff --git a/gcc/passes.def b/gcc/passes.def
> > > > index 85a5af0..5faf17f 100644
> > > > --- a/gcc/passes.def
> > > > +++ b/gcc/passes.def
> > > > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
> > > >NEXT_PASS (pass_build_ssa_passes);
> > > >PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
> > > >NEXT_PASS (pass_fixup_cfg);
> > > > +  NEXT_PASS (pass_rebuild_cgraph_edges);
> > > >NEXT_PASS (pass_build_ssa);
> > > >NEXT_PASS (pass_warn_nonnull_compare);
> > > >NEXT_PASS (pass_ubsan);
> > 
> > Actually you want to rebuild at the end of pass_build_ssa_passes passes 
> > queue.
> > This may still trip an ICE if one of passes bellow modify CFG (which 
> > pass_nothorw
> > does)
> > 
> > Path is OK with that change.
> 
> Well, I have already committed the patch as it was.  But given the
> above, I will commit the following to trunk after bootstrapping and
> testing.
> 
> Thanks,
> 
> Martin
> 
> 
> 2016-12-02  Martin Jambor  
> 
>   * passes.def: Move pass_rebuild_cgraph_edges to the end of
>   pass_build_ssa_passes.

Thanks,
Honza

Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.

2016-12-02 Thread Tamar Christina

Ping?

Is there anything else I need to do for the patch or is it OK for trunk?

Thanks,
Tamar

From: Tamar Christina
Sent: Friday, November 25, 2016 12:18:52 PM
To: Joseph Myers
Cc: GCC Patches; Wilco Dijkstra; rguent...@suse.de; l...@redhat.com; Michael 
Meissner; nd
Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers 
in GIMPLE.

Hi Joseph,

I have updated the patch with the changes,
w.r.t to the formatting, there are tabs there that seem to be rendered
at 4 spaces wide. In my editor setup at 8 spaces they are correct.

Kind Regards,
Tamar

From: Joseph Myers 
Sent: Thursday, November 24, 2016 6:28:18 PM
To: Tamar Christina
Cc: GCC Patches; Wilco Dijkstra; rguent...@suse.de; l...@redhat.com; Michael 
Meissner; nd
Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers 
in GIMPLE.

On Thu, 24 Nov 2016, Tamar Christina wrote:

> @@ -11499,6 +11503,53 @@ to classify.  GCC treats the last argument as 
> type-generic, which
>  means it does not do default promotion from float to double.
>  @end deftypefn
>
> +@deftypefn {Built-in Function} int __builtin_isnan (...)
> +This built-in implements the C99 isnan functionality which checks if
> +the given argument represents a NaN.  The return value of the
> +function will either be a 0 (false) or a 1 (true).
> +On most systems, when an IEEE 754 floating point is used this
> +built-in does not produce a signal when a signaling NaN is used.

"an IEEE 754 floating point" should probably be "an IEEE 754
floating-point type" or similar.

> +GCC treats the argument as type-generic, which means it does
> +not do default promotion from float to double.

I think type names such as float and double should use @code{} in the
manual.

> +the given argument represents an Infinite number.  The return

Infinite should not have a capital letter there.

> +@deftypefn {Built-in Function} int __builtin_iszero (...)
> +This built-in implements the C99 iszero functionality which checks if

This isn't C99, it's TS 18661-1:2014.

> +the given argument represents the number 0.  The return

0 or -0.

> +@deftypefn {Built-in Function} int __builtin_issubnormal (...)
> +This built-in implements the C99 issubnormal functionality which checks if

Again, TS 18661-1.

> +the given argument represents a sub-normal number.  The return

Do not hyphenate subnormal.

> + switch (DECL_FUNCTION_CODE (decl))
> + {
> + case BUILT_IN_SETJMP:
> + lower_builtin_setjmp (gsi);
> + data->cannot_fallthru = false;
> + return;

The indentation in this whole block of code (not all quoted) is wrong.

> +real_inf(&rinf);

Missing space before '('.

> +  emit_tree_cond (&body, dest, done_label,
> +   is_normal(&body, arg, loc), fp_normal);
> +  emit_tree_cond (&body, dest, done_label,
> +   is_zero(&body, arg, loc), fp_zero);
> +  emit_tree_cond (&body, dest, done_label,
> +   is_nan(&body, arg, loc), fp_nan);
> +  emit_tree_cond (&body, dest, done_label,
> +   is_infinity(&body, arg, loc), fp_infinite);

Likewise.

> +   fndecl(&body, arg, loc), t_true);

Likewise.

--
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] PR fortran/78618 -- RANK() should not ICE

2016-12-02 Thread Steve Kargl

On Fri, Dec 02, 2016 at 04:47:19PM +0100, Janus Weil wrote:
> Hi Steve,
> 
> 2016-12-02 2:33 GMT+01:00 Steve Kargl :
> > The attached patch fixes an ICE, a nearby whitespace issue, and
> > removed the ATTRIBUTE_UNUSED tag.  THe change has passed regression
> > testing on x86_64-*-freebsd.  Ok to commit?
> 
> huh, I don't really understand why the argument of RANK is detected to
> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be
> an EXPR_CONSTANT?
> 
> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed
> is the symbol "c" (as expected), but that clearly is not a function,
> so it seems to me that the actual bug here is that a->expr_type is set
> incorrectly ...?
> 

I found that it is the function __convert_s4_s1.   Namely,  we have

character, parameter :: c = char(256,4)
print *, rank(c)

c is kind=1 and char(256,4) is kind 4.  so, we end up with

print *, rank(__convert_s4_s1(...))

As __convert_s4_s1() has neither isym nor esym set, you get a problem.

-- 
Steve

Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)

2016-12-02 Thread Jakub Jelinek

On Fri, Dec 02, 2016 at 05:12:20PM +0100, Uros Bizjak wrote:
> >> This patch:
> >> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two
> >>one_cmplsi2 after reload)
> >> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones)
> >> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it
> >>is for -mbmi only, and adds another *andndi3_doubleword pattern that is
> >>meant to live just from combine till the stv pass, or worse case till
> >>following split1 pass when it is split back into not followed by and;
> >>this change makes it possible to use pandn in stv pass, even without
> >>-mbmi
> >
> > Please use attached (lightly tested) patch to implement point 3)
> > above. The patch splits insn after reload, as is the case with all STV
> > patterns.
> 
> Now attached for real.

Ok, I've checked in following patch (compared to your notes just added
xfail to the pr70322-2.c test scan-assembler), feel free to test your patch
and remove the xfail again.

2016-12-02  Jakub Jelinek  

PR target/70322
* config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle
NOT.
(dimode_scalar_chain::compute_convert_gain): Likewise.
(dimode_scalar_chain::convert_insn): Likewise.
* config/i386/i386.md (*one_cmpldi2_doubleword): New
define_insn_and_split.
(one_cmpl2): Use SWIM1248x iterator instead of SWIM.

* gcc.target/i386/pr70322-1.c: New test.
* gcc.target/i386/pr70322-2.c: New test.
* gcc.target/i386/pr70322-3.c: New test.

--- gcc/config/i386/i386.c.jj   2016-12-02 11:17:40.702995111 +0100
+++ gcc/config/i386/i386.c  2016-12-02 12:01:31.656469089 +0100
@@ -2826,6 +2826,9 @@ dimode_scalar_to_vector_candidate_p (rtx
return false;
   break;
 
+case NOT:
+  break;
+
 case REG:
   return true;
 
@@ -2848,7 +2851,8 @@ dimode_scalar_to_vector_candidate_p (rtx
 
   if ((GET_MODE (XEXP (src, 0)) != DImode
&& !CONST_INT_P (XEXP (src, 0)))
-  || (GET_MODE (XEXP (src, 1)) != DImode
+  || (GET_CODE (src) != NOT
+ && GET_MODE (XEXP (src, 1)) != DImode
  && !CONST_INT_P (XEXP (src, 1
 return false;
 
@@ -3415,6 +3419,8 @@ dimode_scalar_chain::compute_convert_gai
  if (CONST_INT_P (XEXP (src, 1)))
gain -= vector_const_cost (XEXP (src, 1));
}
+  else if (GET_CODE (src) == NOT)
+   gain += ix86_cost->add - COSTS_N_INSNS (1);
   else if (GET_CODE (src) == COMPARE)
{
  /* Assume comparison cost is the same.  */
@@ -3770,6 +3776,14 @@ dimode_scalar_chain::convert_insn (rtx_i
   PUT_MODE (src, V2DImode);
   break;
 
+case NOT:
+  src = XEXP (src, 0);
+  convert_op (&src, insn);
+  subreg = gen_reg_rtx (V2DImode);
+  emit_insn_before (gen_move_insn (subreg, CONSTM1_RTX (V2DImode)), insn);
+  src = gen_rtx_XOR (V2DImode, src, subreg);
+  break;
+
 case MEM:
   if (!REG_P (dst))
convert_op (&src, insn);
--- gcc/config/i386/i386.md.jj  2016-12-01 23:24:51.663157486 +0100
+++ gcc/config/i386/i386.md 2016-12-02 12:50:27.616829191 +0100
@@ -9312,9 +9312,22 @@
 
 ;; One complement instructions
 
+(define_insn_and_split "*one_cmpldi2_doubleword"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
+   (not:DI (match_operand:DI 1 "nonimmediate_operand" "0")))]
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && ix86_unary_operator_ok (NOT, DImode, operands)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+   (not:SI (match_dup 1)))
+   (set (match_dup 2)
+   (not:SI (match_dup 3)))]
+  "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);")
+
 (define_expand "one_cmpl2"
-  [(set (match_operand:SWIM 0 "nonimmediate_operand")
-   (not:SWIM (match_operand:SWIM 1 "nonimmediate_operand")))]
+  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
+   (not:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")))]
   ""
   "ix86_expand_unary_operator (NOT, mode, operands); DONE;")
 
--- gcc/testsuite/gcc.target/i386/pr70322-1.c.jj2016-12-02 
12:52:47.193051745 +0100
+++ gcc/testsuite/gcc.target/i386/pr70322-1.c   2016-12-02 12:52:24.708338078 
+0100
@@ -0,0 +1,12 @@
+/* PR target/70322 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -msse2 -mstv -mbmi" } */
+/* { dg-final { scan-assembler "pandn" } } */
+
+extern long long z;
+
+void
+foo (long long x, long long y)
+{
+  z = ~x & y;
+}
--- gcc/testsuite/gcc.target/i386/pr70322-2.c.jj2016-12-02 
12:52:50.165013898 +0100
+++ gcc/testsuite/gcc.target/i386/pr70322-2.c   2016-12-02 12:52:39.302152232 
+0100
@@ -0,0 +1,12 @@
+/* PR target/70322 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -msse2 -mstv -mno-bmi" } */
+/* { dg-final { scan-assembler "pandn" { xfail *-*-* } } } */
+
+extern long long z;
+
+void
+foo (long long x, long long y)
+{
+  z = ~x & y;
+}
---

Re: [PATCH] gcc: Fix sysroot relative paths for MinGW

2016-12-02 Thread Jeff Law


On 10/08/2016 04:26 AM, Tadek Kijkowski wrote:

Prevent paths relative to sysroot directory from being transformed to
Windows form with MSYS prefix.

Second attempt:
Moved most changes to x-mingw32. Only thing added to Makefile.in are new
variables to ease overriding in included file.
Disabled overriding standard lib and include paths (with /mingw/lib and
/mingw/include) in mingw32.h when TARGET_SYSTEM_ROOT is defined.
Host fragments (x-foo files) are strongly discouraged.  The developer 
docs suggest they should only be used for makefile dependencies. 
Practice is somewhat looser than that.  This stuff is so ugly that I'd 
really prefer it be buried, so I think it deserves an exception to the 
guidelines.  But if you need further stuff in this space, let's try to 
avoid additional host fragments.


There's several other directories that are not over-ridden, presumably 
because they're not relative to sysroot?





Target s-selftest in gcc fails on MinGW:
/home/tadek/gcc/gcc-build-mingw32-sysroot/./gcc/xgcc
-B/home/tadek/gcc/gcc-build-mingw32-sysroot/./gcc/ -xc -S -c /dev/null
-fself-test
cc1.exe: fatal error: input file 'nul.s' is the same as output file
It's enough to specify any output file, like '-o self-test-result.s' to
fix the issue. Specifying '-o /dev/null' works as well, but I'm not sure
if it's safe on all systems. It's a topic for separate patch.
I think you're supposed to use HOST_BIT_BUCKET instead of /dev/null. 
That can be a follow-up patch.







Manual at "https://gcc.gnu.org/install/configure.html"; does mention that
path specified with|--with-native-system-header-dir is located within
sysroot, but it fails to mention that the same applies to
||--with-local-prefix.



 Changelog: 2016-10-08 Tadek Kijkowski> 

* gcc/Makefile.in, gcc/config/i386/x-mingw32:

|Fix sysroot relative paths for MinGW
* gcc/config/i386/mingw32.h: Disable overriding default library and
include paths when sysroot is defined
I'm going to go ahead and installon the trunk.  Thanks for updating the 
documentation on what those little makefile fragments do.  It'll be 
helpful if anyone ever needs to change them in the future.



Thanks,
Jeff

[PATCH, GCC/testsuite/ARM] XFAIL optional_thumb-1 and optional_thumb-2 testcases

2016-12-02 Thread Thomas Preudhomme


Hi,

The logic to make -mthumb optional for Thumb-only targets was designed to only 
apply when no mode is specified either at compile time or when the toolchain was 
configured (using --with-mode). The dg-skip-if in the testcases optional_thumb-1 
and optional_thumb-2 catch the former case but not the latter. Unfortunately 
there is no dejagnu procedure to check the presence of an option in the 
configure line used for the compiler. This patch marks the two tests as XFAIL 
since that will at least allow to catch a change from PASS to XFAIL, thus making 
the test still useful. It also fixes a codestyle issue in the target triplet of 
the dg-skip-if directives of all the optional_thumb-* testcases.


ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2016-12-01  Thomas Preud'homme  

* gcc.target/arm/optional_thumb-3.c: Fix codestyle issue in target
triplet of dg-skip-if.
* gcc.target/arm/optional_thumb-1.c: Likewise and add xfail.
* gcc.target/arm/optional_thumb-2.c: Likewise.

Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
index 23df62887ba4aaa1d8717a34ecda9a40246f0552..27787bb8a403534abf708be02a80bdcf42638958 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
-/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-*} { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
+/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
 /* { dg-options "-march=armv6-m" } */
+/* { dg-xfail-if "fails if GCC configured with --with-mode=arm" { *-*-* } } */
 
 /* Check that -mthumb is not needed when compiling for a Thumb-only target.  */
 
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-2.c b/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
index 4bd53a45eca97e62dd3b86d5a1a66c5ca21e7aad..5a8e5e8cfccf8d9175432ef6b5c07cb0b5f7b0bf 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
-/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-*} { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
+/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
 /* { dg-options "-mcpu=cortex-m4" } */
+/* { dg-xfail-if "fails if GCC configured with --with-mode=arm" { *-*-* } } */
 
 /* Check that -mthumb is not needed when compiling for a Thumb-only target.  */
 
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
index f1fd5c8840b191e600c20a7817c611bb9bb645df..735c2774738c8d84deb14a9f082c34bf622b2d5d 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
-/* { dg-skip-if "-mthumb given" { *-*-*} { "-mthumb" } } */
+/* { dg-skip-if "-mthumb given" { *-*-* } { "-mthumb" } } */
 /* { dg-options "-marm" } */
 /* { dg-error "target CPU does not support ARM mode" "missing error with -marm on Thumb-only targets" { target *-*-*} 0 } */

Re: [build] Handle gas/gld --compress-debug-sections=type

2016-12-02 Thread Matthias Klose

>From my point of view this should be backported to the active branches.
Building GCC 5 and GCC 6 with binutils >=2.26 now results in

$ gcc -c -gz foo.c
gcc: error: -gz is not supported in this configuration

building these GCC version with binutils 2.25 recognizes this option.

On 30.05.2016 13:32, Rainer Orth wrote:
> * When I removed the default in the gcc_cv_ld_compress test, the outcome
>   always was 0, irrespective of the linker tested.  Before, it would
>   almost always have been 1 if testing GNU ld.  It turns out that in
>   this (and numerous other) cases the nesting of tests using ld_date was
>   wrong.  I believe most if not all of those ld_date tests can go, being
>   only relevant for truly prehistoric versions of GNU ld not in use
>   anymore.  I'll probably submit a followup to remove them, simplifying
>   several ld tests.
> 
> Bootstrapped without regressions on i386-pc-solaris2.1[02] with various
> as/ld combinations and checking that the tests yielded the expected
> results:
> 
>   gcc_cv_as_compress_debug/gcc_cv_ld_compress_debug
> as/ld 2.100/0
> as/ld 2.120/3
> gas 2.26/ld 2.10  2/0
> gas 2.26/ld 2.12  2/3
> gas 2.26/gld 2.26 2/3

the GNU case now reads

  if test "$ld_vers_major" -lt 2 \
 || test "$ld_vers_major" -eq 2 -a "$ld_vers_minor" -lt 21; then
gcc_cv_ld_compress_debug=0
  elif test "$ld_vers_major" -eq 2 -a "$ld_vers_minor" -lt 26; then
gcc_cv_ld_compress_debug=1
  else
gcc_cv_ld_compress_debug=3
gcc_cv_ld_compress_debug_option="--compress-debug-sections"
  fi
  if test $ld_is_gold = yes; then
gcc_cv_ld_compress_debug=2
gcc_cv_ld_compress_debug_option="--compress-debug-sections"
  fi

so you end up with different values depending on the linker default. Is this
intended?

LINK_COMPRESS_DEBUG_SPEC in gcc.c is defined in terms of the linker used at
build time, so currently you get the wrong specs when using the non-default
linker when selecting the linker at runtime using -fuse-ld=...

Matthias

Re: [PATCH] PR fortran/78618 -- RANK() should not ICE

2016-12-02 Thread Janus Weil

2016-12-02 17:30 GMT+01:00 Steve Kargl :
> On Fri, Dec 02, 2016 at 04:47:19PM +0100, Janus Weil wrote:
>> Hi Steve,
>>
>> 2016-12-02 2:33 GMT+01:00 Steve Kargl :
>> > The attached patch fixes an ICE, a nearby whitespace issue, and
>> > removed the ATTRIBUTE_UNUSED tag.  THe change has passed regression
>> > testing on x86_64-*-freebsd.  Ok to commit?
>>
>> huh, I don't really understand why the argument of RANK is detected to
>> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be
>> an EXPR_CONSTANT?
>>
>> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed
>> is the symbol "c" (as expected), but that clearly is not a function,
>> so it seems to me that the actual bug here is that a->expr_type is set
>> incorrectly ...?
>
> I found that it is the function __convert_s4_s1.

That's strange. If we see different things here, maybe we are running
into some kind of undefined behavior (possibly related to
gfc_bad_expr?). Anyway, after some more debugging I came to the
conclusion that what actually fails is the error propagation, which
seems to be broken in gfc_check_assign and can be fixed like this:


Index: gcc/fortran/expr.c
===
--- gcc/fortran/expr.c(revision 243194)
+++ gcc/fortran/expr.c(working copy)
@@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval
   if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER)
 {
   if (lvalue->ts.kind != rvalue->ts.kind && allow_convert)
-gfc_convert_chartype (rvalue, &lvalue->ts);
-
-  return true;
+return gfc_convert_chartype (rvalue, &lvalue->ts);
+  else
+return true;
 }

   if (!allow_convert)


This also avoids the ICE and I think is the proper way to fix this ...

Cheers,
Janus

Re: [PATCH][AArch64] Separate shrink wrapping hooks implementation

2016-12-02 Thread James Greenhalgh

On Wed, Nov 30, 2016 at 02:07:58PM +, Kyrill Tkachov wrote:
> 
> On 29/11/16 20:29, Segher Boessenkool wrote:
> >Hi James, Kyrill,
> >
> >On Tue, Nov 29, 2016 at 10:57:33AM +, James Greenhalgh wrote:
> >>>+static sbitmap
> >>>+aarch64_components_for_bb (basic_block bb)
> >>>+{
> >>>+  bitmap in = DF_LIVE_IN (bb);
> >>>+  bitmap gen = &DF_LIVE_BB_INFO (bb)->gen;
> >>>+  bitmap kill = &DF_LIVE_BB_INFO (bb)->kill;
> >>>+
> >>>+  sbitmap components = sbitmap_alloc (V31_REGNUM + 1);
> >>>+  bitmap_clear (components);
> >>>+
> >>>+  /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets.  */
> >>>+  for (unsigned regno = R0_REGNUM; regno <= V31_REGNUM; regno++)
> >>The use of R0_REGNUM and V31_REGNUM scare me a little bit, as we're 
> >>hardcoding
> >>where the end of the register file is (does this, for example, fall apart
> >>with the SVE work that was recently posted). Something like a
> >>LAST_HARDREG_NUM might work?
> >Components and registers aren't the same thing (you can have components
> >for things that aren't just a register save, e.g. the frame setup, stack
> >alignment, save of some non-GPR via a GPR, PIC register setup, etc.)
> >The loop here should really only cover the non-volatile registers, and
> >there should be some translation from register number to component number
> >(it of course is convenient to have a 1-1 translation for the GPRs and
> >floating point registers).  For rs6000 many things in the backend already
> >use non-symbolic numbers for the FPRs and GPRs, so that is easier there.
> 
> Anyway, here's the patch with James's comments implemented.
> I've introduced LAST_SAVED_REGNUM which is used to delimit the registers
> considered for shrink-wrapping.
> 
> aarch64_process_components is introduced and used to implement the
> emit_prologue_components and emit_epilogue_components functions in a single
> place.

OK.

> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Thanks,
> Kyrill
> 
> 2016-11-30  Kyrylo Tkachov  
> 
> * config/aarch64/aarch64.h (machine_function): Add
> reg_is_wrapped_separately field.
> * config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Define new constant.
> * config/aarch64/aarch64.c (emit_set_insn): Change return type to
> rtx_insn *.
> (aarch64_save_callee_saves): Don't save registers that are wrapped
> separately.
> (aarch64_restore_callee_saves): Don't restore registers that are
> wrapped separately.
> (offset_9bit_signed_unscaled_p, offset_12bit_unsigned_scaled_p,
> aarch64_offset_7bit_signed_scaled_p): Move earlier in the file.
> (aarch64_get_separate_components): New function.
> (aarch64_get_next_set_bit): Likewise.
> (aarch64_components_for_bb): Likewise.
> (aarch64_disqualify_components): Likewise.
> (aarch64_emit_prologue_components): Likewise.
> (aarch64_emit_epilogue_components): Likewise.
> (aarch64_set_handled_components): Likewise.
> (aarch64_process_components): Likewise.
> (TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS,
> TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB,
> TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS,
> TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS,
> TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS,
> TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Define.
> >>>+static void
> >>>+aarch64_disqualify_components (sbitmap, edge, sbitmap, bool)
> >>>+{
> >>>+}
> >>Is there no default "do nothing" hook for this?
> >I can make the shrink-wrap code do nothing here if this hook isn't
> >defined, if you want?
> 
> I don't mind either way.
> If you do it I'll then remove the empty implementation in aarch64.

If there is no empty default, this is fine.

Thanks,
James

Re: [PATCH] PR fortran/78618 -- RANK() should not ICE

2016-12-02 Thread Steve Kargl

On Fri, Dec 02, 2016 at 06:00:48PM +0100, Janus Weil wrote:
> 2016-12-02 17:30 GMT+01:00 Steve Kargl :
> > On Fri, Dec 02, 2016 at 04:47:19PM +0100, Janus Weil wrote:
> >> Hi Steve,
> >>
> >> 2016-12-02 2:33 GMT+01:00 Steve Kargl :
> >> > The attached patch fixes an ICE, a nearby whitespace issue, and
> >> > removed the ATTRIBUTE_UNUSED tag.  THe change has passed regression
> >> > testing on x86_64-*-freebsd.  Ok to commit?
> >>
> >> huh, I don't really understand why the argument of RANK is detected to
> >> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be
> >> an EXPR_CONSTANT?
> >>
> >> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed
> >> is the symbol "c" (as expected), but that clearly is not a function,
> >> so it seems to me that the actual bug here is that a->expr_type is set
> >> incorrectly ...?
> >
> > I found that it is the function __convert_s4_s1.
> 
> That's strange. If we see different things here, maybe we are running
> into some kind of undefined behavior (possibly related to
> gfc_bad_expr?). Anyway, after some more debugging I came to the
> conclusion that what actually fails is the error propagation, which
> seems to be broken in gfc_check_assign and can be fixed like this:
> 
> 
> Index: gcc/fortran/expr.c
> ===
> --- gcc/fortran/expr.c(revision 243194)
> +++ gcc/fortran/expr.c(working copy)
> @@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval
>if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER)
>  {
>if (lvalue->ts.kind != rvalue->ts.kind && allow_convert)
> -gfc_convert_chartype (rvalue, &lvalue->ts);
> -
> -  return true;
> +return gfc_convert_chartype (rvalue, &lvalue->ts);
> +  else
> +return true;
>  }
> 
>if (!allow_convert)
> 
> 
> This also avoids the ICE and I think is the proper way to fix this ...
> 

When developing the patch, I fixed/avoided the ICE, first.  Then,
I tried Gerhard's second testcase without the PARAMETER attribute.
An error message is emitted, so I went hunting for why PARAMETER
suppressed the error message.  This led to the second part of the
patch of changing gfc_error to gfc_error_now.  Perhaps, forcing
the gfc_error_now prevents gfortran from inserting the 
__convert_s4_s1 call.

In any event, if your patch causes an error message to be emitted,
then I think that your patch is better than the one I proposed.  
Feel free to commit your patch.

-- 
Steve

Re: [PATCH] avoid calling alloca(0)

2016-12-02 Thread Eric Gallager

On 12/2/16, Bernd Edlinger  wrote:
> On 12/01/16 19:39, Jeff Law wrote:
>> On 11/30/2016 09:09 PM, Martin Sebor wrote:
 What I think this tells us is that we're not at a place where we're
 clean.  But we can incrementally get there.  The warning is only
 catching a fairly small subset of the cases AFAICT.  That's not unusual
 and analyzing why it didn't trigger on those cases might be useful as
 well.
>>>
>>> The warning has no smarts.  It relies on constant propagation and
>>> won't find a call unless it sees it's being made with a constant
>>> zero.  Looking at the top two on the list the calls are in extern
>>> functions not called from the same source file, so it probably just
>>> doesn't see that the functions are being called from another file
>>> with a zero.  Building GCC with LTO might perhaps help.
>> Right.  This is consistent with the limitations of other similar
>> warnings such as null pointer dereferences.
>>
>>>
 So where does this leave us for gcc-7?  I'm wondering if we drop the
 warning in, but not enable it by default anywhere.  We fix the cases we
 can (such as reg-stack,c tree-ssa-threadedge.c, maybe others) before
 stage3 closes, and shoot for the rest in gcc-8, including improvign the
 warning (if there's something we can clearly improve), and enabling the
 warning in -Wall or -Wextra.
>>>
>>> I'm fine with deferring the GCC fixes and working on the cleanup
>>> over time but I don't think that needs to gate enabling the option
>>> with -Wextra.  The warnings can be suppressed or prevented from
>>> causing errors during a GCC build either via a command line option
>>> or by pragma in the code.  AFAICT, from the other warnings I see
>>> go by, this is what has been done for -Wno-implicit-fallthrough
>>> while those warnings are being cleaned up.  Why not take the same
>>> approach here?
>> The difference vs implicit fallthrough is that new instances of implicit
>> fallthrus aren't likely to be exposed by changes in IL that occur due to
>> transformations in the optimizer pipeline.
>>
>> Given the number of runtime triggers vs warnings, we know there's many
>> instances of passing 0 to the allocators that we're not diagnosing. I
>> can easily see differences in the early IL (such as those due to
>> BRANCH_COST differing for targets) exposing/hiding cases where 0 flows
>> into the allocator argument.  Similarly for changes in inlining
>> decisions, jump threading, etc for profiled bootstraps.  I'd like to
>> avoid playing wack-a-mole right now.
>>
>> So I'm being a bit more conservative here.  Maybe it'd be appropriate in
>> Wextra since that's not enabled by default for GCC builds.
>>
>
> actually it is enabled, by -W -Wall ...
>


Don't the gcc docs say that -Wextra is the preferred name for -W? Why
is gcc still using the deprecated name for the flag when building
itself? Seems like it'd help to avoid this confusion if the flags read
-Wextra instead of -W...

Re: Change default level for -Wimplicit-fallthrough

2016-12-02 Thread Jeff Law


On 11/03/2016 05:51 AM, Bernd Schmidt wrote:

I'm concerned about the number of false positives for this warning, and
judging by previous discussions, I'm not alone in this. This patch
limits it to level 1 (any comment before the case label disables the
warning) for cases where the user specified no explicit level. It'll
still generate enough noise that people will be aware of it and can
choose whether to use a higher level or not.

Bootstrapped and tested on x86_64-linux. Ok?
I think we should wait for the distro builds to fire up and see how the 
warning impacts them before making a decision on this patch.


jeff

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-02 Thread Andrew Senkevich

2016-11-11 22:14 GMT+03:00 Uros Bizjak :
> On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
>  wrote:
>> 2016-11-11 20:56 GMT+03:00 Uros Bizjak :
>>> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
 On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
  wrote:
> 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
>> Some quick remarks:
>>
>> +(define_insn "kmovb"
>> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
>> + (unspec:QI
>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
>> +  "@
>> +   kmovb\t{%k1, %0|%0, %k1}
>> +   kmovb\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "QI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>> +
>> +(define_insn "kmovd"
>> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
>> + (unspec:SI
>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>> +  "@
>> +   kmovd\t{%k1, %0|%0, %k1}
>> +   kmovd\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "SI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>> +
>> +(define_insn "kmovq"
>> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
>> + (unspec:DI
>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>> +  "@
>> +   kmovq\t{%k1, %0|%0, %k1}
>> +   kmovq\t{%1, %0|%0, %1}
>> +   kmovq\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "DI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>>
>> - kmovd (and existing kmovw) should be using register_operand for
>> opreand 0. In this case, there is no need for MEM_P checks at all.
>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
>> - please put these definitions above corresponding *mov??_internal 
>> patterns.
>
> Do you mean put below *mov??_internal patterns? Attached corrected such 
> way.

 No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
 etc. It doesn't matter if they are above or below their respective
 *mov??_internal patterns, as long as they are positioned in some
 consistent way. IOW, new patterns shouldn't be grouped together, as is
 the case with your patch.
>>>
>>> +(define_insn "kmovb"
>>> +  [(set (match_operand:QI 0 "register_operand" "=k,k")
>>> +(unspec:QI
>>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "TARGET_AVX512DQ && !MEM_P (operands[1])"
>>>
>>> There is no need for !MEM_P, this will prevent memory operand, which
>>> is allowed by constraint "m".
>>>
>>> +(define_insn "kmovq"
>>> +  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
>>> +(unspec:DI
>>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>> +  UNSPEC_KMOV))]
>>> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>>>
>>> Operand 0 should have "nonimmediate_operand" predicate. And here you
>>> need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
>>> mem->mem moves.
>>
>> Changed according your comments and attached.
>
> Still not good.
>
> +(define_insn "kmovd"
> +  [(set (match_operand:SI 0 "register_operand" "=k,k")
> +(unspec:SI
> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>
> Remove !MEM_P in the above pattern.
>
>  (define_insn "kmovw"
> -  [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
> +  [(set (match_operand:HI 0 "register_operand" "=k,k")
>  (unspec:HI
>[(match_operand:HI 1 "nonimmediate_operand" "r,km")]
>UNSPEC_KMOV))]
> -  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
> +  "TARGET_AVX512F && !MEM_P (operands[1])"
>
> Also remove !MEM_P here.
>
> +(define_insn "kadd"
> +  [(set (match_operand:SWI1248x 0 "register_operand" "=r,&r,!k")
> +(plus:SWI1248x
> +  (not:SWI1248x
> +(match_operand:SWI1248x 1 "register_operand" "r,0,k"))
> +  (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "TARGET_AVX512F"
> +{
> +  switch (which_alternative)
> +{
> +case 0:
> +  return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}";
> +case 1:
> +  return "#";
> +case 2:
> +  if (TARGET_AVX512BW && mode == DImode)
> +return "kaddq\t{%2, %1, %0|%0, %1, %2}";
> +  else if (TARGET_AVX512BW && mode == SImode)
> +return "kaddd\t{%2, %1, %0|%0, %1, %2}";
> +  else if (TARGET_AVX512DQ && mode == QImode)
> +return "kaddb\t{%2, %1, %0|%0, %1, %2}";
> +  else
> +return "kaddw\t{%2, %1, %0|%0, %1, %2}";
> +
>
> The above pattern i

Re: Avoid alloca(0) when temporarily propagating operands during threading

2016-12-02 Thread Jakub Jelinek

On Thu, Dec 01, 2016 at 11:43:19PM -0700, Jeff Law wrote:
> 
> Martin's alloca work flagged this code as problematical.  Essentially if we
> had a statement with no operands and the statement was not in the hash
> table, then we could end up performing alloca (0), which is inadvisable.

I still don't understand why it is inadvisable.  alloca(0) is not undefined
behavior.  It can return NULL, or non-unique pointer, or a unique pointer,
and/or cause freeing of already left alloca blocks (like any other alloca
call).
None of that is a problem here.  If num is 0, then copy is just set and
never used.
I expect most if not all gcc uses of alloca where the count can be 0 are
like that.

Jakub

Re: Avoid alloca(0) when temporarily propagating operands during threading

2016-12-02 Thread Jeff Law


On 12/02/2016 10:58 AM, Jakub Jelinek wrote:

On Thu, Dec 01, 2016 at 11:43:19PM -0700, Jeff Law wrote:


Martin's alloca work flagged this code as problematical.  Essentially if we
had a statement with no operands and the statement was not in the hash
table, then we could end up performing alloca (0), which is inadvisable.


I still don't understand why it is inadvisable.  alloca(0) is not undefined
behavior.  It can return NULL, or non-unique pointer, or a unique pointer,
and/or cause freeing of already left alloca blocks (like any other alloca
call).
None of that is a problem here.  If num is 0, then copy is just set and
never used.
I expect most if not all gcc uses of alloca where the count can be 0 are
like that.
It won't cause any problems in this and probably most instances, but 
leaving the code in its prior state is simply wrong from a maintenance 
standpoint.


I'd much rather have the code explicitly and safely handle the zero 
operands case so that if someone makes a change later they don't have to 
worry about whether or not they're accessing memory which was never 
allocated.


Additionally, it removes a false positive from the warning, thus making 
less noise.


It's not unlike the strictly unnecessary initializations we do to shut 
up -Wuninitialized.


Jeff

Re: [PATCH] PR fortran/78618 -- RANK() should not ICE

2016-12-02 Thread Janus Weil

2016-12-02 18:13 GMT+01:00 Steve Kargl :
>> >> > The attached patch fixes an ICE, a nearby whitespace issue, and
>> >> > removed the ATTRIBUTE_UNUSED tag.  THe change has passed regression
>> >> > testing on x86_64-*-freebsd.  Ok to commit?
>> >>
>> >> huh, I don't really understand why the argument of RANK is detected to
>> >> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be
>> >> an EXPR_CONSTANT?
>> >>
>> >> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed
>> >> is the symbol "c" (as expected), but that clearly is not a function,
>> >> so it seems to me that the actual bug here is that a->expr_type is set
>> >> incorrectly ...?
>> >
>> > I found that it is the function __convert_s4_s1.
>>
>> That's strange. If we see different things here, maybe we are running
>> into some kind of undefined behavior (possibly related to
>> gfc_bad_expr?). Anyway, after some more debugging I came to the
>> conclusion that what actually fails is the error propagation, which
>> seems to be broken in gfc_check_assign and can be fixed like this:
>>
>>
>> Index: gcc/fortran/expr.c
>> ===
>> --- gcc/fortran/expr.c(revision 243194)
>> +++ gcc/fortran/expr.c(working copy)
>> @@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval
>>if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER)
>>  {
>>if (lvalue->ts.kind != rvalue->ts.kind && allow_convert)
>> -gfc_convert_chartype (rvalue, &lvalue->ts);
>> -
>> -  return true;
>> +return gfc_convert_chartype (rvalue, &lvalue->ts);
>> +  else
>> +return true;
>>  }
>>
>>if (!allow_convert)
>>
>>
>> This also avoids the ICE and I think is the proper way to fix this ...
>>
>
> When developing the patch, I fixed/avoided the ICE, first.  Then,
> I tried Gerhard's second testcase without the PARAMETER attribute.
> An error message is emitted, so I went hunting for why PARAMETER
> suppressed the error message.  This led to the second part of the
> patch of changing gfc_error to gfc_error_now.

I think the gfc_error_now should not be necessary with my fix, but
removing the ATTRIBUTE_UNUSED is certainly a good idea.


> In any event, if your patch causes an error message to be emitted,
> then I think that your patch is better than the one I proposed.
> Feel free to commit your patch.

I am now regtesting the attached version and, if successful, will commit.

Cheers,
Janus
Index: gcc/fortran/check.c
===
--- gcc/fortran/check.c (revision 243194)
+++ gcc/fortran/check.c (working copy)
@@ -3667,7 +3667,7 @@ gfc_check_range (gfc_expr *x)
 
 
 bool
-gfc_check_rank (gfc_expr *a ATTRIBUTE_UNUSED)
+gfc_check_rank (gfc_expr *a)
 {
   /* Any data object is allowed; a "data object" is a "constant (4.1.3),
  variable (6), or subobject of a constant (2.4.3.2.3)" (F2008, 1.3.45).  */
Index: gcc/fortran/expr.c
===
--- gcc/fortran/expr.c  (revision 243194)
+++ gcc/fortran/expr.c  (working copy)
@@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval
   if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER)
 {
   if (lvalue->ts.kind != rvalue->ts.kind && allow_convert)
-   gfc_convert_chartype (rvalue, &lvalue->ts);
-
-  return true;
+   return gfc_convert_chartype (rvalue, &lvalue->ts);
+  else
+   return true;
 }
 
   if (!allow_convert)

Re: Avoid alloca(0) when temporarily propagating operands during threading

2016-12-02 Thread Jakub Jelinek

On Fri, Dec 02, 2016 at 11:02:33AM -0700, Jeff Law wrote:
> It won't cause any problems in this and probably most instances, but leaving
> the code in its prior state is simply wrong from a maintenance standpoint.
> 
> I'd much rather have the code explicitly and safely handle the zero operands
> case so that if someone makes a change later they don't have to worry about
> whether or not they're accessing memory which was never allocated.
> 
> Additionally, it removes a false positive from the warning, thus making less
> noise.
> 
> It's not unlike the strictly unnecessary initializations we do to shut up
> -Wuninitialized.

But -Wuninitialized also found tons of real-world bugs.  Do we have a single
real-world example where such a warning would actually be useful (as in,
there would be an actual bug)?  Otherwise we are adding workarounds for a
warning that just forces people to add those workarounds, but doesn't
improve code in the wild in any way.

Jakub

[PATCH 8a/9] Introduce class function_reader (v7)

2016-12-02 Thread David Malcolm

On Fri, 2016-12-02 at 15:41 +0100, Bernd Schmidt wrote:
> On 12/02/2016 03:00 AM, David Malcolm wrote:
> > Changed in v6:
> > - split out dump-reading selftests into followup patches
> >   (target-independent, and target-specific)
> > - removes unneeded headers from read-rtl-function.c
> > - numerous other cleanups identified in review
> 
> Ok, starting to look very close to done.

Thanks for the reviews so far.

> It appears I accidentally made you change element->directive. It
> seems
> like a good change but all I wanted to point out was a plural where I
> would have expected a singular.

(nods)

> > 
> > +  /* Lookup param by name.  */
> > +  tree t_param = parse_mem_expr (name);
> > +  // TODO: what if not found?
> 
> Error out?

Currently, function_reader::parse_mem_expr can't fail; it creates a
VAR_DECL for anything it doesn't recognize.

But here, we should be looking specifically for a PARAM_DECL.

I've moved the find-param-by-name code out from
function_reader::parse_mem_expr into a new function, and call that
instead, and error out if it isn't found from that.

> > +/* Implementation of rtx_reader::handle_unknown_directive.
> > +
> > +   Require a top-level "function" directive, as emitted by
> > +   print_rtx_function, and parse it.  */
> 
> Should document START_LOC and NAME.

Done.

> > +  /* Do we need this to force cgraphunit.c to output the function?
> > */
> 
> Well, what happens if you don't do this? Seems reasonable anyway, so
> maybe lose the comment.

The two flag assignments don't seem to be needed; I think this is due
to adding:

  if (node->native_rtl_p ())
node->force_output = 1;

to cgraph_node::finalize_function in patch 9.

Should I lose them (and the comment)?

> > +/* Parse rtx instructions by calling read_rtx_code, calling
> > +   set_first_insn and set_last_insn as appropriate, and
> > +   adding the insn to the insn chain.
> > +   Consume the trailing ')'.  */
> > +
> > +rtx_insn *
> > +function_reader::parse_insn (file_location start_loc, const char
> > *name)
> 
> Once again, please document args - make another pass over all the
> functions to make sure.

Done.  In doing so, I noticed that the args to read_rtl_function_body
were unnecessarily complicated, so I simplified it to just take a path.
I also made some simplifications to the various fixup_source_location
functions.

> > +
> > +/* Parse operand IDX of X, of code 'i' or 'n'.
> 
> "... as specified by FORMAT_CHAR", presumably.

Done.

> > +  if (0 == strncmp (desc, "orig:", 5))
> > +   {
> > + expect_original_regno = true;
> > + desc_start += 5;
> > + /* Skip to any whitespace following the integer.  */
> > + const char *space = strchr (desc_start, ' ');
> > + if (space)
> > +   desc_start = space + 1;
> > +   }
> > +  /* Any remaining text may be the REG_EXPR.  Alternatively we
> > have
> > +no REG_ATTRS, and instead we have ORIGINAL_REGNO.  */
> > +  if (ISDIGIT (*desc_start))
> > +   {
> > + /* Assume we have ORIGINAL_REGNO.  */
> > + ORIGINAL_REGNO (x) = atoi (desc_start);
> > +   }
> 
> This looks like an issue. You'll want to have ORIGINAL_REGNOs printed
> and parsed like other regnos, i.e. with %number for pseudos. Can be
> done
> as a followup if it's involved.

This is involved; let's do this as a followup.

> > +  /* Verify lookup of hard registers.  */
> > +#ifdef GCC_AARCH64_H
> > +  ASSERT_EQ (0, lookup_reg_by_dump_name ("x0"));
> > +  ASSERT_EQ (1, lookup_reg_by_dump_name ("x1"));
> > +#endif
> > +#ifdef I386_OPTS_H
> > +  ASSERT_EQ (0, lookup_reg_by_dump_name ("ax"));
> > +  ASSERT_EQ (1, lookup_reg_by_dump_name ("dx"));
> > +#endif
> 
> Please just remove this for now; not important enough to break the
> rule
> about target-specific bits in generic code.

Removed for now.

> > @@ -1103,13 +1244,39 @@ rtx_reader::read_rtx_code (const char
> > *code_name)
> >rtx value;   /* Value of this node.  */
> >  };
> > 
> > +  one_time_initialization ();
> 
> I still don't understand why this isn't in the rtx_reader
> constructor?
> Seems pointless to call this each time we want to parse an rtx.

Aha; that makes much more sense.

Done.

> > +
> > +  /* Use the format_ptr to parse the various operands of this rtx.
> > + read_rtx_operand is a vfunc, allowing the parser to vary
> > between
> > + parsing .md files and parsing .rtl dumps; the function_reader
> > subclass
> > + overrides the defaults when loading RTL dumps, to handle the
> > + various extra attributes emitted by print_rtx.  */
> >for (int idx = 0; format_ptr[idx] != 0; idx++)
> > -read_rtx_operand (return_rtx, idx);
> > +return_rtx = read_rtx_operand (return_rtx, idx);
> > +
> > +  /* Call a vfunc to handle the various additional information
> > that
> > + print-rtl.c can write after the regular fields; does nothing
> > when
> > + parsing .md files.  */
> > +  handle_any_trailing_information (return_rtx);
> 
> I still think just drop the bulk o

Re: Avoid alloca(0) when temporarily propagating operands during threading

2016-12-02 Thread Jeff Law


On 12/02/2016 11:11 AM, Jakub Jelinek wrote:

On Fri, Dec 02, 2016 at 11:02:33AM -0700, Jeff Law wrote:

It won't cause any problems in this and probably most instances, but leaving
the code in its prior state is simply wrong from a maintenance standpoint.

I'd much rather have the code explicitly and safely handle the zero operands
case so that if someone makes a change later they don't have to worry about
whether or not they're accessing memory which was never allocated.

Additionally, it removes a false positive from the warning, thus making less
noise.

It's not unlike the strictly unnecessary initializations we do to shut up
-Wuninitialized.


But -Wuninitialized also found tons of real-world bugs.  Do we have a single
real-world example where such a warning would actually be useful (as in,
there would be an actual bug)?  Otherwise we are adding workarounds for a
warning that just forces people to add those workarounds, but doesn't
improve code in the wild in any way.
Have you looked in depth at the lto.c issue it flagged?  I can't prove 
that one is safe.


And more generally, an under-sized allocation is a security risk.  We 
should continue to try and identify those and warn for them.


jeff

Re: Avoid alloca(0) when temporarily propagating operands during threading

2016-12-02 Thread Jakub Jelinek

On Fri, Dec 02, 2016 at 11:13:29AM -0700, Jeff Law wrote:
> >But -Wuninitialized also found tons of real-world bugs.  Do we have a single
> >real-world example where such a warning would actually be useful (as in,
> >there would be an actual bug)?  Otherwise we are adding workarounds for a
> >warning that just forces people to add those workarounds, but doesn't
> >improve code in the wild in any way.
> Have you looked in depth at the lto.c issue it flagged?  I can't prove that
> one is safe.

If you mean the
  tree *map = XALLOCAVEC (tree, 2 * len);
call, then I strongly doubt it can actually ever be called with len == 0.
There is
  tree_scc *scc
= (tree_scc *) alloca (sizeof (tree_scc) + (len - 1) * sizeof (tree));
a few lines above this and len is unsigned int, therefore for len 0 this
will try to (on 64-bit host) alloca (32 + 0x * 4UL), i.e. alloca
(0x4001cUL) and I'm pretty sure on most hosts alloca of 16GB won't
really work.

Even if this succeeded for whatever reason, what problem do you see with
map = alloca (0)?

> And more generally, an under-sized allocation is a security risk.  We should

Sure.  But I really don't see 0 as being special in any way.  We do
support zero sized arrays and IMNSHO we want to support 0 sized alloca, it
is very much the same thing.  We also do support VLAs with 0 size.

Jakub

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-02 Thread Uros Bizjak

On Fri, Dec 2, 2016 at 6:44 PM, Andrew Senkevich
 wrote:
> 2016-11-11 22:14 GMT+03:00 Uros Bizjak :
>> On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
>>  wrote:
>>> 2016-11-11 20:56 GMT+03:00 Uros Bizjak :
 On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>  wrote:
>> 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
>>> Some quick remarks:
>>>
>>> +(define_insn "kmovb"
>>> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
>>> + (unspec:QI
>>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
>>> +  "@
>>> +   kmovb\t{%k1, %0|%0, %k1}
>>> +   kmovb\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "QI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>> +
>>> +(define_insn "kmovd"
>>> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
>>> + (unspec:SI
>>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>> +  "@
>>> +   kmovd\t{%k1, %0|%0, %k1}
>>> +   kmovd\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "SI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>> +
>>> +(define_insn "kmovq"
>>> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
>>> + (unspec:DI
>>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>> +  "@
>>> +   kmovq\t{%k1, %0|%0, %k1}
>>> +   kmovq\t{%1, %0|%0, %1}
>>> +   kmovq\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "DI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>>
>>> - kmovd (and existing kmovw) should be using register_operand for
>>> opreand 0. In this case, there is no need for MEM_P checks at all.
>>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
>>> - please put these definitions above corresponding *mov??_internal 
>>> patterns.
>>
>> Do you mean put below *mov??_internal patterns? Attached corrected such 
>> way.
>
> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
> etc. It doesn't matter if they are above or below their respective
> *mov??_internal patterns, as long as they are positioned in some
> consistent way. IOW, new patterns shouldn't be grouped together, as is
> the case with your patch.

 +(define_insn "kmovb"
 +  [(set (match_operand:QI 0 "register_operand" "=k,k")
 +(unspec:QI
 +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
 +  UNSPEC_KMOV))]
 +  "TARGET_AVX512DQ && !MEM_P (operands[1])"

 There is no need for !MEM_P, this will prevent memory operand, which
 is allowed by constraint "m".

 +(define_insn "kmovq"
 +  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
 +(unspec:DI
 +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
 +  UNSPEC_KMOV))]
 +  "TARGET_AVX512BW && !MEM_P (operands[1])"

 Operand 0 should have "nonimmediate_operand" predicate. And here you
 need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
 mem->mem moves.
>>>
>>> Changed according your comments and attached.
>>
>> Still not good.
>>
>> +(define_insn "kmovd"
>> +  [(set (match_operand:SI 0 "register_operand" "=k,k")
>> +(unspec:SI
>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>>
>> Remove !MEM_P in the above pattern.
>>
>>  (define_insn "kmovw"
>> -  [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
>> +  [(set (match_operand:HI 0 "register_operand" "=k,k")
>>  (unspec:HI
>>[(match_operand:HI 1 "nonimmediate_operand" "r,km")]
>>UNSPEC_KMOV))]
>> -  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
>> +  "TARGET_AVX512F && !MEM_P (operands[1])"
>>
>> Also remove !MEM_P here.
>>
>> +(define_insn "kadd"
>> +  [(set (match_operand:SWI1248x 0 "register_operand" "=r,&r,!k")
>> +(plus:SWI1248x
>> +  (not:SWI1248x
>> +(match_operand:SWI1248x 1 "register_operand" "r,0,k"))
>> +  (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
>> +   (clobber (reg:CC FLAGS_REG))]
>> +  "TARGET_AVX512F"
>> +{
>> +  switch (which_alternative)
>> +{
>> +case 0:
>> +  return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}";
>> +case 1:
>> +  return "#";
>> +case 2:
>> +  if (TARGET_AVX512BW && mode == DImode)
>> +return "kaddq\t{%2, %1, %0|%0, %1, %2}";
>> +  else if (TARGET_AVX512BW && mode == SImode)
>> +return "kaddd\t{%2, %1, %0|%0, %1, %2}";
>>

Re: [PATCH] PR fortran/78618 -- RANK() should not ICE

2016-12-02 Thread Janus Weil

2016-12-02 19:06 GMT+01:00 Janus Weil :
> 2016-12-02 18:13 GMT+01:00 Steve Kargl :
>>> >> > The attached patch fixes an ICE, a nearby whitespace issue, and
>>> >> > removed the ATTRIBUTE_UNUSED tag.  THe change has passed regression
>>> >> > testing on x86_64-*-freebsd.  Ok to commit?
>>> >>
>>> >> huh, I don't really understand why the argument of RANK is detected to
>>> >> be an EXPR_FUNCTION for the test case at hand. Shouldn't it rather be
>>> >> an EXPR_CONSTANT?
>>> >>
>>> >> Some debugging in gfc_check_rank shows that a->symtree->n.sym indeed
>>> >> is the symbol "c" (as expected), but that clearly is not a function,
>>> >> so it seems to me that the actual bug here is that a->expr_type is set
>>> >> incorrectly ...?
>>> >
>>> > I found that it is the function __convert_s4_s1.
>>>
>>> That's strange. If we see different things here, maybe we are running
>>> into some kind of undefined behavior (possibly related to
>>> gfc_bad_expr?). Anyway, after some more debugging I came to the
>>> conclusion that what actually fails is the error propagation, which
>>> seems to be broken in gfc_check_assign and can be fixed like this:
>>>
>>>
>>> Index: gcc/fortran/expr.c
>>> ===
>>> --- gcc/fortran/expr.c(revision 243194)
>>> +++ gcc/fortran/expr.c(working copy)
>>> @@ -3314,9 +3314,9 @@ gfc_check_assign (gfc_expr *lvalue, gfc_expr *rval
>>>if (lvalue->ts.type == BT_CHARACTER && rvalue->ts.type == BT_CHARACTER)
>>>  {
>>>if (lvalue->ts.kind != rvalue->ts.kind && allow_convert)
>>> -gfc_convert_chartype (rvalue, &lvalue->ts);
>>> -
>>> -  return true;
>>> +return gfc_convert_chartype (rvalue, &lvalue->ts);
>>> +  else
>>> +return true;
>>>  }
>>>
>>>if (!allow_convert)
>>>
>>>
>>> This also avoids the ICE and I think is the proper way to fix this ...
>>>
>>
>> When developing the patch, I fixed/avoided the ICE, first.  Then,
>> I tried Gerhard's second testcase without the PARAMETER attribute.
>> An error message is emitted, so I went hunting for why PARAMETER
>> suppressed the error message.  This led to the second part of the
>> patch of changing gfc_error to gfc_error_now.
>
> I think the gfc_error_now should not be necessary with my fix, but
> removing the ATTRIBUTE_UNUSED is certainly a good idea.
>
>
>> In any event, if your patch causes an error message to be emitted,
>> then I think that your patch is better than the one I proposed.
>> Feel free to commit your patch.
>
> I am now regtesting the attached version and, if successful, will commit.

Testing went well. Committed as r243201.

Cheers,
Janus

Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)

2016-12-02 Thread Uros Bizjak

On Fri, Dec 2, 2016 at 5:30 PM, Jakub Jelinek  wrote:
> On Fri, Dec 02, 2016 at 05:12:20PM +0100, Uros Bizjak wrote:
>> >> This patch:
>> >> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two
>> >>one_cmplsi2 after reload)
>> >> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones)
>> >> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it
>> >>is for -mbmi only, and adds another *andndi3_doubleword pattern that is
>> >>meant to live just from combine till the stv pass, or worse case till
>> >>following split1 pass when it is split back into not followed by and;
>> >>this change makes it possible to use pandn in stv pass, even without
>> >>-mbmi
>> >
>> > Please use attached (lightly tested) patch to implement point 3)
>> > above. The patch splits insn after reload, as is the case with all STV
>> > patterns.
>>
>> Now attached for real.
>
> Ok, I've checked in following patch (compared to your notes just added
> xfail to the pr70322-2.c test scan-assembler), feel free to test your patch
> and remove the xfail again.
>
> 2016-12-02  Jakub Jelinek  
>
> PR target/70322
> * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle
> NOT.
> (dimode_scalar_chain::compute_convert_gain): Likewise.
> (dimode_scalar_chain::convert_insn): Likewise.
> * config/i386/i386.md (*one_cmpldi2_doubleword): New
> define_insn_and_split.
> (one_cmpl2): Use SWIM1248x iterator instead of SWIM.
>
> * gcc.target/i386/pr70322-1.c: New test.
> * gcc.target/i386/pr70322-2.c: New test.
> * gcc.target/i386/pr70322-3.c: New test.

Attached to this message, please find the patch that finally
implements PANDN generation for non-BMI targets.

2016-12-02  Uros Bizjak  

PR target/70322
* config/i386/i386.md (*andndi3_doubleword): Add non-BMI alternative
and corresponding post-reload splitter.

testsuite/ChangeLog:

2016-12-02  Uros Bizjak  

PR target/70322
* gcc.target/i386/pr70322-2.c (dg-final): Remove xfail.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 243196)
+++ config/i386/i386.md (working copy)
@@ -8534,15 +8534,24 @@
   operands[2] = gen_lowpart (QImode, operands[2]);
 })
 
-(define_insn_and_split "*andndi3_doubleword"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+(define_insn "*andndi3_doubleword"
+  [(set (match_operand:DI 0 "register_operand" "=r,&r")
(and:DI
- (not:DI (match_operand:DI 1 "register_operand" "r"))
- (match_operand:DI 2 "nonimmediate_operand" "rm")))
+ (not:DI (match_operand:DI 1 "register_operand" "r,0"))
+ (match_operand:DI 2 "nonimmediate_operand" "rm,rm")))
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2"
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
   "#"
-  "&& reload_completed"
+  [(set_attr "isa" "bmi,*")])
+
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+   (and:DI
+ (not:DI (match_operand:DI 1 "register_operand"))
+ (match_operand:DI 2 "nonimmediate_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_64BIT && TARGET_BMI && TARGET_STV && TARGET_SSE2
+   && reload_completed"
   [(parallel [(set (match_dup 0)
   (and:SI (not:SI (match_dup 1)) (match_dup 2)))
  (clobber (reg:CC FLAGS_REG))])
@@ -8551,6 +8560,24 @@
  (clobber (reg:CC FLAGS_REG))])]
   "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
 
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+   (and:DI
+ (not:DI (match_dup 0))
+ (match_operand:DI 1 "nonimmediate_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_64BIT && !TARGET_BMI && TARGET_STV && TARGET_SSE2
+   && reload_completed"
+  [(set (match_dup 0) (not:SI (match_dup 0)))
+   (parallel [(set (match_dup 0)
+  (and:SI (match_dup 0) (match_dup 1)))
+ (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 2) (not:SI (match_dup 2)))
+   (parallel [(set (match_dup 2)
+  (and:SI (match_dup 2) (match_dup 3)))
+ (clobber (reg:CC FLAGS_REG))])]
+  "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);")
+
 (define_insn "*andn_1"
   [(set (match_operand:SWI48 0 "register_operand" "=r,r")
(and:SWI48
Index: testsuite/gcc.target/i386/pr70322-2.c
===
--- testsuite/gcc.target/i386/pr70322-2.c   (revision 243196)
+++ testsuite/gcc.target/i386/pr70322-2.c   (working copy)
@@ -1,7 +1,7 @@
 /* PR target/70322 */
 /* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2 -msse2 -mstv -mno-bmi" } */
-/* { dg-final { scan-assembler "pandn" { xfail *-*-

Re: PR78599

2016-12-02 Thread Martin Jambor

Hi,

On Thu, Dec 01, 2016 at 01:43:16PM +0100, Richard Biener wrote:
> On Thu, Dec 1, 2016 at 11:07 AM, Prathamesh Kulkarni
>  wrote:
> > Hi,
> > As mentioned in PR, the issue seems to be that in
> > propagate_bits_accross_jump_functions(),
> > ipa_get_type() returns record_type during WPA and hence we pass
> > invalid precision to
> > ipcp_bits_lattice::meet_with (value, mask, precision) which eventually
> > leads to runtime error.
> > The attached patch tries to fix that, by bailing out if type of param
> > is not integral or pointer type.
> > This happens for the edge from deque_test -> 
> > _Z4copyIPd1BEvT_S2_T0_.isra.0/9.
> 
> Feels more like a DECL_BY_REFERENCE mishandling and should be fixed elsewhere.

That is indeed what is happening. Prathamesh, if you are going to save
the type of arguments in the jump function, it should help you also
with this issue.  I know it was me who suggested using the function
type to get at them and am sorry, I did not realize there potential
issues with promotions and by_reference passing.

By the way, please be careful not to introduce code style violations,
especially lines exceeding 80 characters and adding trailing
whitespace (propagate_bits_accross_jump_function has a few instances
of both), I'd suggest setting your editor to highlight them.

Thanks,

Martin

Re: [PATCH] PR fortran/78618 -- RANK() should not ICE

2016-12-02 Thread Steve Kargl

On Fri, Dec 02, 2016 at 07:39:33PM +0100, Janus Weil wrote:
> 
> Testing went well. Committed as r243201.
> 

Thanks for reviewing my patch, and more importantly
thanks for your patch.

-- 
Steve

Re: [PATCH 8a/9] Introduce class function_reader (v7)

2016-12-02 Thread Bernd Schmidt


On 12/02/2016 07:44 PM, David Malcolm wrote:

The two flag assignments don't seem to be needed; I think this is due
to adding:

  if (node->native_rtl_p ())
node->force_output = 1;

to cgraph_node::finalize_function in patch 9.

Should I lose them (and the comment)?


Let's keep this patch self-contained and not dependent on #9. So, leave 
them in for now.


I think we can call this one OK for now (the orig_reg thing needs fixing 
and we'll probably want to do something about making locations optional, 
but all that can be done later).



Bernd

Re: PR78599

2016-12-02 Thread Prathamesh Kulkarni

On 3 December 2016 at 00:25, Martin Jambor  wrote:
> Hi,
>
> On Thu, Dec 01, 2016 at 01:43:16PM +0100, Richard Biener wrote:
>> On Thu, Dec 1, 2016 at 11:07 AM, Prathamesh Kulkarni
>>  wrote:
>> > Hi,
>> > As mentioned in PR, the issue seems to be that in
>> > propagate_bits_accross_jump_functions(),
>> > ipa_get_type() returns record_type during WPA and hence we pass
>> > invalid precision to
>> > ipcp_bits_lattice::meet_with (value, mask, precision) which eventually
>> > leads to runtime error.
>> > The attached patch tries to fix that, by bailing out if type of param
>> > is not integral or pointer type.
>> > This happens for the edge from deque_test -> 
>> > _Z4copyIPd1BEvT_S2_T0_.isra.0/9.
>>
>> Feels more like a DECL_BY_REFERENCE mishandling and should be fixed 
>> elsewhere.
>
> That is indeed what is happening. Prathamesh, if you are going to save
> the type of arguments in the jump function, it should help you also
> with this issue.  I know it was me who suggested using the function
> type to get at them and am sorry, I did not realize there potential
> issues with promotions and by_reference passing.
>
> By the way, please be careful not to introduce code style violations,
> especially lines exceeding 80 characters and adding trailing
> whitespace (propagate_bits_accross_jump_function has a few instances
> of both), I'd suggest setting your editor to highlight them.
Oops sorry about that, I will pay more attention to formatting henceforth.
Using editor to highlight stray whitespace is indeed quite useful,
thanks for that suggestion.

Kugan has a patch for adding param type to jump function:
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02732.html
Once that gets committed, I will send a patch to use jfunc->param_type in
propagate_bits_accross_jump_function().

Thanks,
Prathamesh
>
> Thanks,
>
> Martin
>

[PATCH] Add ASSERT_RTX_PTR_EQ

2016-12-02 Thread David Malcolm

On Fri, 2016-12-02 at 16:28 +0100, Bernd Schmidt wrote:
> On 12/01/2016 10:43 PM, David Malcolm wrote:
> > > > +  rtx_insn *jump_insn = get_insn_by_uid (1);
> > > > +  ASSERT_EQ (JUMP_INSN, GET_CODE (jump_insn));
> > > > +  ASSERT_EQ (ret_rtx, JUMP_LABEL (jump_insn));
> > > > +  // FIXME: ^^^ use ASSERT_RTX_PTR_EQ here ^^^
> > > 
> > > Why is this a fixme and not just done that way (several
> > > instances)?
> > 
> > ASSERT_RTX_PTR_EQ doesn't exist yet; there was some discussion
> > about it
> > in earlier versions of the patch, but I haven't written it.  It
> > would
> > be equivalent to ASSERT_EQ, checking pointer equality, but would
> > dump
> > the mismatching expected vs actual rtx on failure.
> > 
> > Should I convert this to a TODO, or go ahead and implement
> > ASSERT_RTX_PTR_EQ?
> 
> Missed this question. Just add ASSERT_RTX_PTR_EQ, that shouldn't be
> hard, should it?
> 
> 
> Bernd

This patch implements an ASSERT_RTX_PTR_EQ macro, like ASSERT_EQ,
but which dumps both rtx to stderr if the assertion fails.

gcc/ChangeLog:
* selftest-rtl.c (selftest::assert_rtx_ptr_eq_at): New function.
* selftest-rtl.h (ASSERT_RTX_PTR_EQ): New macro.
---
 gcc/selftest-rtl.c | 23 +++
 gcc/selftest-rtl.h | 18 ++
 2 files changed, 41 insertions(+)

diff --git a/gcc/selftest-rtl.c b/gcc/selftest-rtl.c
index 20e6550..c14db43 100644
--- a/gcc/selftest-rtl.c
+++ b/gcc/selftest-rtl.c
@@ -36,6 +36,29 @@ along with GCC; see the file COPYING3.  If not see
 
 namespace selftest {
 
+/* Compare rtx EXPECTED and ACTUAL by pointer equality, calling
+   ::selftest::pass if they are equal, aborting if they are non-equal.
+   LOC is the effective location of the assertion, MSG describes it.  */
+
+void
+assert_rtx_ptr_eq_at (const location &loc, const char *msg,
+ rtx expected, rtx actual)
+{
+  if (expected == actual)
+::selftest::pass (loc, msg);
+  else
+{
+  fprintf (stderr, "%s:%i: %s: FAIL: %s\n", loc.m_file, loc.m_line,
+  loc.m_function, msg);
+  fprintf (stderr, "  expected (at %p): ", (void *)expected);
+  print_rtl (stderr, expected);
+  fprintf (stderr, "\n  actual (at %p): ", (void *)actual);
+  print_rtl (stderr, actual);
+  fprintf (stderr, "\n");
+  abort ();
+}
+}
+
 /* Constructor for selftest::rtl_dump_test.
Read a dumped RTL function from PATH.
Takes ownership of PATH, freeing in dtor.
diff --git a/gcc/selftest-rtl.h b/gcc/selftest-rtl.h
index 35d6437..cb2772d 100644
--- a/gcc/selftest-rtl.h
+++ b/gcc/selftest-rtl.h
@@ -47,6 +47,24 @@ assert_rtl_dump_eq (const location &loc, const char 
*expected_dump, rtx x,
   assert_rtl_dump_eq (SELFTEST_LOCATION, (EXPECTED_DUMP), (RTX), \
  (REUSE_MANAGER))
 
+/* Evaluate rtx EXPECTED and ACTUAL and compare them with ==
+   (i.e. pointer equality), calling ::selftest::pass if they are
+   equal, aborting if they are non-equal.  */
+
+#define ASSERT_RTX_PTR_EQ(EXPECTED, ACTUAL) \
+  SELFTEST_BEGIN_STMT  \
+  const char *desc = "ASSERT_RTX_PTR_EQ (" #EXPECTED ", " #ACTUAL ")";  \
+  ::selftest::assert_rtx_ptr_eq_at (SELFTEST_LOCATION, desc, (EXPECTED), \
+   (ACTUAL));  \
+  SELFTEST_END_STMT
+
+/* Compare rtx EXPECTED and ACTUAL by pointer equality, calling
+   ::selftest::pass if they are equal, aborting if they are non-equal.
+   LOC is the effective location of the assertion, MSG describes it.  */
+
+extern void assert_rtx_ptr_eq_at (const location &loc, const char *msg,
+ rtx expected, rtx actual);
+
 /* A class for testing RTL function dumps.  */
 
 class rtl_dump_test
-- 
1.8.5.3

Re: PR78634: ifcvt/i386 cost updates

2016-12-02 Thread James Greenhalgh

On Fri, Dec 02, 2016 at 05:00:05PM +0100, Bernd Schmidt wrote:
> With the i386 backend no longer double-counting the cost of a SET,
> the default implementation default_max_noce_ifcvt_seq_cost now
> provides too high a bound for if conversion, allowing very costly
> substitutions.
> 
> The following patch fixes this by making an i386 variant of the
> hook, but first - James, do you recall how you arrived at the
> COSTS_N_INSNS(3) magic number? What happens on arm if you lower that
> in the default hook?

Hi Bernd,

Given the magic numbers in BRANCH_COST already, 3 was chosen to give a
resonable chance of allowing if-conversion of blocks containing multiple
sets. iWe need to do this because for AArch64 and x86 the conditional
move pattern will expand to two further patterns, each of which will get
a cost (before my cost model patches you would simply count the total
number of expansions you would make, so the multiplication by three is
to compensate)

I wouldn't say there was much scientific to choosing between 2 and 3
beyond looking at cases which worked on x86_64 and AArch64 before I
modified the cost model, and again after, and trying to maintain the
number of such cases which were if-converted.

Setting this to 2 in the generic hook and forcing AArch64 to run with
a custom hook would be just as correct as this patch (though arguably
yours is the better Stage 3 patch as it has more limited scope).

Thanks,
James

> The change in ifcvt.c seems to have no bearing on the PR, I just
> noticed we were missing a cost check in one place.
> 
> Lightly tested so far, will bootstrap & test on x86_64-linux.
> 
> 
> Bernd

>   PR rtl-optimization/78634
>   * config/i386/i386.c (ix86_max_noce_ifcvt_seq_cost): New function.
>   (TARGET_MAX_NOCE_IFCVT_SEQ_COST): Define.
>   * ifcvt.c (noce_try_cmove): Add missing cost check.
> 
> Index: gcc/config/i386/i386.c
> ===
> --- gcc/config/i386/i386.c(revision 242958)
> +++ gcc/config/i386/i386.c(working copy)
> @@ -50301,6 +50301,28 @@ ix86_spill_class (reg_class_t rclass, ma
>return NO_REGS;
>  }
>  
> +/* Implement TARGET_MAX_NOCE_IFCVT_SEQ_COST.  Like the default 
> implementation,
> +   but returns a lower bound.  */
> +
> +static unsigned int
> +ix86_max_noce_ifcvt_seq_cost (edge e)
> +{
> +  bool predictable_p = predictable_edge_p (e);
> +
> +  enum compiler_param param
> += (predictable_p
> +   ? PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST
> +   : PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST);
> +
> +  /* If we have a parameter set, use that, otherwise take a guess using
> + BRANCH_COST.  */
> +  if (global_options_set.x_param_values[param])
> +return PARAM_VALUE (param);
> +  else
> +return BRANCH_COST (true, predictable_p) * COSTS_N_INSNS (2);
> +}
> +
> +
>  /* Implement targetm.vectorize.init_cost.  */
>  
>  static void *
> @@ -51615,6 +51637,8 @@ ix86_run_selftests (void)
>  #undef TARGET_EXPAND_DIVMOD_LIBFUNC
>  #define TARGET_EXPAND_DIVMOD_LIBFUNC ix86_expand_divmod_libfunc
>  
> +#undef TARGET_MAX_NOCE_IFCVT_SEQ_COST
> +#define TARGET_MAX_NOCE_IFCVT_SEQ_COST ix86_max_noce_ifcvt_seq_cost
>  #if CHECKING_P
>  #undef TARGET_RUN_TARGET_SELFTESTS
>  #define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests
> Index: gcc/ifcvt.c
> ===
> --- gcc/ifcvt.c   (revision 242958)
> +++ gcc/ifcvt.c   (working copy)
> @@ -1826,7 +1826,7 @@ noce_try_cmove (struct noce_if_info *if_
>   noce_emit_move_insn (if_info->x, target);
>  
> seq = end_ifcvt_sequence (if_info);
> -   if (!seq)
> +   if (!seq || !noce_conversion_profitable_p (seq, if_info))
>   return FALSE;
>  
> emit_insn_before_setloc (seq, if_info->jump,

[PATCH] Fix tsubst_init error recovery (PR c++/78649)

2016-12-02 Thread Jakub Jelinek

Hi!

Trying to value initialize error_mark_node type ICEs, other spots avoid
calling build_value_init* if the type is error_mark_node.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-02  Jakub Jelinek  

PR c++/78649
* pt.c (tsubst_init): Don't call build_value_init if decl's type
is error_mark_node.

* g++.dg/cpp0x/pr78649.C: New test.

--- gcc/cp/pt.c.jj  2016-11-23 16:58:30.0 +0100
+++ gcc/cp/pt.c 2016-12-02 14:38:43.237451068 +0100
@@ -14082,7 +14082,7 @@ tsubst_init (tree init, tree decl, tree
 
   init = tsubst_expr (init, args, complain, in_decl, false);
 
-  if (!init)
+  if (!init && TREE_TYPE (decl) != error_mark_node)
 {
   /* If we had an initializer but it
 instantiated to nothing,
--- gcc/testsuite/g++.dg/cpp0x/pr78649.C.jj 2016-12-02 14:51:02.057892855 
+0100
+++ gcc/testsuite/g++.dg/cpp0x/pr78649.C2016-12-02 14:50:42.0 
+0100
@@ -0,0 +1,16 @@
+// PR c++/78649
+// { dg-do compile { target c++11 } }
+
+template  void foo ();
+template 
+void
+test ()
+{
+  T t (foo...); // { dg-error "declared void" }
+}
+
+int
+main ()
+{
+  test ();
+}

Jakub

Re: [PATCH] handle integer overflow/wrapping in printf directives (PR 78622)

2016-12-02 Thread Jeff Law


On 12/01/2016 07:31 PM, Martin Sebor wrote:

On 12/01/2016 02:15 AM, Jakub Jelinek wrote:

On Thu, Dec 01, 2016 at 08:26:47AM +0100, Jakub Jelinek wrote:

Isn't this too simplistic?  I mean, if you have say dirtype of signed
char
and argmin say 4096 + 32 and argmax say 4096 + 64, (signed char) arg
has range 32, 64, while I think your routine will yield -128, 127 (well,
0 as min and -128 as max as that is what you return for signed type).

Can't you subtract argmax - argmin (best just in wide_int, no need to
build
trees), and use what you have just for the case where that number
doesn't
fit into the narrower precision, otherwise if argmin - (dirtype) argmin
== argmax - (dirtype) argmax, just use (dirtype) argmin and (dirtype)
argmax
as the range, and in case that it crosses a boundary figure if you
can do
anything than the above?  Guess all cases of signed/unsigned dirtype
and/or
argtype need to be considered.


Richard noted that you could have a look at CONVERT_EXPR_CODE_P
handling in extract_range_from_unary_expr.  I think it is the
  || (vr0.type == VR_RANGE
  && integer_zerop (int_const_binop (RSHIFT_EXPR,
   int_const_binop (MINUS_EXPR, vr0.max, vr0.min),
 size_int (TYPE_PRECISION (outer_type)))
part that is important here for the narrowing conversion.


Thanks, that was a helpful suggestion!  Attached is an update that
follows the vrp approach.  I assume the infinity stuff is specific
to the VRP pass and not something I need to worry about here.

Martin

PS To your earlier question, argmin and argmax have the same meaning
in the added helper function as in its caller.

gcc-78622.diff


PR middle-end/78622 - [7 Regression] -Wformat-length/-fprintf-return-value 
incorrect with overflow/wrapping

gcc/ChangeLog:

PR middle-end/78622
* gimple-ssa-sprintf.c (min_bytes_remaining): Use res.knownrange
rather than res.bounded.
(adjust_range_for_overflow): New function.
(format_integer): Always set res.bounded to true unless either
precision or width is specified and unknown.
Call adjust_range_for_overflow.
(format_directive): Remove vestigial quoting.
(add_bytes): Correct the computation of boundrange used to
decide whether a warning is of a "maybe" or "defnitely" kind.

s/defnitely/definitely/



gcc/testsuite/ChangeLog:

PR middle-end/78622
* gcc.c-torture/execute/pr78622.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-2.c: Remove "benign" undefined
behavior inadvertently introduced in a previous commit.  Tighten
up final checking.
* gcc.dg/tree-ssa/builtin-sprintf-5.c: Rename macros for clarity.
Add test cases.
* gcc.dg/tree-ssa/builtin-sprintf-6.c: Add test cases.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-6.c: Remove xfails and
add a final optimization check.
* gcc.dg/tree-ssa/builtin-sprintf.c: Add test cases.





diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 99a635a..95a8710 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -795,6 +795,59 @@ get_width_and_precision (const conversion_spec &spec,
   *pprec = prec;
 }

+/* With the range [*ARGMIN, *ARGMAX] of an integer directive's actual
+   argument, due to the conversion from either *ARGMIN or *ARGMAX to
+   the type of the directive's formal argument it's possible for both
+   to result in the same number of bytes or a range of bytes that's
+   less than the number of bytes that would result from formatting
+   some other value in the range [*ARGMIN, *ARGMAX].  This can be
+   determined by checking for the actual argument being in the range
+   of the type of the directive.  If it isn't it must be assumed to
+   take on the full range of the directive's type.
+   Return true when the range has been adjusted to the range of
+   DIRTYPE, false otherwise.  */

I wish I'd counted how many times I read that.


+
+static bool
+adjust_range_for_overflow (tree dirtype, tree *argmin, tree *argmax)
+{
+  unsigned argprec = TYPE_PRECISION (TREE_TYPE (*argmin));
+  unsigned dirprec = TYPE_PRECISION (dirtype);
+
+  /* The logic here was inspired/lifted from  the CONVERT_EXPR_CODE_P
+ branch in the extract_range_from_unary_expr function in tree-vrp.c.
+  */
Formatting it.  If the comment-close won't fit, then line wrap prior to 
the last word in the comment.




+
+  if (TREE_CODE (*argmin) == INTEGER_CST
+  && TREE_CODE (*argmax) == INTEGER_CST
+  && (dirprec >= argprec
+ || integer_zerop (int_const_binop (RSHIFT_EXPR,
+int_const_binop (MINUS_EXPR,
+ *argmax,
+ *argmin),
+size_int (dirprec)
+  {
+

Re: [PATCH] enable -Wformat-length for dynamically allocated buffers (pr 78245)

2016-12-02 Thread Jeff Law


On 11/29/2016 08:22 PM, Martin Sebor wrote:

That said, I defer to you on how to proceed here.  I'm prepared
to do the work(*) but I do worry about jeopardizing the chances
of this patch and the others making it into 7.0.

So would it make sense to just init/fini the b_o_s framework in your
pass and for builtin expansion?


I think that should work for the sprintf checking.  Let me test it.
We can deal with the memxxx and strxxx patch (53562) independently
if you prefer.


Attached is a modified patch that calls {init,fini}_object_sizes()
from the gimple-ssa-sprintf pass instead.

While this works fine, I do like the approach of making the calls
in a single function better because it makes for a more robust API.
Decoupling the init/fini calls from the compute_object_size()
function that depends on them having been made makes the API easier
to accidentally misuse by calling one while forgetting to call one
or both of the other two.
It's not ideal, nor is the prospect of caching and potentially not 
invaliding properly.


I've started tackling these kinds problems by wrapping everything into a 
class with suitable ctors/dtors and methods.  With everything locked 
down inside a class, the only way to access the subsystem is by 
instantiating a suitable object (which obviously gives us control over 
init/fini).  The problem then boils down to not having that instantiated 
object live across passes, which usually isn't a problem in GCC :-)


I suggested it as a possibility, but wasn't going to demand it without 
knowing much more about the code in tree-object-size and how well it 
could be encapsulated.


ANyway, I'll take another look at the patch.  My recollection was that 
the only issue at hand was the init/fini/caching aspects.



jeff

Re: [PATCH] Turn on gnu-indirect-function by default on PowerPC servers

2016-12-02 Thread Michael Meissner

On Thu, Nov 24, 2016 at 01:07:04AM +, Joseph Myers wrote:
> On Wed, 23 Nov 2016, Michael Meissner wrote:
> 
> > Since some of the embedded hosts use powerpc-*-linux, I only set 
> > the
> > default if the target is a 64-bit PowerPC system.  I tested the compiler
> > manually to verify that ifunc support was enabled, and it was.  I did a
> > boostrap build/check cycle on a little endian power8 system and there were
> > regressions.  Can I install the patch on the trunk?
> > 
> > Note, the documentation for the --enable-gnu-indirect-option does not list
> > which systems current enable the option, so I did not modify the
> > documentation.
> 
> I don't think this should have anything to do with embedded/server, or 
> about whether it's configured for 32-bit or 64-bit default.  
> powerpc-linux-gnu --enable-targets=all -m64 should behave the same as 
> powerpc64-linux-gnu.

Since I don't test the embedded environments, I was just trying to cautious
about changes that affect other PowerPC users.

This patch enables --enable-gnu-indirect-function on all PowerPC linux systems,
except for targetting Android or uclib.  It is exactly the same code as we use
in the i[34567]86-*-linux, x86_64-*-linux*, s390-*-linux*, and s390x-*-linux*
systems.  Given the ifunc support is not generated unless the user uses it, it
should not affect other PowerPC users.

Is this patch ok to check into the trunk?

2016-12-02  Michael Meissner  

* config.gcc (powerpc*-*-linux*): Set gnu-indirect-function by
default on PowerPC linux systems.

Index: gcc/config.gcc
===
--- gcc/config.gcc  
(svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config.gcc)   (revision 
243202)
+++ gcc/config.gcc  (.../gcc/config.gcc)(working copy)
@@ -2443,6 +2443,14 @@ powerpc*-*-linux*)
if test x${enable_secureplt} = xyes; then
tm_file="rs6000/secureplt.h ${tm_file}"
fi
+   # Assume modern glibc if not targeting Android nor uclibc.
+   case ${target} in
+   *-*-*android*|*-*-*uclibc*|*-*-*musl*)
+   ;;
+   *)
+   default_gnu_indirect_function=yes
+   ;;
+   esac
;;
 powerpc-wrs-vxworks|powerpc-wrs-vxworksae|powerpc-wrs-vxworksmils)
tm_file="${tm_file} elfos.h freebsd-spec.h rs6000/sysv4.h"


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH] Fix tsubst_init error recovery (PR c++/78649)

2016-12-02 Thread Jason Merrill

OK.

On Fri, Dec 2, 2016 at 3:41 PM, Jakub Jelinek  wrote:
> Hi!
>
> Trying to value initialize error_mark_node type ICEs, other spots avoid
> calling build_value_init* if the type is error_mark_node.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-12-02  Jakub Jelinek  
>
> PR c++/78649
> * pt.c (tsubst_init): Don't call build_value_init if decl's type
> is error_mark_node.
>
> * g++.dg/cpp0x/pr78649.C: New test.
>
> --- gcc/cp/pt.c.jj  2016-11-23 16:58:30.0 +0100
> +++ gcc/cp/pt.c 2016-12-02 14:38:43.237451068 +0100
> @@ -14082,7 +14082,7 @@ tsubst_init (tree init, tree decl, tree
>
>init = tsubst_expr (init, args, complain, in_decl, false);
>
> -  if (!init)
> +  if (!init && TREE_TYPE (decl) != error_mark_node)
>  {
>/* If we had an initializer but it
>  instantiated to nothing,
> --- gcc/testsuite/g++.dg/cpp0x/pr78649.C.jj 2016-12-02 14:51:02.057892855 
> +0100
> +++ gcc/testsuite/g++.dg/cpp0x/pr78649.C2016-12-02 14:50:42.0 
> +0100
> @@ -0,0 +1,16 @@
> +// PR c++/78649
> +// { dg-do compile { target c++11 } }
> +
> +template  void foo ();
> +template 
> +void
> +test ()
> +{
> +  T t (foo...); // { dg-error "declared void" }
> +}
> +
> +int
> +main ()
> +{
> +  test ();
> +}
>
> Jakub

Re: [PATCH] enable -Wformat-length for dynamically allocated buffers (pr 78245)

2016-12-02 Thread Jeff Law


On 11/29/2016 08:22 PM, Martin Sebor wrote:

That said, I defer to you on how to proceed here.  I'm prepared
to do the work(*) but I do worry about jeopardizing the chances
of this patch and the others making it into 7.0.

So would it make sense to just init/fini the b_o_s framework in your
pass and for builtin expansion?


I think that should work for the sprintf checking.  Let me test it.
We can deal with the memxxx and strxxx patch (53562) independently
if you prefer.


Attached is a modified patch that calls {init,fini}_object_sizes()
from the gimple-ssa-sprintf pass instead.

While this works fine, I do like the approach of making the calls
in a single function better because it makes for a more robust API.
Decoupling the init/fini calls from the compute_object_size()
function that depends on them having been made makes the API easier
to accidentally misuse by calling one while forgetting to call one
or both of the other two.

Martin


gcc-78245.diff


PR middle-end/78245 - missing -Wformat-length on an overflow of a dynamically 
allocated buffer

gcc/testsuite/ChangeLog:

PR middle-end/78245
* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Add tests.

gcc/ChangeLog:

PR middle-end/78245
* gimple-ssa-sprintf.c (get_destination_size): Call
compute_object_size.
* tree-object-size.c (addr_object_size): Adjust.
(pass_through_call): Adjust.
(internal_object_size): New function.
(compute_builtin_object_size): Call internal_object_size.
(pass_object_sizes::execute): Adjust.
* tree-object-size.h (fini_object_sizes): Declare.

@@ -664,6 +665,18 @@ compute_builtin_object_size (tree ptr, int 
object_size_type,
   return *psize != unknown[object_size_type];
 }

+/* Compute __builtin_object_size value for PTR and set *PSIZE to
+   the resulting value.  OBJECT_SIZE_TYPE is the second argument
+   to __builtin_object_size.  Return true on success and false
+   when the object size could not be determined.  */
+
+bool
+compute_builtin_object_size (tree ptr, int object_size_type,
+unsigned HOST_WIDE_INT *psize)
+{
+  return internal_object_size (ptr, object_size_type, psize);
+}
Is this wrapper still necessary now that we've pulled the init/fini 
routines out?  Seems like it shouldn't be.



I think that's the only issue left.  If the wrapper is needed, then the 
patch is fine.  If the wrapper isn't, then a patch with the wrapper 
removed is pre-approved after the usual testing.


jeff

Re: [PATCH] Fix PR78306

2016-12-02 Thread Jeff Law


On 11/30/2016 07:18 AM, Richard Biener wrote:



so progess on the OMP front hides regressions elsewhere.  The patch
that started this thread does not have any effect on the conformance
testsuite result.

That makes it even more obvious that it is a) unmaintained, b) probably
not used widely as those ICEs have not been reported as bugs

Let's deprecate Cilk+ if no maintainer steps up until 7.1 is released.
Gerald, can you relay that to the SC?
I've raised the issue with the steering committee; I will also speak 
with Intel about the possibility of them providing resources to maintain 
this code.


jeff

Re: [PATCH] PR target/78639, Fix code generation on Power9

2016-12-02 Thread Segher Boessenkool

On Thu, Dec 01, 2016 at 07:46:31PM -0500, Michael Meissner wrote:
> The previous code before 242679 used 'wY', but I deleted the 'w' by accident.
> 
> This bug showed up in compiling PUGHReduce/Reduction.c in the cactusADM Spec
> 2006 benchmark suite.  I have verified that this fixes the problem.
> 
> Assuming the bootstrap and make check that I'm running right now don't cause
> any regressions, can I check the patch into the trunk?

Yes of course.  Thanks,


Segher


> 2016-12-01  Michael Meissner  
> 
>   PR target/78639
>   * config/rs6000/rs6000.md (movdi_internal64): Fix typo in
>   subversion id 242679 that causes the wrong store instruction to be
>   generated if a DImode is in an Altivec register using REG+REG
>   addressing.

[Committed,fortran] Plug memory leak.

2016-12-02 Thread Steve Kargl

While looking at code related to PR fortran/78618, I 
noticed a memory leak.  The following patch plus the
leak.

2016-12-02  Steven G. Kargl  

* simplify.c (gfc_convert_char_constant): Free result on error.

Index: simplify.c
===
--- simplify.c  (revision 243203)
+++ simplify.c  (working copy)
@@ -7152,6 +7152,7 @@ gfc_convert_char_constant (gfc_expr *e, 
   "into character kind %d",
   gfc_print_wide_char (result->value.character.string[i]),
   &e->where, kind);
+   gfc_free_expr (result);
return &gfc_bad_expr;
  }
 
Without the patch, valgrind finds

==4426== LEAK SUMMARY:
==4426==definitely lost: 192 bytes in 1 blocks
==4426==indirectly lost: 8 bytes in 1 blocks
==4426==  possibly lost: 6,256 bytes in 21 blocks
==4426==still reachable: 817,676 bytes in 2,185 blocks
==4426== suppressed: 0 bytes in 0 blocks

With the patch, she shows

==15883== LEAK SUMMARY:
==15883==definitely lost: 0 bytes in 0 blocks
==15883==indirectly lost: 0 bytes in 0 blocks
==15883==  possibly lost: 6,256 bytes in 21 blocks
==15883==still reachable: 817,676 bytes in 2,185 blocks
==15883== suppressed: 0 bytes in 0 blocks

-- 
Steve

Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c

2016-12-02 Thread Segher Boessenkool

On Fri, Dec 02, 2016 at 09:47:00AM +0100, Richard Biener wrote:
> >> STC tries to make as large as possible consecutive "traces", mainly to
> >> help with instruction cache utilization and hit rate etc.  It cannot do
> >> a very good job if it isn't allowed to copy blocks.
> >>
> >> "simple" tries to (dynamically) have as many fall-throughs as possible,
> >> i.e. as few jumps as possible.
> 
> I hope it special-cases backedges, that is, not make those fallthru.

It doesn't, and that is a good thing.

Firstly, what is classified as backedge doesn't make too much sense in
some cases; quite often the cfg isn't reducible.

Secondly, for simple loops it does not matter: the backedge has lower
frequency than the forward edges, so everything works out as you want.

Thirdly, consider this loop:

|
S<
| |
A |
   / \|
  B   C   |
   \ /|
D |
| |
E-
|
F

where the backedge now has higher frequency than A->B and A->C.
With simple this ends up as:

S: ...; goto A

D: ...
E: ...; if (not again) goto F
S: ...
A: ...; if () goto C
B: ...; goto D

C: ...; goto D

and this is cheaper than the alternative:

S: ...
A: ...; if () goto C
B: ...
D: ...
E: ...; if (again) goto A
F:

C: ...; goto D

If a fraction f of the loop iterations does C (so 1-f does B), the
alternative has 1+2f jumps per iteration; the code simple makes has 1+f.

> > I think we've probably discussed this more than is really necessary.  We
> > just need to pick an alternative and go with it, I think either alternative
> > is reasonable (avoid copying when STC has no length information or fall back
> > to simple when there is no length information).
> 
> The description of both algorithms doesn't make it an obvious pick.  So lets
> leave the decision of the algorithm to the target and make STC beave sane
> with no length information (whether that means disallowing any copying or
> substituting a "default" length is another question -- but obviously it's the
> targets fault to not provide lenght information).

I agree.


Segher

Re: [PATCH] Turn on gnu-indirect-function by default on PowerPC servers

2016-12-02 Thread Joseph Myers

On Fri, 2 Dec 2016, Michael Meissner wrote:

> This patch enables --enable-gnu-indirect-function on all PowerPC linux 
> systems,
> except for targetting Android or uclib.  It is exactly the same code as we use
> in the i[34567]86-*-linux, x86_64-*-linux*, s390-*-linux*, and s390x-*-linux*
> systems.  Given the ifunc support is not generated unless the user uses it, it
> should not affect other PowerPC users.
> 
> Is this patch ok to check into the trunk?

This is not a review, but doing things the same as on other architectures 
makes sense to me.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] Even more minimal reimplementation of errors.c within read-md.c

2016-12-02 Thread David Malcolm

On Thu, 2016-12-01 at 13:40 +0100, Bernd Schmidt wrote:
> On 11/30/2016 09:24 PM, David Malcolm wrote:
> 
> > gcc/ChangeLog:
> > * read-md.c (have_error): New global, copied from errors.c.
> > (fatal): New function, copied from errors.c.
> 
> I would have expected the function to go into diagnostic.c, but
> actually
> there are already various functions of this sort in read-md.

In a way, there are three diagnostic systems in the codebase:
- the functions in errors.h/errors.c (warning, error, fatal, etc)
- the functions in read-md.c (error_at, fatal_at) which share the
  "have_error" global with errors.c.
- the "real" diagnostics subsystem (diagnostics.c etc)

It turns out that the only thing using "fatal" within read-md.c is
md_reader::read_md_files.  The interface this provides is overly
complicated for the RTL FE's purposes, so the following patch avoids
using it in favor of a simpler, new method: md_reader::read_file.

With that, the only thing needed in read-md.c from errors.c on the host
is just the "have_error" global; the patch verifies that by
conditionalizing the include of errors.h on #ifdef GENERATOR_FILE
(and similarly conditionalizing md_reader::read_md_files, since
"fatal" isn't implemented on the host).

> I'd request
> you place it near fatal_at, and maybe add this to errors.h:
> 
> inline bool seen_error ()
> {
>return have_error;
> }
> 
> and replace explicit uses of have_error with that to unify things a
> bit.
> 
> 
> Bernd

seen_error is already implemented in the "real" diagnostic subsystem,
and would be a clash: we want a way to determine if read-md.c's
implementation of error_at was called.  Hence it seems simplest to
conditionally add a copy of the "have_error" global to read-md.c.

Thoughts?

gcc/ChangeLog:
* read-md.c: Wrap include of errors.h with #ifdef GENERATOR_FILE.
(have_error): New global, copied from errors.c.
(md_reader::read_md_files): Wrap with #ifdef GENERATOR_FILE.
(md_reader::read_file): New method.
* read-md.h (md_reader::read_file): New method.
* read-rtl-function.c (read_rtl_function_body): Reimplement in
terms of md_reader::read_file.
---
 gcc/read-md.c   | 31 +++
 gcc/read-md.h   |  1 +
 gcc/read-rtl-function.c |  6 +-
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/gcc/read-md.c b/gcc/read-md.c
index 25bc3c4..e581326 100644
--- a/gcc/read-md.c
+++ b/gcc/read-md.c
@@ -26,11 +26,23 @@ along with GCC; see the file COPYING3.  If not see
 #endif
 #include "system.h"
 #include "coretypes.h"
+#ifdef GENERATOR_FILE
 #include "errors.h"
+#endif /* #ifdef GENERATOR_FILE */
 #include "statistics.h"
 #include "vec.h"
 #include "read-md.h"
 
+#ifndef GENERATOR_FILE
+
+/* Minimal reimplementation of errors.c for use by RTL frontend
+   within cc1.  */
+
+int have_error = 0;
+
+#endif /* #ifndef GENERATOR_FILE */
+
+
 /* Associates PTR (which can be a string, etc.) with the file location
specified by FILENAME and LINENO.  */
 struct ptr_loc {
@@ -1190,6 +1202,8 @@ md_reader::add_include_path (const char *arg)
   m_last_dir_md_include_ptr = &dirtmp->next;
 }
 
+#ifdef GENERATOR_FILE
+
 /* The main routine for reading .md files.  Try to process all the .md
files specified on the command line and return true if no error occurred.
 
@@ -1296,6 +1310,23 @@ md_reader::read_md_files (int argc, const char **argv,
   return !have_error;
 }
 
+#endif /* #ifdef GENERATOR_FILE */
+
+/* Read FILENAME.  */
+
+bool
+md_reader::read_file (const char *filename)
+{
+  m_read_md_filename = filename;
+  m_read_md_file = fopen (m_read_md_filename, "r");
+  if (m_read_md_file == 0)
+{
+  perror (m_read_md_filename);
+  return false;
+}
+  handle_toplevel_file ();
+  return !have_error;
+}
 
 /* Read FILENAME, filtering to just the given lines.  */
 
diff --git a/gcc/read-md.h b/gcc/read-md.h
index 6a73b00..5fc7605 100644
--- a/gcc/read-md.h
+++ b/gcc/read-md.h
@@ -110,6 +110,7 @@ class md_reader
   virtual ~md_reader ();
 
   bool read_md_files (int, const char **, bool (*) (const char *));
+  bool read_file (const char *filename);
   bool read_file_fragment (const char *filename,
   int first_line,
   int last_line);
diff --git a/gcc/read-rtl-function.c b/gcc/read-rtl-function.c
index ddea836..5188b86 100644
--- a/gcc/read-rtl-function.c
+++ b/gcc/read-rtl-function.c
@@ -1590,12 +1590,8 @@ read_rtl_function_body (const char *path)
   init_emit ();
   init_varasm_status ();
 
-  auto_vec argv (2);
-  argv.safe_push (progname);
-  argv.safe_push (path);
-
   function_reader reader;
-  if (!reader.read_md_files (argv.length (), argv.address (), NULL))
+  if (!reader.read_file (path))
 return false;
 
   return true;
-- 
1.8.5.3

[committed] selftest.c: remove calls to strndup (PR bootstrap/78616)

2016-12-02 Thread David Malcolm

As part of the fix for PR c/78498 I added selftests for xstrndup in
r243030, and seeing the implementation of strndup in libiberty, I
also made the selftests verify strndup.

This turned out to be overzealous, as strndup is not necessarily available
in every configuration when the selftests run.

This patch removes the testing for strndup (but not for xstrndup).

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.
Committed to trunk as r243207 (as "obvious")

gcc/ChangeLog:
PR bootstrap/78616
* selftest.c (selftest::assert_strndup_eq): Rename to...
(selftest::assert_xstrndup_eq): ...this, and remove call to
strndup.
(selftest::test_strndup): Rename to...
(selftest::test_xstrndup): ...this, updating for above renaming.
(selftest::test_libiberty): Update for renaming.
---
 gcc/selftest.c | 40 +---
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/gcc/selftest.c b/gcc/selftest.c
index 6df73c2..40c6cb5 100644
--- a/gcc/selftest.c
+++ b/gcc/selftest.c
@@ -200,41 +200,35 @@ read_file (const location &loc, const char *path)
 
 /* Selftests for libiberty.  */
 
-/* Verify that both strndup and xstrndup generate EXPECTED
-   when called on SRC and N.  */
+/* Verify that xstrndup generates EXPECTED when called on SRC and N.  */
 
 static void
-assert_strndup_eq (const char *expected, const char *src, size_t n)
+assert_xstrndup_eq (const char *expected, const char *src, size_t n)
 {
-  char *buf = strndup (src, n);
-  if (buf)
-ASSERT_STREQ (expected, buf);
-  free (buf);
-
-  buf = xstrndup (src, n);
+  char *buf = xstrndup (src, n);
   ASSERT_STREQ (expected, buf);
   free (buf);
 }
 
-/* Verify that strndup and xstrndup work as expected.  */
+/* Verify that xstrndup works as expected.  */
 
 static void
-test_strndup ()
+test_xstrndup ()
 {
-  assert_strndup_eq ("", "test", 0);
-  assert_strndup_eq ("t", "test", 1);
-  assert_strndup_eq ("te", "test", 2);
-  assert_strndup_eq ("tes", "test", 3);
-  assert_strndup_eq ("test", "test", 4);
-  assert_strndup_eq ("test", "test", 5);
+  assert_xstrndup_eq ("", "test", 0);
+  assert_xstrndup_eq ("t", "test", 1);
+  assert_xstrndup_eq ("te", "test", 2);
+  assert_xstrndup_eq ("tes", "test", 3);
+  assert_xstrndup_eq ("test", "test", 4);
+  assert_xstrndup_eq ("test", "test", 5);
 
   /* Test on an string without zero termination.  */
   const char src[4] = {'t', 'e', 's', 't'};
-  assert_strndup_eq ("", src, 0);
-  assert_strndup_eq ("t", src, 1);
-  assert_strndup_eq ("te", src, 2);
-  assert_strndup_eq ("tes", src, 3);
-  assert_strndup_eq ("test", src, 4);
+  assert_xstrndup_eq ("", src, 0);
+  assert_xstrndup_eq ("t", src, 1);
+  assert_xstrndup_eq ("te", src, 2);
+  assert_xstrndup_eq ("tes", src, 3);
+  assert_xstrndup_eq ("test", src, 4);
 }
 
 /* Run selftests for libiberty.  */
@@ -242,7 +236,7 @@ test_strndup ()
 static void
 test_libiberty ()
 {
-  test_strndup ();
+  test_xstrndup ();
 }
 
 /* Selftests for the selftest system itself.  */
-- 
1.8.5.3

Re: [PATCH] handle integer overflow/wrapping in printf directives (PR 78622)

2016-12-02 Thread Martin Sebor


Thanks for looking at this!  I realize it's dense code and not easy
to make sense out of.


PR middle-end/78622 - [7 Regression]
-Wformat-length/-fprintf-return-value incorrect with overflow/wrapping

gcc/ChangeLog:

PR middle-end/78622
* gimple-ssa-sprintf.c (min_bytes_remaining): Use res.knownrange
rather than res.bounded.
(adjust_range_for_overflow): New function.
(format_integer): Always set res.bounded to true unless either
precision or width is specified and unknown.
Call adjust_range_for_overflow.
(format_directive): Remove vestigial quoting.
(add_bytes): Correct the computation of boundrange used to
decide whether a warning is of a "maybe" or "defnitely" kind.

s/defnitely/definitely/


Will fix.


+/* With the range [*ARGMIN, *ARGMAX] of an integer directive's actual
+   argument, due to the conversion from either *ARGMIN or *ARGMAX to
+   the type of the directive's formal argument it's possible for both
+   to result in the same number of bytes or a range of bytes that's
+   less than the number of bytes that would result from formatting
+   some other value in the range [*ARGMIN, *ARGMAX].  This can be
+   determined by checking for the actual argument being in the range
+   of the type of the directive.  If it isn't it must be assumed to
+   take on the full range of the directive's type.
+   Return true when the range has been adjusted to the range of
+   DIRTYPE, false otherwise.  */

I wish I'd counted how many times I read that.


Let me see if I can word it better.




+
+static bool
+adjust_range_for_overflow (tree dirtype, tree *argmin, tree *argmax)
+{
+  unsigned argprec = TYPE_PRECISION (TREE_TYPE (*argmin));
+  unsigned dirprec = TYPE_PRECISION (dirtype);
+
+  /* The logic here was inspired/lifted from  the CONVERT_EXPR_CODE_P
+ branch in the extract_range_from_unary_expr function in tree-vrp.c.
+  */

Formatting it.  If the comment-close won't fit, then line wrap prior to
the last word in the comment.


Okay.  I didn't remember what the exact rules were.


+
+  if (TREE_CODE (*argmin) == INTEGER_CST
+  && TREE_CODE (*argmax) == INTEGER_CST
+  && (dirprec >= argprec
+  || integer_zerop (int_const_binop (RSHIFT_EXPR,
+ int_const_binop (MINUS_EXPR,
+  *argmax,
+  *argmin),
+ size_int (dirprec)
+  {
+*argmin = force_fit_type (dirtype, wi::to_widest (*argmin), 0,
false);
+*argmax = force_fit_type (dirtype, wi::to_widest (*argmax), 0,
false);
+return false;
+  }

So in this case we're not changing the values, we're just converting
them to a equal or wider type, right?  Thus no adjustment (what about a
sign change?  Is that an adjustment?) and we return false per the
function's specifications.


This casts the value to the destination type, so it may result in
different values.  The important postcondition it preserves is that
the difference ARGMAX - ARGMIN is less than or equal to TYPE_MAX of
the directive's unsigned TYPE.  (If it isn't, the range cannot be
converted.)  This lets us take, say:

  int f (int i)
  {
if (i < 1024 || 1033 < i)
   i = 1024;

return snprintf (0, 0, "%hhi", i);
  }

and fold the function call into 1 because (signed char)1024 is 0 and
(signed char)1033 is 9, and every other value in that same original
range is also in the same range after the conversion.  We know it's
safe because (1033 - 1024 <= UCHAR_MAX) holds.  But in in this:

  int g (int i)
  {
if (i < 1024 || 1289 < i)
   i = 1024;

return snprintf (0, 0, "%hhi", i);
  }

even though (signed char)1024 is 0 and (signed char)1289 is also 9
like above, since (1289 - 1024) is 265 and thus greater than UCHAR_MAX,
folding the result wouldn't be safe (for example, (sighed char)1034
yields 10).

I picture this as a window bounded by the range of the directive's
type sliding within another window bounded by the argument's type:

  [<-[...dirtype...]--->]

the outer brackets delimit the range of the argument and the inner
ones the directive's.  The smaller window can slide left and right
and it can shrink but it can't be wider that the directive's type's
range.


What about overflows when either argmin or argmax won't fit into dirtype
without an overflow?  What's the right behavior there?


That's fine just as long as the property above holds.

I think the algorithm works but the code came from tree-vrp where
there are all sorts of additional checks for some infinities which
VRP distinguishes from type limits like SCHAR_MIN and _MAX.  I don't
fully understand those and so I'd feel better if someone who does
double checked this code.




+
+  tree dirmin = TYPE_MIN_VALUE (dirtype);
+  tree dirmax = TYPE_MAX_VALUE (dirtype);

From this point onward argmin/argmax are either not integers or we're
doing a narrowing conversion, right?  In both cases the fact that we're
doing a narrowing conversion constrains th

Re: [PATCH] Turn on gnu-indirect-function by default on PowerPC servers

2016-12-02 Thread Segher Boessenkool

On Fri, Dec 02, 2016 at 04:09:22PM -0500, Michael Meissner wrote:
> On Thu, Nov 24, 2016 at 01:07:04AM +, Joseph Myers wrote:
> This patch enables --enable-gnu-indirect-function on all PowerPC linux 
> systems,
> except for targetting Android or uclib.  It is exactly the same code as we use
> in the i[34567]86-*-linux, x86_64-*-linux*, s390-*-linux*, and s390x-*-linux*
> systems.  Given the ifunc support is not generated unless the user uses it, it
> should not affect other PowerPC users.
> 
> Is this patch ok to check into the trunk?

Assuming you tested it, okay.  Thanks,


Segher


> 2016-12-02  Michael Meissner  
> 
>   * config.gcc (powerpc*-*-linux*): Set gnu-indirect-function by
>   default on PowerPC linux systems.

1 2 >

1 - 100 of 108 matches

Mail list logo