Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

2019-06-27 Thread Li Jia He




On 2019/6/27 11:48 PM, Jeff Law wrote:

On 6/27/19 12:11 AM, Li Jia He wrote:

Hi,

According to the optimizable case described by Qi Feng on
issue 88784, we can combine the cases into the following:

1. x >  y  &&  x != XXX_MIN  -->   x > y
2. x >  y  &&  x == XXX_MIN  -->   false
3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN

4. x <  y  &&  x != XXX_MAX  -->   x < y
5. x <  y  &&  x == XXX_MAX  -->   false
6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX

7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
8. x <= y  ||  x != XXX_MIN  -->   true
9. x <= y  ||  x == XXX_MIN  -->   x <= y

10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
11. x >= y  ||  x != XXX_MAX  -->   true
12. x >= y  ||  x == XXX_MAX  -->   x >= y

Note: XXX_MIN represents the minimum value of type x.
   XXX_MAX represents the maximum value of type x.

Here we don't need to care about whether the operation is
signed or unsigned.  For example, in the below equation:

'x >  y  &&  x != XXX_MIN  -->   x > y'

If the x type is signed int and XXX_MIN is INT_MIN, we can
optimize it to 'x > y'.  However, if the type of x is unsigned
int and XXX_MIN is 0, we can still optimize it to 'x > y'.

The regression testing for the patch was done on GCC mainline on

 powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regressions.  Is it OK for trunk ?

Thanks,
Lijia He

gcc/ChangeLog

2019-06-27  Li Jia He  
Qi Feng  

PR middle-end/88784
* gimple-fold.c (and_comparisons_contain_equal_operands): New function.
(and_comparisons_1): Use and_comparisons_contain_equal_operands.
(or_comparisons_contain_equal_operands): New function.
(or_comparisons_1): Use or_comparisons_contain_equal_operands.

Would this be better done via match.pd?  ISTM this transformation would
be well suited for that framework.


Hi, Jeff

I did this because of the following test case:
`
_Bool comp(unsigned x, unsigned y)
{
  return x > y && x != 0;
}
`
The gimple file dumped on the power platform is:
`
comp (unsigned int x, unsigned int y)
{
  _Bool D.2837;
  int iftmp.0;

  if (x > y) goto ; else goto ;
  :
  if (x != 0) goto ; else goto ;
  :
  iftmp.0 = 1;
  goto ;
  :
  iftmp.0 = 0;
  :
  D.2837 = (_Bool) iftmp.0;
  return D.2837;
}
`
However, the gimple file dumped on x86 is
`
comp (unsigned int x, unsigned int y)
{
  _Bool D.2837;

  _1 = x > y;
  _2 = x != 0;
  _3 = _1 & _2;
  _4 = (int) _3;
  D.2837 = (_Bool) _4;
  return D.2837;
}
`

The reason for the inconsistency between these two behaviors is param
logical-op-non-short-circuit.  If we add the pattern to the match.pd
file, we can only optimize the situation in which the statement is in
the same basic block (logical-op-non-short-circuit=1, x86).  But for
a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
can't handle this situation.

Another reason is that I found out maybe_fold_and_comparisons and
maybe_fold_or_comparisons are not only called by ifcombine pass but
also by reassoc pass. Using this method can basically unify param
logical-op-non-short-circuit=0 or 1.

Thanks,
Lijia He



jeff





[PATCH] Builtin function roundeven folding implementation

2019-06-27 Thread Tejas Joshi
Hi.
This patch includes implementation of new function roundeven along
with two utility functions. The patch bootstraps on x86_64-linux-gnu
and passes regression tests.

Thanks,
Tejas

gcc/ChangeLog:

2019-06-12  Tejas Joshi  

* builtins.c (mathfn_built_in_2): Added CASE_MATHFN for ROUNDEVEN.
* builtins.def: Added function definitions for roundeven function
variants.
* fold-const-call.c (fold_const_call_ss): Added case for function
call and fold_const_conversion call for roundeven function.
* fold-const.c (negate_mathfn_p): Added case for roundeven function.
(tree_call_nonnegative_warnv_p): Added case for roundeven function.
(integer_valued_real_call_p): Added case for roundeven function.
* real.c (is_even): New function. Returns true if real number is
even, otherwise returns false.
(is_halfway_below): New function. Returns true if real number is
halfway between two integers, else return false.
(real_roundeven): New function. Round real number to nearest
integer, rounding halfway cases towards even.
* real.h (real_value): Added descriptive comments.
Added function declaration for roundeven function.

gcc/testsuite/ChangeLog:

2019-06-12  Tejas Joshi  

* gcc.dg/torture/builtin-round-roundeven.c: New test.
* gcc.dg/torture/builtin-round-roundevenf128.c: New test.
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3463ffb1539..85a945877a4 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2085,6 +2085,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
 CASE_MATHFN (REMQUO)
 CASE_MATHFN_FLOATN (RINT)
 CASE_MATHFN_FLOATN (ROUND)
+CASE_MATHFN (ROUNDEVEN)
 CASE_MATHFN (SCALB)
 CASE_MATHFN (SCALBLN)
 CASE_MATHFN (SCALBN)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 6d41bdb4f44..8bb7027aac7 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -542,12 +542,18 @@ DEF_C99_BUILTIN(BUILT_IN_RINTL, "rintl", BT_FN_LONGDOUBLE_LONGDOUBLE, AT
 #define RINT_TYPE(F) BT_FN_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_RINT, "rint", RINT_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef RINT_TYPE
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVEN, "roundeven", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENF, "roundevenf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENL, "roundevenl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_ROUND, "round", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_ROUNDF, "roundf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_ROUNDL, "roundl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 #define ROUND_TYPE(F) BT_FN_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_ROUND, "round", ROUND_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef ROUND_TYPE
+#define ROUNDEVEN_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_ROUNDEVEN, "roundeven", ROUNDEVEN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef ROUNDEVEN_TYPE
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SCALB, "scalb", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SCALBF, "scalbf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_SCALBL, "scalbl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c
index 702c8b4057a..d9b546e6803 100644
--- a/gcc/fold-const-call.c
+++ b/gcc/fold-const-call.c
@@ -836,6 +836,15 @@ fold_const_call_ss (real_value *result, combined_fn fn,
 	}
   return false;
 
+CASE_CFN_ROUNDEVEN:
+CASE_CFN_ROUNDEVEN_FN:
+  if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
+  {
+real_roundeven (result, format, arg);
+return true;
+  }
+  return false;
+
 CASE_CFN_LOGB:
   return fold_const_logb (result, arg, format);
 
@@ -898,6 +907,10 @@ fold_const_call_ss (wide_int *result, combined_fn fn,
   return fold_const_conversion (result, real_round, arg,
 precision, format);
 
+CASE_CFN_ROUNDEVEN:
+CASE_CFN_ROUNDEVEN_FN:
+  return fold_const_conversion (result, real_roundeven, arg, precision, format);
+
 CASE_CFN_IRINT:
 CASE_CFN_LRINT:
 CASE_CFN_LLRINT:
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 0ca472d422f..07d82a17e25 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -329,6 +329,8 @@ negate_mathfn_p (combined_fn fn)
 CASE_CFN_LLROUND:
 CASE_CFN_LROUND:
 CASE_CFN_ROUND:
+CASE_CFN_ROUNDEVEN:
+CASE_CFN_ROUNDEVEN_FN:
 CASE_CFN_SIN:
 CASE_CFN_SINH:
 CASE_CFN_TAN:
@@ -13063,6 +13065,8 @@ tree_call_nonnegative_warnv_p (tree type, combined_fn fn, tree arg0, tree arg1,
 CASE_CFN_RINT_FN:
 CASE_CFN_ROUND:
 CASE_CFN_ROUND_FN:
+CASE_CFN_ROUNDEVEN:
+CASE_CFN_ROUNDEVEN_FN:
 CASE_CFN_SCALB:
 CASE_CFN_SCALBLN:
 

Re: introduce EH_ELSE tree and gimplifier

2019-06-27 Thread Alexandre Oliva
On Jun 27, 2019, Richard Biener  wrote:

> On Thu, Jun 27, 2019 at 10:18 AM Alexandre Oliva  wrote:

>> @@ -909,6 +909,13 @@ DEFTREECODE (TRY_CATCH_EXPR, "try_catch_expr", 
>> tcc_statement, 2)
>> The second operand is a cleanup expression which is evaluated
>> on any exit (normal, exception, or jump out) from this expression.  */
>> DEFTREECODE (TRY_FINALLY_EXPR, "try_finally", tcc_statement, 2)
>> +
>> +/* Evaluate either the normal or the exceptional cleanup.  This must
>> +   only be present as the cleanup expression in a TRY_FINALLY_EXPR.
>> +   If the TRY_FINALLY_EXPR completes normally, the first operand of
>> +   EH_ELSE is used as a cleanup, otherwise the second operand is
>> +   used.  */
>> +DEFTREECODE (EH_ELSE, "eh_else", tcc_statement, 2)

> It's a bit weird that this is a tcc_statement as opposed to tcc_expression,

Erhm...  What's weird about it?  It is not something that belongs in an
arbitrary expr, it's a lot more like a top-level statement, like
try_finally, having side effects but no useful value.

> also I'd have called it EH_ELSE_EXPR for clarity.

Now *that* would be weird IMHO.  No offense intended, but I'd have
dropped the _EXPR from TRY_FINALLY_EXPR to make it match the
"try_finally" string.

What reasons are there for them to deserve the _EXPR suffix?

If I rename it to EH_ELSE_EXPR, should GIMPLE_EH_ELSE also get a _EXPR
suffix?


> I also wonder why TRY_FINALLY_EXPR didn't just get a third optional
> operand?

I considered that possibility, but I didn't expect EH_ELSE to be used
often enough to bloat every TRY_FINALLY_EXPR with another operand, plus
a flag or some other means to remove ambiguity between same-cleanup and
no-else-cleanup.

-- 
Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara


Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-27 Thread Hongtao Liu
On Thu, Jun 27, 2019 at 5:38 PM Rainer Orth  
wrote:
>
> Hi Hongtao,
>
> > On Thu, Jun 27, 2019 at 5:02 PM Rainer Orth  
> > wrote:
> >>
> >> Hi Hongtao,
> >>
> >> >> as usual, the new effective-target keyword needs documenting in
> >> >> sourcebuild.texi.
> >> > Like this?
> >>
> >> a couple of nits: first of all, your mailer seems to replace tabs by a
> >> single space.  Please fix this or attach the patch instead.
> >>
> >> > Index: ChangeLog
> >> > ===
> >> > --- ChangeLog (revision 272668)
> >> > +++ ChangeLog (working copy)
> >> > @@ -1,3 +1,8 @@
> >> > +2019-06-27  Hongtao Liu  
> >> > +
> >> > + * doc/sourcebuild.texi: Document new effective target keyword
> >> > + avx512vp2intersect.
> >>
> >> Please include the sections you're modifying, something like
> >>
> >> * doc/sourcebuild.texi (Effective-Target Keywords, Other
> >> hardware attributes): Document avx512vp2intersect.
> >>
> >> And please don't include the ChangeLog in the patch, but include it in
> >> the mail proper: it won't apply due to date and context changes anyway.
> >> Best review https://gcc.gnu.org/contribute.html where this is documented
> >> besides other points of patch submission.
> >>
> >> Besides, it's most likely useful to also review the GNU Coding
> >> Standards, too, not only for ChangeLog formatting.
> >>
> >> > Index: testsuite/ChangeLog
> >> > ===
> >> > --- testsuite/ChangeLog (revision 272668)
> >> > +++ testsuite/ChangeLog (working copy)
> >> > @@ -1,3 +1,11 @@
> >> > +2019-06-27  Hongtao Liu  
> >> > +
> >> > + * lib/target-supports.exp: Add
> >> > + check_effective_target_avx512vp2intersect.
> >>
> >> Use
> >>
> >> * lib/target-supports.exp
> >> (check_effective_target_avx512vp2intersect): New proc.
> >>
> >> > + * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Add
> >> > + dg-require-effective-target avx512vp2intersect.
> >>
> >> Better:
> >>
> >> * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Require
> >> avx512vp2intersect.
> >>
> >> but that's a matter of preference.
> >>
> >> > Index: testsuite/gcc.target/i386/avx512vp2intersect-2intersect-1b.c
> >> > ===
> >> > --- testsuite/gcc.target/i386/avx512vp2intersect-2intersect-1b.c
> >> > (revision 272668)
> >> > +++ testsuite/gcc.target/i386/avx512vp2intersect-2intersect-1b.c
> >> > (working copy)
> >> > @@ -1,5 +1,6 @@
> >> >  /* { dg-do run } */
> >> >  /* { dg-options "-O2 -mavx512vp2intersect" } */
> >> > +/* { dg-require-effective-target "avx512vp2intersect" } */
> >>
> >> No need to quote avx512vp2intersect here and in the next test.
> >>
> >> Ok with those nits fixed.
> >>
> > Yes, thanks a lot.
> >> Thanks.
> >> Rainer
> >>
> >> --
> >> -
> >> Rainer Orth, Center for Biotechnology, Bielefeld University
> >
> > Ok for trunk?
>
> ENOPATCH
Sorry, Here is the patch.
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University


Changelog

gcc/
+2019-06-27  Hongtao Liu  
+
+ * doc/sourcebuild.texi (Effective-Target Keywords, Other
+ hardware attributes): Document avx512vp2intersect.
+

gcc/testsuite/
+2019-06-27  Hongtao Liu  
+
+ * lib/target-supports.exp
+ (check_effective_target_avx512vp2intersect): New proc.
+ * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Add
+ dg-require-effective-target avx512vp2intersect.
+ * gcc.target/i386/avx512vp2intersect-2intersectvl-1b.c: Ditto.
+


-- 
BR,
Hongtao
Index: doc/sourcebuild.texi
===
--- doc/sourcebuild.texi	(revision 272667)
+++ doc/sourcebuild.texi	(working copy)
@@ -2046,6 +2046,9 @@
 @item avx512f_runtime
 Target supports the execution of @code{avx512f} instructions.
 
+@item avx512vp2intersect
+Target supports the execution of @code{avx512vp2intersect} instructions.
+
 @item cell_hw
 Test system can execute AltiVec and Cell PPU instructions.
 
Index: testsuite/lib/target-supports.exp
===
--- testsuite/lib/target-supports.exp	(revision 272667)
+++ testsuite/lib/target-supports.exp	(working copy)
@@ -7963,6 +7963,20 @@
 } "-mavx512bw" ]
 }
 
+# Return 1 if avx512vp2intersect instructions can be compiled.
+proc check_effective_target_avx512vp2intersect { } {
+return [check_no_compiler_messages avx512vp2intersect object {
+	typedef int __v16si __attribute__ ((__vector_size__ (64)));
+	typedef short __mmask16;
+	void
+	_mm512_2intersect_epi32 (__v16si __A, __v16si __B, __mmask16 *__U,
+ __mmask16 *__M)
+	{
+	__builtin_ia32_2intersectd512 (__U, __M, (__v16si) __A, (__v16si) __B);
+	}
+} "-mavx512vp2intersect" ]
+}
+
 # Return 1 if avx512ifma 

Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Jeff Law
On 6/27/19 10:12 AM, Jakub Jelinek wrote:
> On Thu, Jun 27, 2019 at 09:15:41AM -0600, Jeff Law wrote:
>> Actually it was trivial to create with store merging.
>>
>> char x[20];
>> foo()
>> {
>>   x[0] = 0x41;
>>   x[1] = 0x42;
>> }
>>
>>   MEM  [(char *)] = 16961;
>>
>> So clearly we can get this in gimple.  So we need a check of some kind,
>> either on the LHS or the RHS to ensure that we actually have a single
>> character store as opposed to something like the above.
> 
> handle_char_store is only called if:
>   tree type = TREE_TYPE (lhs);
>   if (TREE_CODE (type) == ARRAY_TYPE)
> type = TREE_TYPE (type);
>   if (TREE_CODE (type) == INTEGER_TYPE
>   && TYPE_MODE (type) == TYPE_MODE (char_type_node)
>   && TYPE_PRECISION (type) == TYPE_PRECISION (char_type_node))
> {
>   if (! handle_char_store (gsi))
> return false;
> }
> so the above case will not trigger, the only way to call it for multiple
> character stores is if you have an array.  And so far I've succeeded
> creating that just with strcpy builtin.
How I could have missed it so many times is a mystery to me.  I
literally was looking at this on the phone with Martin today for the nth
time and my brain just wouldn't parse this code correctly.

We can only get into handle_char_store for an LHS that is a character or
array of characters.  So the only case we have to worry about would be
if we have a constant array on the RHS.  Something like this from Jakub:

 MEM  [(char *)] = MEM  [(char *)"ab"];

But in this case the RHS is going to be an ARRAY_TYPE.  And thus
Martin's new code would reject it because it would only do the
optmiization if the RHS has INTEGER_TYPE.

So I think my concerns are ultimately a non-issue.

FWIW if you're trying to get something like Jakub's assignment, this
will do it:

  char a[7];
  int f ()
  {
__builtin_strcpy (a, "654321");
  }

Which results in this being passed to handle_char_store:

  MEM  [(char * {ref-all})] = MEM 
[(char * {ref-all})"654321"];

We can make it more problematical by first doing a strcpy into a so that
we create an SI structure.  Something like this:

 char a[7];
  int f ()
  {
strcpy (a, "123");
__builtin_strcpy (a, "654321");
  }

Compiled with -O2 -fno-tree-dse (the -fno-tree-dse ensures the first
strcpy is not eliminated).

That should get us into handle_char_store and into the new code from
Martin's patch that wants to test:



+  if (cmp > 0
+ && storing_nonzero_p
+ && TREE_CODE (TREE_TYPE (rhs)) == INTEGER_TYPE)

But the RHS in this case has an ARRAY_TYPE and Martin's test would
reject it.


So at this point my concerns are alleviated.

Jakub, Richi, do either of you have concerns?  I've kindof owned the
review of this patch from Martin, but since y'all have chimed in, I'd
like to give you another chance to chime in before I ACK.

jeff

ps.  FWIW, the gimple FE seems to really dislike constant strings in MEM
expressions like we've been working with.  It also seems to go into an
infinite loop if you've got an extra closing paren on the RHS..  It was
still useful to poke at things a bit.  One could argue that if we get
the FE to a suitable state that it could define what is and what is not
valid gimple.  That would likely be incredibly help for this kind of stuff.


[PATCH] PowerPC Prefixed Memory, Patch #3, Update predicates

2019-06-27 Thread Michael Meissner
This patch updates the predicates for prefixed addressing.

This patch deletes a predicate that I had originally added, but the code no
longer uses.

This patch changes how local symbols for pc-relative addressing are marked.
Previously, we had used a machine dependent bit in the SYMBOL_REF node.  Now, I
use the SYMBOL_REF_LOCAL_P bit.

This patch adds a predicate for handling external addresses that will be used
for future patches.

I have done bootstrap builds and make check.  There were no regressions.  Can I
check this patch into the trunk?

2019-06-27  Michael Meissner  

* config/rs6000/predicates.md (pcrel_address):  Use
SYMBOL_REF_LOCAL_P to determine if a label is local.
(pcrel_external_address): New predicate.
(non_prefixed_mem_operand): Delete, predicate not used.
* config/rs6000/rs6000.h (SYMBOL_FLAG_PCREL_P): Delete, we now use
SYMBOL_REF_LOCAL_P to determine if we can use pc-relative
addressing.
(SYMBOL_REF_PCREL_P): Likewise.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 272766)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -1645,21 +1645,57 @@ (define_predicate "pcrel_address"
   op = op0;
 }
 
-  return LABEL_REF_P (op) || SYMBOL_REF_PCREL_P (op);
+  if (LABEL_REF_P (op))
+return true;
+
+  return (TARGET_CMODEL == CMODEL_MEDIUM && SYMBOL_REF_P (op)
+ && SYMBOL_REF_LOCAL_P (op));
 })
 
-;; Return 1 if op is a prefixed memory operand
+;; Return true if the operand is an external symbol whose address can be loaded
+;; into a register either via PADDI (if the symbol is in the main program) or
+;; PLD GOT address (if the symbol is defined in a shared library).
+
+(define_predicate "pcrel_external_address"
+  (match_code "symbol_ref,const")
+{
+  if (!TARGET_PCREL || TARGET_CMODEL != CMODEL_MEDIUM)
+return false;
+
+  /* Discard any CONST's.  */
+  if (GET_CODE (op) == CONST)
+op = XEXP (op, 0);
+
+  /* Validate offset.  */
+  if (GET_CODE (op) == PLUS)
+{
+  rtx op0 = XEXP (op, 0);
+  rtx op1 = XEXP (op, 1);
+
+  if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
+   return false;
+
+  op = op0;
+}
+
+  return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
+})
+
+;; Return 1 if op is a prefixed memory operand.
 (define_predicate "prefixed_mem_operand"
   (match_code "mem")
 {
   return rs6000_prefixed_address (XEXP (op, 0), GET_MODE (op));
 })
 
-;; Return 1 if op is a memory operand that is not a prefixed memory
-;; operand.
-(define_predicate "non_prefixed_mem_operand"
-  (and (match_operand 0 "memory_operand")
-   (not (match_operand 0 "prefixed_mem_operand"
+;; Return 1 if op is a memory operand to an external variable when we
+;; support pc-relative addressing and the PCREL_OPT relocation to
+;; optimize references to it.
+(define_predicate "pcrel_external_mem_operand"
+  (match_code "mem")
+{
+  return pcrel_external_address (XEXP (op, 0), Pmode);
+})
 
 ;; Match the first insn (addis) in fusing the combination of addis and loads to
 ;; GPR registers on power8.
Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 272766)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -2550,10 +2550,3 @@ typedef struct GTY(()) machine_function
   IN_RANGE ((VALUE),   \
-(HOST_WIDE_INT_1 << 33),   \
(HOST_WIDE_INT_1 << 33) - 1 - (EXTRA))
-
-/* Flag to mark SYMBOL_REF objects to say they are local addresses and are used
-   in pc-relative addresses.  */
-#define SYMBOL_FLAG_PCREL  SYMBOL_FLAG_MACH_DEP
-
-#define SYMBOL_REF_PCREL_P(X)  \
-  (SYMBOL_REF_P (X) && SYMBOL_REF_FLAGS (X) & SYMBOL_FLAG_PCREL)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Add support for PowerPC prefixed memory instructions and pc-relative support (Intro)

2019-06-27 Thread Michael Meissner
To keep things organized, I'm going to start submitting the patches for for a
possible future PowerPC machine's prefixed addressing (including pc-relative
suport) as threads under this message.

There are two patches that I've already submitted that are needed for the rest
of the patches:

Patch #1
PR Fix floatsi{sf,df}2 and floatunssi{sf,df}2 for a future powerpc machine
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01791.html

Patch #2
Prepare for prefixed instructions on PowerPC
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01812.html

I am trying to break the patches into smaller related pieces, instead of just
submitting a combination patch.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] Add --disable-tm-clone-registry libgcc configure option.

2019-06-27 Thread Jim Wilson
On Thu, Jun 20, 2019 at 12:50 PM Jim Wilson  wrote:
> This looks OK to me.  It is worth pointing out that ARM already ships
> compilers built this way, but they didn't bother adding a configure
> option.  They just override Makefile variables in their build scripts.
> I think this is much cleaner as a documented configure option.

It has been a week, and no one else commented on this, so I went ahead
and committed it, with the regenerated configure.

Jim


Re: [PATCH 1/3] C++20 constexpr lib part 1/3

2019-06-27 Thread Jonathan Wakely

On 27/06/19 19:07 -0400, Ed Smith-Rowland wrote:

On 6/27/19 1:06 PM, Ville Voutilainen wrote:

On Thu, 27 Jun 2019 at 19:55, Ed Smith-Rowland via libstdc++
 wrote:

I don't think this will work in a constant expression:

?? /// Assign @p __new_val to @p __obj and return its previous value.
?? template 
+?? _GLIBCXX20_CONSTEXPR
?? inline _Tp
?? exchange(_Tp& __obj, _Up&& __new_val)
?? { return std::__exchange(__obj, std::forward<_Up>(__new_val)); }

Because std::__exchange hasn't been marked constexpr. The test passes
because it doesn't call it in a context that requires constant
evaluation:

??const auto x = std::exchange(e, pi);

Derp.

I see the same problem in other tests too:

+?? constexpr std::array car{{0, 1, 2, 3, 4, 5, 6, 6, 8, 9, 9,
11}};
+
+?? const auto out0x = std::adjacent_find(car.begin(), car.end());
+
+?? const auto out1x = std::adjacent_find(car.begin(), car.end(),
+?? std::equal_to())

I will go through all the tests and make sure that some nontrivial
calculation is done that contributes to a final bool return.?? And clean
up the mess.?? I did this in chunk 2 but I guess I left a lot of chunk 1.

As was the case with the iterator patch, it's not a question of doing
a nontrivial calculation, but
a question of initializing a constexpr variable with the result. (Yeah
sure a non-type template
argument would also work but eurgh). Then, and only then, have we
proven that the code
works in a constexpr context. Initializing a const doesn't do that.
constinit would, but that's
C++20.


Ok, why isn't

?? static_assert(test());

a constexpr context?


It is, I missed that the array test does that at the bottom of the
file.


The std::exchange test passed originally because, unlike all the other 
algo tests I had neglected to call the test function in a constexpr 
context.


Note that constexpr_iter.c is this:



constexpr int
test()
{
?? constexpr std::array a1{{1, 2, 3}};
?? static_assert(1 == *a1.begin());
?? auto n = a1[0] * a1[1]* a1[2];
?? static_assert(1 == *a1.cbegin());

?? std::array a2{{0, 0, 0}};
?? auto a1i = a1.begin();
?? auto a1e = a1.end();
?? auto a2i = a2.begin();
?? while (a1i != a1e)
?? *a2i++ = *a1i++;

?? return n;
}

void
run_test()
{
?? constexpr int n = test();
}



Things inside the function can, and in many cases for this capability 
must, be mutable.?? As long as the input and the final output is a 
literal we should be good, no?


Also when I assign returned iterators to constexpr I get:

/home/ed/gcc_git/libstdc++-v3/testsuite/25_algorithms/find_if_not/constexpr.cc:36: 
error: '(((std::array::const_pointer)(& ca0.std::array12>::_M_elems)) + 28)' is not a constant expression.


Yes, I tried adding that and got the same error, which is why I
thought the test had the same problem as the std::exchange one.

I'm not sure what causes that error, I'll take a look tomorrow.




Re: [PATCH 1/3] C++20 constexpr lib part 1/3

2019-06-27 Thread Ed Smith-Rowland via gcc-patches

On 6/27/19 1:06 PM, Ville Voutilainen wrote:

On Thu, 27 Jun 2019 at 19:55, Ed Smith-Rowland via libstdc++
 wrote:

I don't think this will work in a constant expression:

?? /// Assign @p __new_val to @p __obj and return its previous value.
?? template 
+?? _GLIBCXX20_CONSTEXPR
?? inline _Tp
?? exchange(_Tp& __obj, _Up&& __new_val)
?? { return std::__exchange(__obj, std::forward<_Up>(__new_val)); }

Because std::__exchange hasn't been marked constexpr. The test passes
because it doesn't call it in a context that requires constant
evaluation:

??const auto x = std::exchange(e, pi);

Derp.

I see the same problem in other tests too:

+?? constexpr std::array car{{0, 1, 2, 3, 4, 5, 6, 6, 8, 9, 9,
11}};
+
+?? const auto out0x = std::adjacent_find(car.begin(), car.end());
+
+?? const auto out1x = std::adjacent_find(car.begin(), car.end(),
+?? std::equal_to())

I will go through all the tests and make sure that some nontrivial
calculation is done that contributes to a final bool return.?? And clean
up the mess.?? I did this in chunk 2 but I guess I left a lot of chunk 1.

As was the case with the iterator patch, it's not a question of doing
a nontrivial calculation, but
a question of initializing a constexpr variable with the result. (Yeah
sure a non-type template
argument would also work but eurgh). Then, and only then, have we
proven that the code
works in a constexpr context. Initializing a const doesn't do that.
constinit would, but that's
C++20.


Ok, why isn't

?? static_assert(test());

a constexpr context?

The std::exchange test passed originally because, unlike all the other 
algo tests I had neglected to call the test function in a constexpr context.


Note that constexpr_iter.c is this:



constexpr int
test()
{
?? constexpr std::array a1{{1, 2, 3}};
?? static_assert(1 == *a1.begin());
?? auto n = a1[0] * a1[1]* a1[2];
?? static_assert(1 == *a1.cbegin());

?? std::array a2{{0, 0, 0}};
?? auto a1i = a1.begin();
?? auto a1e = a1.end();
?? auto a2i = a2.begin();
?? while (a1i != a1e)
?? *a2i++ = *a1i++;

?? return n;
}

void
run_test()
{
?? constexpr int n = test();
}



Things inside the function can, and in many cases for this capability 
must, be mutable.?? As long as the input and the final output is a 
literal we should be good, no?


Also when I assign returned iterators to constexpr I get:

/home/ed/gcc_git/libstdc++-v3/testsuite/25_algorithms/find_if_not/constexpr.cc:36: 
error: '(((std::array::const_pointer)(& ca0.std::array12>::_M_elems)) + 28)' is not a constant expression.




Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Jeff Law
On 6/27/19 12:40 PM, Richard Biener wrote:
> On June 27, 2019 7:04:32 PM GMT+02:00, Jakub Jelinek  wrote:
>> On Thu, Jun 27, 2019 at 10:58:25AM -0600, Martin Sebor wrote:
>>> The LHS is unsigned short so handle_char_store would not be called
>>> because of the check in the caller.  You would need something like:
>>>
>>>   MEM  [(char *)] = { 'a', 'b' };
>>
>> This is invalid, because the rhs is non-empty CONSTRUCTOR that doesn't
>> have
>> VECTOR_TYPE - verify_gimple_assign_single has:
>>case CONSTRUCTOR:
>>  if (TREE_CODE (rhs1_type) == VECTOR_TYPE)
>> ...
>>  else if (CONSTRUCTOR_NELTS (rhs1) != 0)
>>{
>>  error ("non-vector %qs with elements", code_name);
>>  debug_generic_stmt (rhs1);
>>  return true;
>>}
>>
>> But
>>  MEM  [(char *)] = MEM  [(char *)"ab"];
>> is valid.  Or = {} would be valid too, even for array stores.
> 
> And you can of course write GIMPLE unit tests for the pass using the GIMPLE 
> FE. 
Yea.  And this may ultimately be the way we should think about testing
this kind of stuff where GIMPLE might allow something that is impossible
to express in C or C++.

Realistically I don't think any of us can be expected to know the dusty
corners of every language.  But we ought to be able to reason about
what's valid gimple and write tests using the gimple front-end to verify
how a particular construct would be handled.

jeff


[committed] Fix -Wimplicit-fallthrough with [[{,un}likely]] attributes on labels (PR c++/91024)

2019-06-27 Thread Jakub Jelinek
Hi!

The likely/unlikely C++11 attributes on case labels result in GIMPLE_PREDICT
statements inserted after the label; we should just ignore such statements,
they aren't something executable in between the labels.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk,
queued for backporting to 9.2.

2019-06-27  Jakub Jelinek  

PR c++/91024
* gimplify.c (collect_fallthrough_labels): Ignore GIMPLE_PREDICT
statements.

* g++.dg/warn/Wimplicit-fallthrough-4.C: New test.

--- gcc/gimplify.c.jj   2019-06-15 09:08:03.383953216 +0200
+++ gcc/gimplify.c  2019-06-27 20:03:31.943295216 +0200
@@ -2120,6 +2120,8 @@ collect_fallthrough_labels (gimple_stmt_
}
   else if (gimple_call_internal_p (gsi_stmt (*gsi_p), IFN_ASAN_MARK))
;
+  else if (gimple_code (gsi_stmt (*gsi_p)) == GIMPLE_PREDICT)
+   ;
   else if (!is_gimple_debug (gsi_stmt (*gsi_p)))
prev = gsi_stmt (*gsi_p);
   gsi_next (gsi_p);
--- gcc/testsuite/g++.dg/warn/Wimplicit-fallthrough-4.C.jj  2019-06-27 
20:13:08.557572047 +0200
+++ gcc/testsuite/g++.dg/warn/Wimplicit-fallthrough-4.C 2019-06-27 
20:14:12.399606229 +0200
@@ -0,0 +1,22 @@
+// PR c++/91024
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wimplicit-fallthrough" }
+
+int
+foo (char c)
+{
+  int result = 0;
+
+  switch (c)
+{
+case 'O':
+case 'K':
+  return result;
+[[unlikely]] case 'X': // { dg-bogus "this statement may fall through" 
}
+case 'x':  // { dg-bogus "here" }
+  return result;
+default:
+  break;
+}
+  return result;
+}

Jakub


[committed] Fix ICE in scan_operand_equal_p (PR tree-optimization/91010)

2019-06-27 Thread Jakub Jelinek
Hi!

As the testcase shows, offset{1,2} can be NULL and operand_equal_p doesn't
like NULL arguments.  If both are NULL, we should return true, if just one,
we should return false.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-06-27  Jakub Jelinek  

PR tree-optimization/91010
* tree-vect-stmts.c (scan_operand_equal_p): If offset1 == offset2,
return true.  Otherwise, don't call operand_equal_p if offset1 or
offset2 is NULL and just return false.

* g++.dg/vect/simd-10.cc: New test.

--- gcc/tree-vect-stmts.c.jj2019-06-25 08:58:02.019417742 +0200
+++ gcc/tree-vect-stmts.c   2019-06-27 10:20:09.658616378 +0200
@@ -6347,7 +6347,10 @@ scan_operand_equal_p (tree ref1, tree re
 return false;
   if (maybe_ne (bitsize1, bitsize2))
 return false;
-  if (!operand_equal_p (offset1, offset2, 0))
+  if (offset1 != offset2
+  && (!offset1
+ || !offset2
+ || !operand_equal_p (offset1, offset2, 0)))
 return false;
   return true;
 }
--- gcc/testsuite/g++.dg/vect/simd-10.cc.jj 2019-06-27 10:12:44.828471845 
+0200
+++ gcc/testsuite/g++.dg/vect/simd-10.cc2019-06-27 10:24:52.152262921 
+0200
@@ -0,0 +1,8 @@
+// PR tree-optimization/91010
+// { dg-do compile }
+// { dg-require-effective-target size32plus }
+// { dg-additional-options "-fopenmp-simd -fno-tree-forwprop" }
+// { dg-additional-options "-mavx" { target avx_runtime } }
+// { dg-final { scan-tree-dump-times "vectorized \[1-3] loops" 2 "vect" { 
target i?86-*-* x86_64-*-* } } }
+
+#include "simd-5.cc"

Jakub


Re: [PATCH] don't trim empty string initializers for pointers (PR 90947)

2019-06-27 Thread Jason Merrill

On 6/23/19 5:51 PM, Martin Sebor wrote:

On 6/22/19 9:37 PM, Jason Merrill wrote:

On 6/21/19 8:05 PM, Martin Sebor wrote:

The solution we implemented in GCC 9 to get the mangling of
non-type template arguments of class types containing array
members consistent regardless of the form of their
initialization introduced a couple of bugs.  One of these
is the subject of this patch.  The bug results in stripping
trailing initializers for array elements that involve the empty
string such as in here:

   void f (void)
   {
 const char* a[1][1] = { "" };
 if (!a[0][0])
   __builtin_abort ();
   }

The problem is caused by relying on initializer_zerop() that
returns true for the empty string regardless of whether it's
used to initialize an array or a pointer (the function doesn't
know what the initializer is being used for).


Why doesn't the existing POINTER_TYPE_P check handle this?  It's 
clearly intended to.


It does but only only for initializers of "leaf" elements/members.
The function processes initializers of enclosing elements or members
recursively.  When initializer_zerop returns true for one of those,
it doesn't differentiate between an empty string that's being used
to initialize a leaf array or a leaf pointer.

For the example above, the first time through, elt_type is char*
and elt_init is the empty string, or the array char[1], and so
the initializer is retained.

The second time through (after the leaf recursive call returns),
elt_init is { "" } (i.e., an array of one empty string), elt_type
is also an array, and initializer_zerop(elt_init) returns true, so
the initializer is dropped.

I'm actually surprised this initializer_zerop ambiguity between
arrays and strings doesn't come up elsewhere.  That's also why
I added the new function to tree.c.


Makes sense.


Btw., it belatedly occurred to me that the new function doesn't
correctly handled unnamed bit-fields so I've corrected it to
handle those as well.  Attached is the revised patch with this
change and a test to exercise it.



+|| (DECL_BIT_FIELD (fld) && !!DECL_NAME (fld)))


Hmm, this looks like it would skip named bit-fields rather than unnamed.


PS I found DECL_UNNAMED_BIT_FIELD in c-family/c-common.h.
I used (DECL_BIT_FIELD (fld) && !DECL_NAME (fld)) because
the former macro isn't available in tree.c.  Does that mean
that that would make the new function not completely correct
in some other languages?


I don't know if other languages treat unnamed bit-fields as initializable.


+/* Analogous to initializer_zerop but examines the type for which
+   the initializer is being used but unlike it, considers empty
+   strings to be zero initializers for arrays and non-zero for
+   pointers.  */


This comment could use more editing.

Jason


Re: [C++ Patch] PR 90909 ("[10 Regression] call devirtualized to pure virtual")

2019-06-27 Thread Jason Merrill

On 6/24/19 4:52 AM, Paolo Carlini wrote:

Hi,

On 23/06/19 19:45, Jason Merrill wrote:

On 6/23/19 7:53 AM, Paolo Carlini wrote:

... hi again ;)

The other day I was having a look at using declarations for this 
issue and noticed that only a few lines below the de-virtualization 
check we have to handle functions found by a using declaration, for 
various reasons. In particular, we know whether we found a function 
fn where has been declared or in a derived class. Thus the idea: for 
the purpose of making some progress, in particular all the cases in 
c++/67184 & co, would it make sense for the time being to simply add 
a check to the de-virtualization condition restricting it to 
non-using declarations? See the below (it also moves the conditional 
a few lines below only for clarity and consistency with the code 
handling using declarations, no functional impact) What do you think?


Hmm, perhaps we should check CLASSTYPE_FINAL in 
resolves_to_fixed_type_p rather than in build_over_call at all; then 
the code in build_new_method_call ought to set LOOKUP_NONVIRTUAL when 
appropriate.


I think your suggestion has to do with the initial implementation of 
this optimization, as contributed by my friend Roberto Agostino: we had 
the issue that it didn't handle at all user-defined operators and 
Vincenzo filed c++/53186. Thus, upon your suggestion, we moved the code 
to build_over_call, the current place:


     https://gcc.gnu.org/ml/gcc-patches/2012-05/msg00246.html

where it catched both member functions and operators. Now - before we 
get to the details - if I move the CLASSTYPE_FINAL check to 
resolves_to_fixed_type_p we exactly regress on c++/53186, that is 
other/final2.C, because resolves_to_fixed_type_p is *never* called. The 
pending final4.C, also involving operators (I constructed it exactly 
because I knew operators could be tricky) is also not fixed, but in that 
case at least resolves_to_fixed_type_p *is* called, only, too late (I 
think, I more details later, if you like).


Ah, thanks.  Then perhaps we want to change the CLASSTYPE_FINAL in 
build_over_call to resolves_to_fixed_type_p (arg), to also handle the 
other reasons we might know the dynamic type of the argument, and remove 
the related code from build_new_method_call_1?


You could avoid needing to move the conditional lower by comparing 
DECL_CONTEXT (fn) and BINFO_TYPE (cand->conversion_path) rather than 
parmtype and TREE_TYPE (converted_arg).


Jason


[C++ PATCH] PR c++/55442 - memory-hog with highly recursive constexpr.

2019-06-27 Thread Jason Merrill
This testcase in the PR is extremely recursive, and therefore uses a huge
amount of memory on caching the results of individual calls.  We no longer
need to track all calls to catch infinite recursion, as we have other limits
on maximum depth and operations count.  So let's only cache a few calls at
the top level: 8 seems to be a reasonable compromise.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/c-family/
* c.opt (fconstexpr-loop-limit): New.
gcc/cp/
* constexpr.c (push_cx_call_context): Return depth.
(cxx_eval_call_expression): Don't cache past constexpr_cache_depth.
---
 gcc/doc/invoke.texi| 16 ++--
 gcc/c-family/c.opt |  4 
 gcc/cp/constexpr.c | 29 ++---
 gcc/c-family/ChangeLog |  5 +
 gcc/cp/ChangeLog   |  6 ++
 5 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 03d2895c9d7..664c38f78af 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -209,8 +209,9 @@ in the following sections.
 @xref{C++ Dialect Options,,Options Controlling C++ Dialect}.
 @gccoptlist{-fabi-version=@var{n}  -fno-access-control @gol
 -faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new @gol
--fconstexpr-depth=@var{n}  -fconstexpr-loop-limit=@var{n} @gol
--fconstexpr-ops-limit=@var{n} -fno-elide-constructors @gol
+-fconstexpr-depth=@var{n}  -fconstexpr-cache-depth=@var{n} @gol
+-fconstexpr-loop-limit=@var{n}  -fconstexpr-ops-limit=@var{n} @gol
+-fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
 -fno-gnu-keywords @gol
 -fno-implicit-templates @gol
@@ -2527,6 +2528,17 @@ to @var{n}.  A limit is needed to detect endless 
recursion during
 constant expression evaluation.  The minimum specified by the standard
 is 512.
 
+@item -fconstexpr-cache-depth=@var{n}
+@opindex fconstexpr-cache-depth
+Set the maximum level of nested evaluation depth for C++11 constexpr
+functions that will be cached to @var{n}.  This is a heuristic that
+trades off compilation speed (when the cache avoids repeated
+calculations) against memory consumption (when the cache grows very
+large from highly recursive evaluations).  The default is 8.  Very few
+users are likely to want to adjust it, but if your code does heavy
+constexpr calculations you might want to experiment to find which
+value works best for you.
+
 @item -fconstexpr-loop-limit=@var{n}
 @opindex fconstexpr-loop-limit
 Set the maximum number of iterations for a loop in C++14 constexpr functions
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index a4cf3bd623d..080066fa608 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1424,6 +1424,10 @@ fconstexpr-depth=
 C++ ObjC++ Joined RejectNegative UInteger Var(max_constexpr_depth) Init(512)
 -fconstexpr-depth= Specify maximum constexpr recursion depth.
 
+fconstexpr-cache-depth=
+C++ ObjC++ Joined RejectNegative UInteger Var(constexpr_cache_depth) Init(8)
+-fconstexpr-cache-depth=   Specify maximum constexpr recursion 
cache depth.
+
 fconstexpr-loop-limit=
 C++ ObjC++ Joined RejectNegative UInteger Var(constexpr_loop_limit) 
Init(262144)
 -fconstexpr-loop-limit=Specify maximum constexpr loop 
iteration count.
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index a3a36d09d5e..d11e7af3eb1 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1447,16 +1447,17 @@ static vec call_stack;
 static int call_stack_tick;
 static int last_cx_error_tick;
 
-static bool
+static int
 push_cx_call_context (tree call)
 {
   ++call_stack_tick;
   if (!EXPR_HAS_LOCATION (call))
 SET_EXPR_LOCATION (call, input_location);
   call_stack.safe_push (call);
-  if (call_stack.length () > (unsigned) max_constexpr_depth)
+  int len = call_stack.length ();
+  if (len > max_constexpr_depth)
 return false;
-  return true;
+  return len;
 }
 
 static void
@@ -1587,7 +1588,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
   tree fun = get_function_named_in_call (t);
   constexpr_call new_call
 = { NULL, NULL, NULL, 0, ctx->manifestly_const_eval };
-  bool depth_ok;
+  int depth_ok;
 
   if (fun == NULL_TREE)
 return cxx_eval_internal_function (ctx, t, lval,
@@ -1791,14 +1792,20 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, 
tree t,
   entry = *slot;
   if (entry == NULL)
{
- /* We need to keep a pointer to the entry, not just the slot, as the
-slot can move in the call to cxx_eval_builtin_function_call.  */
- *slot = entry = ggc_alloc ();
- *entry = new_call;
- fb.preserve ();
+ /* Only cache up to constexpr_cache_depth to limit memory use.  */
+ if (depth_ok < constexpr_cache_depth)
+   {
+ /* We need to keep a pointer to the entry, not just the slot, as
+the slot can move during evaluation of the body.  */
+ *slot = entry = ggc_alloc ();
+ *entry = new_call;
+ 

[PATCH, obvious] gdbhooks.py: rename parameters to match usage

2019-06-27 Thread Vladislav Ivanishin
Hi,

The parameter names here were in the wrong order (which doesn't matter
to the interpreter, but confuses the humans reading the calling code).

I am going to install the following patch soon.

gcc/ChangeLog:

* gdbhooks.py (GdbPrettyPrinters.add_printer_for_types): Reorder
parameter names in accord with usage (no functional change).
(GdbPrettyPrinters.add_printer_for_regex): Ditto.
---
 gcc/gdbhooks.py | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index 7b1a7be0002..15a5ceaa56f 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -521,11 +521,11 @@ class GdbPrettyPrinters(gdb.printing.PrettyPrinter):
 def __init__(self, name):
 super(GdbPrettyPrinters, self).__init__(name, [])
 
-def add_printer_for_types(self, name, class_, types):
-self.subprinters.append(GdbSubprinterTypeList(name, class_, types))
+def add_printer_for_types(self, types, name, class_):
+self.subprinters.append(GdbSubprinterTypeList(types, name, class_))
 
-def add_printer_for_regex(self, name, class_, regex):
-self.subprinters.append(GdbSubprinterRegex(name, class_, regex))
+def add_printer_for_regex(self, regex, name, class_):
+self.subprinters.append(GdbSubprinterRegex(regex, name, class_))
 
 def __call__(self, gdbval):
 type_ = gdbval.type.unqualified()
-- 
2.22.0


-- 
Vlad


[PATCH] Prepare for prefixed instructions on PowerPC

2019-06-27 Thread Michael Meissner
A future PowerPC machine may have prefixed instructions.  This patch changes
the RTL "length" attribute of all of the "mov*", and "*extend*" insns from an
explicit "4" to "*".  This change prepares for the length attribute to be set
appropriately for single instruction loads, stores, and add immediates that may
include prefixed instructions.  The idea is to do this patch now, rather than
having these lines be part of the larger patches.

As we discussed off-line earlier, I changed all of the "4" lengths to be "*",
even for instruction alternatives that would not be subject to being changed to
be a prefixed instruction (i.e. the sign extend instruction "extsw"'s length is
set to "*", even though it would not be a prefixed instruction).  I also
changed the Altivec load/store instructions, as well as a power8 permute
instruction just be consistant with the other changes.

This patch does not fix the places where the length is not "4".  Those lengths
will be updated as needed as the rest of the prefixed instruction support is
rolled out.  It is intended to be a simple patch to make the other patches
somewhat simpler.

I did a bootstrap and make check.  There were no regressions.  Can I check this
patch into the trunk?

2019-06-27  Michael Meissner  

* config/rs6000/altivec.md (altivec_mov, VM2 iterator):
Change the RTL attribute "length" from "4" to "*" to allow the
length attribute to be adjusted automatically for prefixed load,
store, and add immediate instructions.
* config/rs6000/rs6000.md (extendhi2, EXTHI iterator):
Likewise.
(extendsi2, EXTSI iterator): Likewise.
(movsi_internal1): Likewise.
(movsi_from_sf): Likewise.
(movdi_from_sf_zero_ext): Likewise.
(mov_internal): Likewise.
(movcc_internal1, QHI iterator): Likewise.
(mov_softfloat, FMOVE32 iterator): Likewise.
(movsf_from_si): Likewise.
(mov_hardfloat32, FMOVE64 iterator): Likewise.
(mov_softfloat64, FMOVE64 iterator): Likewise.
(mov, FMOVE128 iterator): Likewise.
(movdi_internal64): Likewise.
* config/rs6000/vsx.md (vsx_le_permute_, VSX_TI iterator):
Likewise.
(vsx_le_undo_permute_, VSX_TI iterator): Likewise.
(vsx_mov_64bit, VSX_M iterator): Likewise.
(vsx_mov_32bit, VSX_M iterator): Likewise.
(vsx_splat_v4sf): Likewise.

Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md(revision 272714)
+++ gcc/config/rs6000/altivec.md(working copy)
@@ -256,7 +256,7 @@ (define_insn "*altivec_mov"
* return output_vec_const_move (operands);
#"
   [(set_attr "type" "vecstore,vecload,veclogical,store,load,*,veclogical,*,*")
-   (set_attr "length" "4,4,4,20,20,20,4,8,32")])
+   (set_attr "length" "*,*,*,20,20,20,*,8,32")])
 
 ;; Unlike other altivec moves, allow the GPRs, since a normal use of TImode
 ;; is for unions.  However for plain data movement, slightly favor the vector
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 272719)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -965,7 +965,7 @@ (define_insn "*extendhi2"
vextsh2d %0,%1"
   [(set_attr "type" "load,exts,fpload,vecperm")
(set_attr "sign_extend" "yes")
-   (set_attr "length" "4,4,8,4")
+   (set_attr "length" "*,*,8,*")
(set_attr "isa" "*,*,p9v,p9v")])
 
 (define_split
@@ -1040,7 +1040,7 @@ (define_insn "extendsi2"
#"
   [(set_attr "type" "load,exts,fpload,fpload,mffgpr,vecexts,vecperm,mftgpr")
(set_attr "sign_extend" "yes")
-   (set_attr "length" "4,4,4,4,4,4,8,8")
+   (set_attr "length" "*,*,*,*,*,*,8,8")
(set_attr "isa" "*,*,p6,p8v,p8v,p9v,p8v,p8v")])
 
 (define_split
@@ -6915,11 +6915,11 @@ (define_insn "*movsi_internal1"
 veclogical, veclogical,  vecsimple,   mffgpr,  mftgpr,
 *,  *,   *")
(set_attr "length"
-   "4,  4,   4,   4,   4,
-4,  4,   4,   4,   4,
-8,  4,   4,   4,   4,
-4,  4,   8,   4,   4,
-4,  4,   4")
+   "*,  *,   *,   *,   *,
+*,  *,   *,   *,   *,
+8,  *,   *,   *,   *,
+*,  *,   8,   *,   *,
+*,  *,   *")
(set_attr "isa"
"*,  *,   *,   p8v, p8v,
 *,  p8v, p8v, *,   *,
@@ -6995,9 +6995,9 @@ (define_insn_and_split "movsi_from_sf"
 fpstore,fpstore, 

RFC: GCC Aarch64 SIMD vectorization question involving libmvec

2019-06-27 Thread Steve Ellcey
I am testing the latest GCC with not-yet-submitted GLIBC changes that
implement libmvec on Aarch64.

While trying to run SPEC 2017 (specifically 521.wrf_r) I ran into a
case where GCC was generating a call to _ZGVnN2vv_powf, that is a
vectorized powf call for 2 (not 4) elements.  This was a problem
because I only implemented a 4 element 32 bit vectorized powf function
for libmvec and not a 2 element version.

I think this is due to aarch64_simd_clone_compute_vecsize_and_simdlen
which allows for (element count * element size) to be either 64
or 128.

I would like some thoughts on what we should do about this, should
we require glibc/libmvec to provide 2 element 32 bit floating point
vector functions (as well as the 4 element ones) or should we change
aarch64_simd_clone_compute_vecsize_and_simdlen to only allow 4
element (128 total bit size) vectors and not 2 element (64 total bit
size) ones?

This is obviously a question for the pre-SVE vector instructions,
I am not sure how this would be handled in SVE.

Steve Ellcey
sell...@marvell.com

P.S.  Here a test case in Fortran that generated the 2 element
  vector call.  It unrolled the loop into one vector call
  of 2 elements and one scalar call.

  SUBROUTINE FOO(B,W,P)
  REAL, DIMENSION (3) :: W, P
  DO 10 I = 1, 3
  P(I) = W(I) ** B
10CONTINUE
  END SUBROUTINE FOO


Re: [PATCH] Fix 2 clang warnings.

2019-06-27 Thread NightStrike
On Thu, Jun 27, 2019 at 11:16 AM Martin Sebor  wrote:
>
> On 6/27/19 8:03 AM, Martin Liška wrote:
> > Hi.
> >
> > This reduces 2 warnings reported by clang.
> >
> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >
> > Ready to be installed?
> > Thanks,
> > Martin
> >
> > gcc/ChangeLog:
> >
> > 2019-06-27  Martin Liska  
> >
> >   * edit-context.c (test_applying_fixits_unreadable_file): Do not
> >   use () for a constructor call.
> >   (test_applying_fixits_line_out_of_range): Likewise.
> >   * ggc-page.c (free_page): Use (char *) for %p printf format
> >   argument.
> > ---
> >   gcc/edit-context.c | 4 ++--
> >   gcc/ggc-page.c | 2 +-
> >   2 files changed, 3 insertions(+), 3 deletions(-)
>
> Just as a side note:
>
> diff --git a/gcc/edit-context.c b/gcc/edit-context.c
> index d3246ab0334..93d10664ae9 100644
> --- a/gcc/edit-context.c
> +++ b/gcc/edit-context.c
> @@ -1639,7 +1639,7 @@ static void
>   test_applying_fixits_unreadable_file ()
>   {
> const char *filename = "this-does-not-exist.txt";
> -  line_table_test ltt ();
> +  line_table_test ltt;
> linemap_add (line_table, LC_ENTER, false, filename, 1);
>
> location_t loc = linemap_position_for_column (line_table, 1);
>
> I'm guessing the warning above is about declaring a function
> returning line_table_test rather than declaring an object of
> the type.  (It would be helpful to show those warnings when
> fixing them.  A small test case confirms it's -Wvexing-parse:
> warning: empty parentheses interpreted as a function declaration.)
>
> This is a common mistake to make so it would be a useful
> warning to add to GCC as well.  And in fact, I see at least
> one request for it in Bugzilla: 25814 (with the related
> pr86564).
>
> Martin

Realistically, if it's worth fixing any warning that clang outputted
and that GCC doesn't. then perhaps the right fix is to add the
warning to GCC.  Or, if it's there but not in -Wall, add it to -Wall.
You're pointing out this specific case, but there've been several of
these patches that are all different.


Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Martin Sebor

On 6/27/19 11:04 AM, Jakub Jelinek wrote:

On Thu, Jun 27, 2019 at 10:58:25AM -0600, Martin Sebor wrote:

The LHS is unsigned short so handle_char_store would not be called
because of the check in the caller.  You would need something like:

   MEM  [(char *)] = { 'a', 'b' };


This is invalid, because the rhs is non-empty CONSTRUCTOR that doesn't have
VECTOR_TYPE - verify_gimple_assign_single has:
 case CONSTRUCTOR:
   if (TREE_CODE (rhs1_type) == VECTOR_TYPE)
...
   else if (CONSTRUCTOR_NELTS (rhs1) != 0)
 {
   error ("non-vector %qs with elements", code_name);
   debug_generic_stmt (rhs1);
   return true;
 }

But
   MEM  [(char *)] = MEM  [(char *)"ab"];


Thanks.  A test case that uses this is below.  It's handled correctly
because the RHS is not all non-zero bytes (storing_nonzero_p is false.

The block Jeff is concerned about is the one below:

  else if (storing_nonzero_p
   && cmp > 0
   && TREE_CODE (TREE_TYPE (rhs)) == INTEGER_TYPE)
{
  gsi_next (gsi);
  return false;

so what would be needed to show an outstanding problem even with
the patch applied (i.e., the INTEGER_TYPE check) is a RHS that's
interpreted as all nonzero bytes despite having a nul byte in its
representation.  I.e., an integer that initializer_zerop() sets
the second argument to true for, or tree_expr_nonzero_p (RHS)
returns true for.

Both initializer_zerop and tree_expr_nonzero_p give up on this
representation even though they could handle it because it's
a constant.  Handling it would be an improvement but it would
still return false.  What we need is a scalar for the RHS that
is non-zero with a nul byte in it.  That would do it, but I
don't know how to create it or if it's even possible.

Martin


  char b[10];

  const char a[2] = "4";

  int f (const char *s)
  {
__builtin_strcpy (b, "123");

int i = __builtin_strcmp (s, b);

// MEM  [(char * {ref-all})]
//   = MEM  [(char * {ref-all})];
__builtin_memcpy (b, a, 2);

if (__builtin_strlen (b) != 1)
  __builtin_abort ();

return i;
  }


[Darwin, PPC, committed] Allow the user to override the use of hard float in kexts.

2019-06-27 Thread Iain Sandoe
The default for the kernel is soft-float, however a user writing a kernel
extension might want to make use of hard float.  This change makes
" -mkernel -mhard-float " work as expected.

tested on powerpc-darwin9
applied to mainline
thanks
Iain

2019-06-27  Iain Sandoe  

* config/rs6000/rs6000.c (darwin_rs6000_override_options): Honour
user-specified float mode choice for kernel mode code.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index fbff6bd..5e80673 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3430,7 +3430,10 @@ darwin_rs6000_override_options (void)
   if (flag_mkernel)
 {
   rs6000_default_long_calls = 1;
-  rs6000_isa_flags |= OPTION_MASK_SOFT_FLOAT;
+
+  /* Allow a kext author to do -mkernel -mhard-float.  */
+  if (! (rs6000_isa_flags_explicit & OPTION_MASK_SOFT_FLOAT))
+rs6000_isa_flags |= OPTION_MASK_SOFT_FLOAT;
 }
 
   /* Make -m64 imply -maltivec.  Darwin's 64-bit ABI includes



[Darwin, PPC, committed] Correct whitespace in specs.

2019-06-27 Thread Iain Sandoe
A recent merge dropped whitespace in the endfile specs, which affects
transactional memory cases.

applied to mainline
thanks
Iain

2019-06-27  Iain Sandoe  

* config/rs6000/darwin.h (ENDFILE_SPEC): Correct whitespace in the
spec.

--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -135,7 +135,7 @@ extern int darwin_emit_picsym_stub;
 /* The PPC regs save/restore functions are leaves and could, conceivably
be used by the tm destructor.  */
 #undef ENDFILE_SPEC
-#define ENDFILE_SPEC TM_DESTRUCTOR "-lef_ppc"
+#define ENDFILE_SPEC TM_DESTRUCTOR " -lef_ppc"
 
 #undef SUBTARGET_EXTRA_SPECS
 #define SUBTARGET_EXTRA_SPECS  \



[Darwin, PPC, committed] Do not use longcall for 64b code.

2019-06-27 Thread Iain Sandoe
The linker [ld64] that supports 64Bit does not need the JBSR longcall
optimisation, and will not work with the most generic case (where the
symbol is undefined external, but there is no symbl stub).  So switch
the longcall option off.  ld64 will generate branch islands as needed.

tested on powerpc-darwin9.
applied to mainline
thanks
Iain

2019-06-27  Iain Sandoe  

* config/rs6000/rs6000.c (darwin_rs6000_override_options): Do not
use longcall for 64b code.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3b59db5..fbff6bd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3418,6 +3418,15 @@ darwin_rs6000_override_options (void)
   rs6000_isa_flags |= OPTION_MASK_POWERPC64;
   warning (0, "%qs requires PowerPC64 architecture, enabling", "-m64");
 }
+
+  /* The linkers [ld64] that support 64Bit do not need the JBSR longcall
+ optimisation, and will not work with the most generic case (where the
+ symbol is undefined external, but there is no symbl stub).  */
+  if (TARGET_64BIT)
+rs6000_default_long_calls = 0;
+
+  /* ld_classic is (so far) still used for kernel (static) code, and supports
+ the JBSR longcall / branch islands.  */
   if (flag_mkernel)
 {
   rs6000_default_long_calls = 1;



Re: [PATCH] Deprecate -frepo option.

2019-06-27 Thread Jan Hubicka
> 
> > On 27 Jun 2019, at 19:21, Jan Hubicka  wrote:
> > 
> >> 
> >> It's useful on targets without COMDAT support.  Are there any such
> >> that we care about at this point?
> >> 
> >> If the problem is the combination with LTO, why not just prohibit that?
> > 
> > The problem is that at the collect2 time we want to decide whether to
> > hold stderr/stdout of the linker. The issue is that we do not know yet
> > if we are going to LTO (because that is decided by linker plugin) or
> > whether we do repo files (because we look for those only after linker
> > failure).
> 
> you could pre-scan for LTO, as is done for the collect2+non-plugin linker.
> (similar to the comment below, that would take some time, but probably
> small c.f the actual link).

the collect2+non-plugin does not handle static archives so it would need
more work while I think we should kill that code (and make darwin to use
lld)

Honza

> Iain
> 
> > We can look for repo files first but that could take some time
> > especially for large programs (probably not too bad compared to actual
> > linking later) or we may require -frepo to be passed to collect2.
> > 
> > Honza
> >> 
> >> Jason
> 


Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Richard Biener
On June 27, 2019 7:04:32 PM GMT+02:00, Jakub Jelinek  wrote:
>On Thu, Jun 27, 2019 at 10:58:25AM -0600, Martin Sebor wrote:
>> The LHS is unsigned short so handle_char_store would not be called
>> because of the check in the caller.  You would need something like:
>> 
>>   MEM  [(char *)] = { 'a', 'b' };
>
>This is invalid, because the rhs is non-empty CONSTRUCTOR that doesn't
>have
>VECTOR_TYPE - verify_gimple_assign_single has:
>case CONSTRUCTOR:
>  if (TREE_CODE (rhs1_type) == VECTOR_TYPE)
>...
>  else if (CONSTRUCTOR_NELTS (rhs1) != 0)
>{
>  error ("non-vector %qs with elements", code_name);
>  debug_generic_stmt (rhs1);
>  return true;
>}
>
>But
>  MEM  [(char *)] = MEM  [(char *)"ab"];
>is valid.  Or = {} would be valid too, even for array stores.

And you can of course write GIMPLE unit tests for the pass using the GIMPLE FE. 

Richard. 

>   Jakub



Re: [PATCH 32/30] Document movmem/cpymem changes in gcc-10/changes.html

2019-06-27 Thread Aaron Sawdey
On 6/25/19 4:43 PM, Jeff Law wrote:
> On 6/25/19 2:22 PM, acsaw...@linux.ibm.com wrote:
>> From: Aaron Sawdey 
>>
>>  * builtins.c (get_memory_rtx): Fix comment.
>>  * optabs.def (movmem_optab): Change to cpymem_optab.
>>  * expr.c (emit_block_move_via_cpymem): Change movmem to cpymem.
>>  (emit_block_move_hints): Change movmem to cpymem.
>>  * defaults.h: Change movmem to cpymem.
>>  * targhooks.c (get_move_ratio): Change movmem to cpymem.
>>  (default_use_by_pieces_infrastructure_p): Ditto.
> So I think you're missing an update to the RTL/MD documentation.  This
> is also likely to cause problems for any out-of-tree ports, so it's
> probably worth a mention in the gcc-10 changes, which will need to be
> created (in CVS no less, ugh).
> 
> I think the stuff posted to-date is fine, but it shouldn't go in without
> the corresponding docs and gcc-10 changes updates.

Here is the corresponding documentation change for gcc-10/changes.html.

OK for trunk?

Thanks,
Aaron




Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-10/changes.html,v
retrieving revision 1.4
diff -r1.4 changes.html
139c139,149
< 
---
> Other significant improvements
> 
>   
> To allow inline expansion of both memcpy
> and memmove, the existing movmem instruction
> patterns used for non-overlapping memory copies have been renamed to
> cpymem. The movmem name is now used
> for overlapping memory moves, consistent with the
> library functions memcpy and memmove.
>   
> 

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain


Re: [PATCH] Deprecate -frepo option.

2019-06-27 Thread Iain Sandoe


> On 27 Jun 2019, at 19:21, Jan Hubicka  wrote:
> 
>> 
>> It's useful on targets without COMDAT support.  Are there any such
>> that we care about at this point?
>> 
>> If the problem is the combination with LTO, why not just prohibit that?
> 
> The problem is that at the collect2 time we want to decide whether to
> hold stderr/stdout of the linker. The issue is that we do not know yet
> if we are going to LTO (because that is decided by linker plugin) or
> whether we do repo files (because we look for those only after linker
> failure).

you could pre-scan for LTO, as is done for the collect2+non-plugin linker.
(similar to the comment below, that would take some time, but probably
small c.f the actual link).
Iain

> We can look for repo files first but that could take some time
> especially for large programs (probably not too bad compared to actual
> linking later) or we may require -frepo to be passed to collect2.
> 
> Honza
>> 
>> Jason



Re: [PATCH] Deprecate -frepo option.

2019-06-27 Thread Jan Hubicka
> 
> It's useful on targets without COMDAT support.  Are there any such
> that we care about at this point?
> 
> If the problem is the combination with LTO, why not just prohibit that?

The problem is that at the collect2 time we want to decide whether to
hold stderr/stdout of the linker. The issue is that we do not know yet
if we are going to LTO (because that is decided by linker plugin) or
whether we do repo files (because we look for those only after linker
failure).

We can look for repo files first but that could take some time
especially for large programs (probably not too bad compared to actual
linking later) or we may require -frepo to be passed to collect2.

Honza
> 
> Jason


Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Jason Merrill
On Mon, Jun 24, 2019 at 1:46 PM Jan Hubicka  wrote:
> >
> > I thought I remembered someone's recent-ish work to treat specially
> > types containing a char array, but I'm not finding it now.
> >
> > > For dynamically allocated memory as well as for stack space after stack
> > > slot sharing done in cfgexpand I see this is necessary since we do not
> > > preserve any information about placement new.
> >
> > Yes, untyped memory is different, but I'd think that memory allocated
> > through (non-placement) new should also be considered typed.
>
> I will try to return to this once the code is cleaned up. It would
> be quite interesting to make this better defined.
> > >
> > > I think this is what Richard refers to the code generating clobber
> > > statements that is only leaking as-base types to the middle-end visible
> > > part of IL and the code in call.c copying base structures.
> >
> > Right.  Is there a better way we could express copying/clobbering only
> > part of the object without involving the as-base types?
>
> I think currently two variants was discussed
>  1) use the trik Richard proposed for class a containing fake
> subclass a_as_base that has reduced size (i.e. is the AS_BASE type
> we have now) and adjust all component_refs accordingly introducing
> the view_convert_exprs for outer decls.
>
> Then clobbers and copies in call.c could use the as_base type.

This is the approach I talked about in 22488, which would also fix the
issue of fields with TYPE_SIZE larger than DECL_SIZE.

>  2) expose IS_FAKE_BASE_TYPE to middle-end and teach TBAA machinery
> about the fact that these are actually same types for everything
> it cares about.
>
> We have similar hacks for Fortran commons already, but of couse
> it is not most pretty (I outlined plan in previous mail)
>
> I would personally lean towards 2 since it will keep all component_refs
> in its current natural form and because I know how to implement it :)
> I guess it is Richi and yours call to pick variant which is better...

Both of these you mention still involve the as-base type.  I was
wondering if there was a way to avoid that.

> > > So at this time basically every C++ type can inter-operate with non-C++.
> > > I was thinking of relaxing this somewhat but wanted to see if C++
> > > standard says something here. Things that may be sensible include:
> > >  1) perhaps non-POD types especially those with vptr pointers do
> > > not need to be inter-operable.
> >
> > PODs were intended to be the C-compatible subset, yes.
> >
> > >  2) anonymous namespace types
> > >  3) types in namespace
> >
> > As long as these types don't have explicit language linkage (e.g.
> > extern "C"), sure.
>
> Great, I will add those to my TODOs.
> Do we have any way to tell language linkage from middle-end?

Not with a flag, currently, but you could look at the mangled name to
recognize extern "C".

Jason


[Committed] PF fortran/90987 -- Adjust parsing of COMMONI

2019-06-27 Thread Steve Kargl
I've committed the attached patch after successful
regression tests on x86_64-*-freebsd.  The patch
returns MATCH_NO when matching is done for a COMMON
statement, but the statement in fact is not COMMON.
See testcases.

While here I corrected the recording of the wrong
revision number in the ChangeLogs from my previous
commit.

2019-06-27  Steven G. Kargl  

PR fortran/90987
 * match.c (gfc_match_common): Adjust parsing of fixed and free form
source code containing, e.g., COMMONI.

2019-06-27  Steven G. Kargl  

PR fortran/90987
* gfortran.dg/common_1.f: new test.
* gfortran.dg/common_26.f90: Ditto.

-- 
Steve
Index: gcc/fortran/match.c
===
--- gcc/fortran/match.c	(revision 272755)
+++ gcc/fortran/match.c	(working copy)
@@ -5115,7 +5115,15 @@ gfc_match_common (void)
   gfc_array_spec *as;
   gfc_equiv *e1, *e2;
   match m;
+  char c;
 
+  /* COMMON has been matched.  In free form source code, the next character
+ needs to be whitespace or '/'.  Check that here.   Fixed form source
+ code needs to be checked below.  */
+  c = gfc_peek_ascii_char ();
+  if (gfc_current_form == FORM_FREE && !gfc_is_whitespace (c) && c != '/')
+return MATCH_NO;
+
   as = NULL;
 
   for (;;)
@@ -5279,10 +5287,24 @@ gfc_match_common (void)
 	  gfc_gobble_whitespace ();
 	  if (gfc_match_eos () == MATCH_YES)
 	goto done;
-	  if (gfc_peek_ascii_char () == '/')
+	  c = gfc_peek_ascii_char ();
+	  if (c == '/')
 	break;
-	  if (gfc_match_char (',') != MATCH_YES)
-	goto syntax;
+	  if (c != ',')
+	{
+	  /* In Fixed form source code, gfortran can end up here for an
+		 expression of the form COMMONI = RHS.  This may not be an
+		 error, so return MATCH_NO.  */
+	  if (gfc_current_form == FORM_FIXED && c == '=')
+		{
+		  gfc_free_array_spec (as);
+		  return MATCH_NO;
+		}
+	  goto syntax;
+	}
+	  else
+	gfc_match_char (',');
+
 	  gfc_gobble_whitespace ();
 	  if (gfc_peek_ascii_char () == '/')
 	break;
@@ -6248,6 +6270,7 @@ gfc_match_select_type (void)
   sym->attr.flavor = FL_VARIABLE;
   sym->attr.referenced = 1;
   sym->attr.class_ok = 1;
+  sym->attr.target = expr2->symtree->n.sym->attr.target;
 }
   else
 {
Index: gcc/testsuite/gfortran.dg/common_1.f
===
--- gcc/testsuite/gfortran.dg/common_1.f	(nonexistent)
+++ gcc/testsuite/gfortran.dg/common_1.f	(working copy)
@@ -0,0 +1,14 @@
+! { dg-do compile }
+  module mymod
+  type :: mytyp
+  integer :: i
+  end type mytyp
+  contains
+  subroutine mysub
+  implicit none
+  type(mytyp) :: a
+  integer :: commoni,commonj
+  commoni = a%i
+  commonj = a%j  ! { dg-error "is not a member of" }
+  end subroutine mysub
+  end module mymod
Index: gcc/testsuite/gfortran.dg/common_26.f90
===
--- gcc/testsuite/gfortran.dg/common_26.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/common_26.f90	(working copy)
@@ -0,0 +1,14 @@
+! { dg-do compile }
+module mymod
+  type :: mytyp
+integer :: i
+  end type mytyp
+contains
+  subroutine mysub
+implicit none
+type(mytyp) :: a
+integer :: commoni,commonj
+commoni = a%i
+commonj = a%j ! { dg-error "is not a member of" }
+  end subroutine mysub
+end module mymod


Re: [PATCH] Deprecate -frepo option.

2019-06-27 Thread Jason Merrill
On Thu, Jun 27, 2019 at 10:23 AM Martin Liška  wrote:
>
> On 6/27/19 2:58 PM, Jonathan Wakely wrote:
> > On Thu, 27 Jun 2019 at 13:30, Martin Liška  wrote:
> >>
> >> On 6/21/19 4:28 PM, Richard Biener wrote:
> >>> On Fri, Jun 21, 2019 at 4:13 PM Jakub Jelinek  wrote:
> 
>  On Fri, Jun 21, 2019 at 04:04:00PM +0200, Martin Liška wrote:
> > On 6/21/19 1:58 PM, Jakub Jelinek wrote:
> >> On Fri, Jun 21, 2019 at 01:52:09PM +0200, Martin Liška wrote:
> >>> On 6/21/19 1:47 PM, Jonathan Wakely wrote:
>  On Fri, 21 Jun 2019 at 11:40, Martin Liška wrote:
> > Yes, I would be fine to deprecate that for GCC 10.1
> 
>  Would it be appropriate to issue a warning in GCC 10.x if the option 
>  is used?
> >>>
> >>> Sure. With the patch attached one will see:
> >>>
> >>> $ gcc -frepo /tmp/main.cc -c
> >>> gcc: warning: switch ‘-frepo’ is no longer supported
> >>>
> >>> I'm sending patch that also removes -frepo tests from test-suite.
> >>> I've been testing the patch.
> >>
> >> IMHO for just deprecation of an option you don't want to remove it 
> >> from the
> >> testsuite, just match the warning it will generate in those tests, and
> >> I'm not convinced you want to remove it from the documentation (rather 
> >> than
> >> just saying in the documentation that the option is deprecated and 
> >> might be
> >> removed in a later GCC version).
> >
> > Agree with you. I'm sending updated version of the patch.
> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
>  I'm also not convinced about the Deprecated flag, seems like that is a 
>  flag
>  that we use for options that have been already removed.
>  So, instead there should be some proper warning in the C++ FE for it,
>  or just Warn.
> >>>
> >>> In principle -frepo is a nice idea - does it live up to its promises?  
> >>> That is,
> >>> does it actually work, for example when throwing it on the libstdc++
> >>> testsuite or a larger C++ project?
> >>
> >> @Jonathan, Jason: Do we know whether it really work?
> >
> > I don't know. It's nearly 20 years since I've tried it, but apparently
> > a few people try using it:
> > https://stackoverflow.com/search?q=frepo+c%2B%2B
> >
> > The first result was answered by me in 2012 saying nobody uses it:
> > https://stackoverflow.com/a/11832613/981959
>
> Looks at this, it seems to me very legacy and I would recommend to remove it.

It's useful on targets without COMDAT support.  Are there any such
that we care about at this point?

If the problem is the combination with LTO, why not just prohibit that?

Jason


Re: [PATCH] Remove another bunch of dead assignment.

2019-06-27 Thread Richard Sandiford
Martin Liška  writes:
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index d50b811d863..1bd251ea8e2 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -19780,8 +19780,7 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
>emit_insn (gen_vec_widen_umult_even_v4si (t5, 
>   gen_lowpart (V4SImode, op1),
>   gen_lowpart (V4SImode, op2)));
> -  op0 = expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT);
> -
> +  expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT);
>  }
>else
>  {

This means that nothing uses the expanded value.  It looks like the
call was meant to be force_expand_binop instead, so that the expansion
always goes to op0.

Richard


Re: [PATCH 1/3] C++20 constexpr lib part 1/3

2019-06-27 Thread Ville Voutilainen
On Thu, 27 Jun 2019 at 19:55, Ed Smith-Rowland via libstdc++
 wrote:
> > I don't think this will work in a constant expression:
> >
> > ?? /// Assign @p __new_val to @p __obj and return its previous value.
> > ?? template 
> > +?? _GLIBCXX20_CONSTEXPR
> > ?? inline _Tp
> > ?? exchange(_Tp& __obj, _Up&& __new_val)
> > ?? { return std::__exchange(__obj, std::forward<_Up>(__new_val)); }
> >
> > Because std::__exchange hasn't been marked constexpr. The test passes
> > because it doesn't call it in a context that requires constant
> > evaluation:
> >
> > ??const auto x = std::exchange(e, pi);
> Derp.
> >
> > I see the same problem in other tests too:
> >
> > +?? constexpr std::array car{{0, 1, 2, 3, 4, 5, 6, 6, 8, 9, 9,
> > 11}};
> > +
> > +?? const auto out0x = std::adjacent_find(car.begin(), car.end());
> > +
> > +?? const auto out1x = std::adjacent_find(car.begin(), car.end(),
> > +?? std::equal_to())
>
> I will go through all the tests and make sure that some nontrivial
> calculation is done that contributes to a final bool return.?? And clean
> up the mess.?? I did this in chunk 2 but I guess I left a lot of chunk 1.

As was the case with the iterator patch, it's not a question of doing
a nontrivial calculation, but
a question of initializing a constexpr variable with the result. (Yeah
sure a non-type template
argument would also work but eurgh). Then, and only then, have we
proven that the code
works in a constexpr context. Initializing a const doesn't do that.
constinit would, but that's
C++20.


Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Jakub Jelinek
On Thu, Jun 27, 2019 at 10:58:25AM -0600, Martin Sebor wrote:
> The LHS is unsigned short so handle_char_store would not be called
> because of the check in the caller.  You would need something like:
> 
>   MEM  [(char *)] = { 'a', 'b' };

This is invalid, because the rhs is non-empty CONSTRUCTOR that doesn't have
VECTOR_TYPE - verify_gimple_assign_single has:
case CONSTRUCTOR:
  if (TREE_CODE (rhs1_type) == VECTOR_TYPE)
...
  else if (CONSTRUCTOR_NELTS (rhs1) != 0)
{
  error ("non-vector %qs with elements", code_name);
  debug_generic_stmt (rhs1);
  return true;
}

But
  MEM  [(char *)] = MEM  [(char *)"ab"];
is valid.  Or = {} would be valid too, even for array stores.

Jakub


Re: [PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread H.J. Lu
On Thu, Jun 27, 2019 at 5:26 AM Uros Bizjak  wrote:
>
> On Thu, Jun 27, 2019 at 2:23 PM Jan Beulich  wrote:
> >
> > >>> On 27.06.19 at 14:00,  wrote:
> > > On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich  wrote:
> > >>
> > >> >>> On 27.06.19 at 13:09,  wrote:
> > >> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich  wrote:
> > >> >>
> > >> >> Without these constraints asm() can't make use of mask registers.
> > >> >
> > >> > asm should be deprecated. We have intrinsics for this purpose.
> > >>
> > >> While maybe not explicitly applicable here, the intrinsics aren't
> > >> (afaict) providing full flexibility. In particular (just as example)
> > >> I haven't found a way to use embedded broadcast with the
> > >> intrinsics, but I can easily do so with asm().
> > >>
> > >> Furthermore there are other reasons to use asm() - things like
> > >> the Linux kernel are full of it for a reason. And once one has
> > >> to use asm(), the resulting code typically is easier to follow if
> > >> one doesn't further intermix it with uses of builtins.
> > >>
> > >> And finally, if asm() was indeed meant to be deprecated, how
> > >> come it pretty recently got extended to allow for "inline"?
> > >
> > > I didn't mean that asm() in general should be deprecated, but for SSE
> > > and other vector extensions, where intrinsics are available,
> > > intrinsics should be used instead. There was exactly zero requests to
> > > use new asm constraints, it looks that people are satisfied with
> > > intrinsics approach (which is also future-proof, etc).
> >
> > So what about my embedded broadcast example then? "Zero
> > requests" is clearly not exactly right. It simply didn't occur to me
> > (until I noticed the @internal here) that I should raise such a
> > request, rather than just using asm(). Subsequently I did then
> > notice "Yh" going away, complicating things further ...
>
> There was some work by HJ a while ago that implemented automatic
> generation of embedded broadcasts. Perhaps he can answer the question.

I implemented broadcast for some commonly used instructions.  Adding
broadcast to all will take time, especially when there is no user demand.

> Uros.
>
> > I'd also like to note that the choice of types on some of the builtins
> > makes it rather cumbersome to use them. Especially for scalar
> > operations I've found myself better resorting to asm(). See
> > https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/tests/x86_emulator/simd.c
> > (most of the changes submitted (not so) recently have been
> > coming from the work of putting together this and its sibling
> > tests for the Xen Project instruction emulator).
> >
> > Jan
> >
> >



-- 
H.J.


Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Martin Sebor

On 6/27/19 9:15 AM, Jeff Law wrote:

On 6/27/19 9:00 AM, Jeff Law wrote:

On 6/26/19 8:40 PM, Martin Sebor wrote:

On 6/26/19 4:31 PM, Jeff Law wrote:

On 6/25/19 5:03 PM, Martin Sebor wrote:



The caller ensures that handle_char_store is only called for stores
to arrays (MEM_REF) or single elements as wide as char.

Where?  I don't see it, even after fixing the formatting in
strlen_check_and_optimize_stmt :-)


    gimple *stmt = gsi_stmt (*gsi);

    if (is_gimple_call (stmt))


I think we can agree that we don't have a call, so this is false.


   else if (is_gimple_assign (stmt) && !gimple_clobber_p (stmt))
  {
    tree lhs = gimple_assign_lhs (stmt);

This should be true IIUC, so we'll go into its THEN block.



    if (TREE_CODE (lhs) == SSA_NAME && POINTER_TYPE_P (TREE_TYPE
(lhs)))

Should be false.


    else if (TREE_CODE (lhs) == SSA_NAME && INTEGRAL_TYPE_P
(TREE_TYPE (lhs)))


Should also be false.


    else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))

Should be true since LHS is a MEM_REF.



     {
    tree type = TREE_TYPE (lhs);
    if (TREE_CODE (type) == ARRAY_TYPE)
  type = TREE_TYPE (type);
    if (TREE_CODE (type) == INTEGER_TYPE
    && TYPE_MODE (type) == TYPE_MODE (char_type_node)
    && TYPE_PRECISION (type) == TYPE_PRECISION
(char_type_node))
  {
    if (! handle_char_store (gsi))
  return false;
  }
  }

If TREE_TYPE (type) is an ARRAY_TYPE, we strip the ARRAY_TYPE.  We then
verify that TYPE is a single character type.  _But_ we stripped away the
ARRAY_TYPE.  So ISTM that we allow either an array of chars or a single
char on the LHS.

So how does this avoid multiple character stores?!?  We could have had
an ARRAY_REF on the LHS and we never check the number of elements in the
array.  There's no check on the RHS either.  SO I don't see how we
guarantee that we're dealing with a single character store.

What am I missing here?


Can you show me a test case where it breaks?  If not, I don't know
what you want me to do.  I'll just move on to something else.

THe thing to do is research what gimple accepts and what other passes
may do.  Given this is a code generation bug, "just moving on" isn't
really a good option unless we're going to rip out that little bit of code.

As I was thinking about this last night, the pass we'd want to look at
would be store merging.  Let's do that on the call today.

Actually it was trivial to create with store merging.

char x[20];
foo()
{
   x[0] = 0x41;
   x[1] = 0x42;
}

   MEM  [(char *)] = 16961;


The LHS is unsigned short so handle_char_store would not be called
because of the check in the caller.  You would need something like:

  MEM  [(char *)] = { 'a', 'b' };

or something like that where the type is an array of char.  I have
no idea if something like it can happen or how.  As it is, I'm
pretty sure it's not valid (but who knows).


So clearly we can get this in gimple.  So we need a check of some kind,
either on the LHS or the RHS to ensure that we actually have a single
character store as opposed to something like the above.


As far as I can see that describes the check in the caller.

I never said the MEM_REF you show above can't come up in Gimple.
What I am saying is that I don't know how to get it to cause
a problem here, or even to make it into the function to begin
with.  Store merging doesn't do it because it runs after strlen.
If it did run earlier as I said above, it wouldn't make it into
the function.  I tried it to convince myself.

Without a test case showing how something can happen I don't know
what to do to prevent it.  You're throwing out hypothetical
scenarios based on what's valid Gimple and expect me to write
code to handle them.  I'm sorry but I don't work that way.  I
need a test case to see a problem, understand it, and to verify
I have fixed it properly.

In this instance, I can't think of a problem here, either real
or hypothetical, so it's impossible for me to do something to
prevent it.  Even if the multicharacter store did make it into
the function I don't even understand what it is you want me to
do -- we'd have to look at the value of the RHS, find a nul in
its representation if there is one, and either compute the new
string length from it or invalidate the existing length.  Either
way it's not trivial and would need to be tested to make sure
it's correct.  So we're back to a test case.

Martin

PS Here's a test case that can be made to fail but only if a)
store merging runs before strlen, and also b) the LHS check
above is disabled.

  char b[4];

  int f (const char *s)
  {
__builtin_strcpy (b, "123");

int i = __builtin_strcmp (s, b);

b[0] = 'a';
b[1] = 0;

//  MEM  [(char *)] = 97;

if (__builtin_strlen (b) != 1)
  __builtin_abort ();

return i;
  }


Re: [PATCH 1/3] C++20 constexpr lib part 1/3

2019-06-27 Thread Ed Smith-Rowland via gcc-patches

On 6/27/19 11:41 AM, Jonathan Wakely wrote:

On 26/06/19 19:13 -0400, Ed Smith-Rowland via libstdc++ wrote:

Here is the first of three patches for C++20 constexpr library.

?? Implement C++20 p0202 - Add constexpr Modifiers to Functions 
in  and  Headers.
 ??Implement C++20 p1023 - constexpr comparison operators for 
std::array.


I believe I have answered peoples concerns with the last patch 
attempts [https://gcc.gnu.org/ml/libstdc++/2019-03/msg00132.html].


The patch is large because of test cases but really just boils down 
to adding constexpr for c++2a.


I still have some concerns about the changes like:

?? template
+?? _GLIBCXX20_CONSTEXPR
?? bool
-?? operator()(_Iterator __it, _Value& __val) const
+?? operator()(_Iterator __it, const _Value& __val) const
?? { return *__it < __val; }

Make a type where operator< changes the rhs.?? I was thinking you'd want 
a compare to operate on const things but I guess we have to be ready for 
anything.?? I was also thinking < is left associative for a class member 
but i guess a class could have a thing that mutates the rhs too.?? 
Nothing got hit in any of my testing.?? If this is a thing we probably 
should make a general test for things like this somewhere.?? Maybe 
someone wants to log references or something.


1. I'll experiment to see if I really need these (I *thought* I did, but...)

2. If I still do then I'll make overloads?



I think that might change semantics for some types, and I haven't yet
figured out if we care about those cases or not. I'll try to come up
with some examples that change meaning.

This could use _GLIBCXX14_CONSTEXPR:

+?? /**
+ * A constexpr wrapper for __builtin_memmove.
+ * @param __num The number of elements of type _Tp (not bytes).

I don't think we want a Doxygen comment for this, as it's not part of
the public API. Just a normal comment is fine.

+ */
+?? template
+?? _GLIBCXX20_CONSTEXPR
+?? inline void*
+?? __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
+?? {
+#if __cplusplus > 201703L \
+?? && defined(_GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED)
+?? if (__builtin_is_constant_evaluated())
+?? {
+?? for(; __num > 0; --__num)
+?? {
+?? if constexpr (_IsMove)
+?? *__dst = std::move(*__src);
+?? else
+?? *__dst = *__src;

Do we need this _IsMove condition here? We should only be using this
function for trivially copyable types, in which case copy vs move
shouldn't matter, should it?

Oh ... is it because we also end up using this __memmove function for
non-trivially copyable types, during constant evaluation? Hmm. Then
it's more than just a wrapper for __builtin_memmove, right? It's
"either memmove or constexpr range copy", or something.


Yup.?? I'm also bad at naming things.?? Furthermore, I expect the 
standards people to want to have our versions of memcpy, memcmp, memmove 
that we own and can make constexpr.?? These are such important concepts.


Also, part 2 brings in constexpr swap. Maybe this is the may to 
implement an actual memmove?




I don't think this will work in a constant expression:

?? /// Assign @p __new_val to @p __obj and return its previous value.
?? template 
+?? _GLIBCXX20_CONSTEXPR
?? inline _Tp
?? exchange(_Tp& __obj, _Up&& __new_val)
?? { return std::__exchange(__obj, std::forward<_Up>(__new_val)); }

Because std::__exchange hasn't been marked constexpr. The test passes
because it doesn't call it in a context that requires constant
evaluation:

??const auto x = std::exchange(e, pi);

Derp.


I see the same problem in other tests too:

+?? constexpr std::array car{{0, 1, 2, 3, 4, 5, 6, 6, 8, 9, 9, 
11}};

+
+?? const auto out0x = std::adjacent_find(car.begin(), car.end());
+
+?? const auto out1x = std::adjacent_find(car.begin(), car.end(),
+?? std::equal_to())


I will go through all the tests and make sure that some nontrivial 
calculation is done that contributes to a final bool return.?? And clean 
up the mess.?? I did this in chunk 2 but I guess I left a lot of chunk 1.


Come to think of it, we could build *insane* concepts after this.?? For 
good or ill.


Ed



[PATCH], PR Fix floatsi{sf,df}2 and floatunssi{sf,df}2 for a future powerpc machine

2019-06-27 Thread Michael Meissner
As I detail in PR 91009, I had some testsuite failures with my patches for a
future machine.  In the future patches, I added new RTL attributes to support
the new prefixed load/store instructions (which will include pc-relative
support).  This new attribute needs to look at the 'type' RTL attribute, and
that in turn depends on doing a constrain operands to determine whether the
insn is a load, store, or add instruction.

This constrain operands call occurs before the first split pass.  The insns
that convert SImode to SFmode or DFmode which are split in the first pass,
currently use the "wa" constraint for the floating point result.  Unfortunately
if the user uses the -mno-vsx option, the "wa" constraint becomes NO_REGS.

This patch is a rather limited patch that just adds an alternative using the
"d" constraint, so there is a valid alternative before splitting.

This patch does not do the cleanup mentioned in PR 90822.  That will still be
done, but I wanted to fix the actual problem before doing the next iteration of
the code cleanup.

I have done a bootstrap and make check, and there were no regressions.  Can I
check this into the trunk?

2019-06-27   Michael Meissner  

PR target/91009
* config/rs6000/rs6000.md (floatsi2_lfiwax): Add non-VSX
alternative.
(floatsi2_lfiwax_mem): Add non-VSX alternative.
(floatunssi2_lfiwzx): Add non-VSX alternative.
(floatunssi2_lfiwzx_mem): Add non-VSX alternative.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 272436)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -5227,9 +5227,9 @@ (define_insn "lfiwax"
 ; not be needed and also in case the insns are deleted as dead code.
 
 (define_insn_and_split "floatsi2_lfiwax"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=")
-   (float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wa"))]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,")
+   (float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r,r")))
+   (clobber (match_scratch:DI 2 "=d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_LFIWAX
&&  && can_create_pseudo_p ()"
   "#"
@@ -5266,11 +5266,11 @@ (define_insn_and_split "floatsi2_l
(set_attr "type" "fpload")])
 
 (define_insn_and_split "floatsi2_lfiwax_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,")
(float:SFDF
 (sign_extend:DI
- (match_operand:SI 1 "indexed_or_indirect_operand" "Z"
-   (clobber (match_scratch:DI 2 "=wa"))]
+ (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"
+   (clobber (match_scratch:DI 2 "=d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_LFIWAX && "
   "#"
   ""
@@ -5303,9 +5303,9 @@ (define_insn "lfiwzx"
(set_attr "isa" "*,p8v,p8v,p9v")])
 
 (define_insn_and_split "floatunssi2_lfiwzx"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=")
-   (unsigned_float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wa"))]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,")
+   (unsigned_float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r,r")))
+   (clobber (match_scratch:DI 2 "=d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_LFIWZX && "
   "#"
   ""
@@ -5341,11 +5341,11 @@ (define_insn_and_split "floatunssi
(set_attr "type" "fpload")])
 
 (define_insn_and_split "floatunssi2_lfiwzx_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,")
(unsigned_float:SFDF
 (zero_extend:DI
- (match_operand:SI 1 "indexed_or_indirect_operand" "Z"
-   (clobber (match_scratch:DI 2 "=wa"))]
+ (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"
+   (clobber (match_scratch:DI 2 "=d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_LFIWZX && "
   "#"
   ""

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Jakub Jelinek
On Thu, Jun 27, 2019 at 09:15:41AM -0600, Jeff Law wrote:
> Actually it was trivial to create with store merging.
> 
> char x[20];
> foo()
> {
>   x[0] = 0x41;
>   x[1] = 0x42;
> }
> 
>   MEM  [(char *)] = 16961;
> 
> So clearly we can get this in gimple.  So we need a check of some kind,
> either on the LHS or the RHS to ensure that we actually have a single
> character store as opposed to something like the above.

handle_char_store is only called if:
  tree type = TREE_TYPE (lhs);
  if (TREE_CODE (type) == ARRAY_TYPE)
type = TREE_TYPE (type);
  if (TREE_CODE (type) == INTEGER_TYPE
  && TYPE_MODE (type) == TYPE_MODE (char_type_node)
  && TYPE_PRECISION (type) == TYPE_PRECISION (char_type_node))
{
  if (! handle_char_store (gsi))
return false;
}
so the above case will not trigger, the only way to call it for multiple
character stores is if you have an array.  And so far I've succeeded
creating that just with strcpy builtin.
E.g. the
  if (storing_all_zeros_p && cmp == 0 && si->full_string_p)
{
  /* When overwriting a '\0' with a '\0', the store can be removed
 if we know it has been stored in the current function.  */
  if (!stmt_could_throw_p (cfun, stmt) && si->writable)
optimization looks unsafe if we are actually storing more than one zero
because then the first zero in there is redundant, but there could be
non-zero chars after that and removing the larger all zeros store will
keep those other chars with incorrect values; I've tried to reproduce it with:
__attribute__((noipa)) void
bar (char *x, int l1, int l2)
{
  if (__builtin_memcmp (x, "abcd\0", 6) != 0 || l1 != 4 || l2 != 4)
__builtin_abort ();
}

int
main ()
{
  char x[64];
  __builtin_memset (x, -1, sizeof (x));
  asm volatile ("" : : "g" ([0]) : "memory");
  __builtin_strcpy (x, "abcd");
  int l1 = __builtin_strlen (x);
#ifdef ONE
  short __attribute__((may_alias)) *p = (void *) [0];
  *(p + 2) = 0;
#else
  short *p = (short *) ([0] + 4);
  __builtin_memcpy (p, "\0\0", 2);
#endif
  int l2 = __builtin_strlen (x);
  bar (x, l1, l2);
}

but the first case creates a short store, so handle_char_store isn't
called, and second for some reason isn't folded from memcpy to a char[2]
store.

Fundamentally, there is no reason to guard handle_char_store with
char or array of char stores, as long as CHAR_BIT == 8 && BITS_PER_UNIT == 8
and native_encode_expr can handle the rhs expression - then we can analyze
exactly where is the first '\0' character in there and in the code take it
into account.

Jakub


Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

2019-06-27 Thread Jeff Law
On 6/27/19 12:11 AM, Li Jia He wrote:
> Hi,
> 
> According to the optimizable case described by Qi Feng on
> issue 88784, we can combine the cases into the following:
> 
> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> 2. x >  y  &&  x == XXX_MIN  -->   false
> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> 
> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> 5. x <  y  &&  x == XXX_MAX  -->   false
> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> 
> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> 8. x <= y  ||  x != XXX_MIN  -->   true
> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> 
> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> 11. x >= y  ||  x != XXX_MAX  -->   true
> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> 
> Note: XXX_MIN represents the minimum value of type x.
>   XXX_MAX represents the maximum value of type x.
> 
> Here we don't need to care about whether the operation is
> signed or unsigned.  For example, in the below equation:
> 
> 'x >  y  &&  x != XXX_MIN  -->   x > y'
> 
> If the x type is signed int and XXX_MIN is INT_MIN, we can
> optimize it to 'x > y'.  However, if the type of x is unsigned
> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> 
> The regression testing for the patch was done on GCC mainline on
> 
> powerpc64le-unknown-linux-gnu (Power 9 LE)
> 
> with no regressions.  Is it OK for trunk ?
> 
> Thanks,
> Lijia He
> 
> gcc/ChangeLog
> 
> 2019-06-27  Li Jia He  
>   Qi Feng  
> 
>   PR middle-end/88784
>   * gimple-fold.c (and_comparisons_contain_equal_operands): New function.
>   (and_comparisons_1): Use and_comparisons_contain_equal_operands.
>   (or_comparisons_contain_equal_operands): New function.
>   (or_comparisons_1): Use or_comparisons_contain_equal_operands.
Would this be better done via match.pd?  ISTM this transformation would
be well suited for that framework.

jeff


Re: [PATCH 1/3] C++20 constexpr lib part 1/3

2019-06-27 Thread Jonathan Wakely

On 26/06/19 19:13 -0400, Ed Smith-Rowland via libstdc++ wrote:

Here is the first of three patches for C++20 constexpr library.

?? Implement C++20 p0202 - Add constexpr Modifiers to Functions in 
 and  Headers.

 ??Implement C++20 p1023 - constexpr comparison operators for std::array.

I believe I have answered peoples concerns with the last patch 
attempts [https://gcc.gnu.org/ml/libstdc++/2019-03/msg00132.html].


The patch is large because of test cases but really just boils down to 
adding constexpr for c++2a.


I still have some concerns about the changes like:

template
+  _GLIBCXX20_CONSTEXPR
  bool
-  operator()(_Iterator __it, _Value& __val) const
+  operator()(_Iterator __it, const _Value& __val) const
  { return *__it < __val; }


I think that might change semantics for some types, and I haven't yet
figured out if we care about those cases or not. I'll try to come up
with some examples that change meaning.

This could use _GLIBCXX14_CONSTEXPR:

+  /**
+   * A constexpr wrapper for __builtin_memmove.
+   * @param __num The number of elements of type _Tp (not bytes).

I don't think we want a Doxygen comment for this, as it's not part of
the public API. Just a normal comment is fine.

+   */
+  template
+_GLIBCXX20_CONSTEXPR
+inline void*
+__memmove(_Tp* __dst, const _Tp* __src, size_t __num)
+{
+#if __cplusplus > 201703L \
+&& defined(_GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED)
+  if (__builtin_is_constant_evaluated())
+   {
+ for(; __num > 0; --__num)
+   {
+ if constexpr (_IsMove)
+   *__dst = std::move(*__src);
+ else
+   *__dst = *__src;

Do we need this _IsMove condition here? We should only be using this
function for trivially copyable types, in which case copy vs move
shouldn't matter, should it?

Oh ... is it because we also end up using this __memmove function for
non-trivially copyable types, during constant evaluation? Hmm. Then
it's more than just a wrapper for __builtin_memmove, right? It's
"either memmove or constexpr range copy", or something.

I don't think this will work in a constant expression:

  /// Assign @p __new_val to @p __obj and return its previous value.
  template 
+_GLIBCXX20_CONSTEXPR
inline _Tp
exchange(_Tp& __obj, _Up&& __new_val)
{ return std::__exchange(__obj, std::forward<_Up>(__new_val)); }

Because std::__exchange hasn't been marked constexpr. The test passes
because it doesn't call it in a context that requires constant
evaluation:

 const auto x = std::exchange(e, pi);

I see the same problem in other tests too:

+  constexpr std::array car{{0, 1, 2, 3, 4, 5, 6, 6, 8, 9, 9, 11}};
+
+  const auto out0x = std::adjacent_find(car.begin(), car.end());
+
+  const auto out1x = std::adjacent_find(car.begin(), car.end(),
+   std::equal_to());





Re: [RFC PATCH, i386]: Autovectorize 8-byte vectors

2019-06-27 Thread Jeff Law
On 6/27/19 9:34 AM, Jakub Jelinek wrote:
> On Thu, Jun 27, 2019 at 09:24:58AM -0600, Jeff Law wrote:
>> On 6/27/19 12:05 AM, Jakub Jelinek wrote:
>>> On Wed, Jun 26, 2019 at 12:19:28PM +0200, Uros Bizjak wrote:
 Yes, the patch works OK. I'll regression test it and push it later today.
>>>
>>> I think it caused
>>> +FAIL: gcc.dg/tree-ssa/pr84512.c scan-tree-dump optimized "return 285;"
>>> which admittedly already is xfailed on various targets.
>>> We now newly vectorize those loops and there is no FRE or similar pass
>>> after vectorization to clean it up, in particular optimize the
>>> a[8] and a[9] loads given the MEM  [(int *) + 32B]
>>> store:
>>>   MEM  [(int *) + 32B] = { 64, 81 };
>>>   _13 = a[8];
>>>   res_6 = _13 + 140;
>>>   _18 = a[9];
>>>   res_15 = res_6 + _18;
>>>   a ={v} {CLOBBER};
>>>   return res_15;
>>>
>>> Shall we xfail it, or is there a plan to enable FRE after vectorization,
>>> or similar pass that would be able to do similar memory optimizations?
>>> Note, the RTL passes are able to optimize it in the end in this testcase.
>> I wonder if we could logically break up the vector store within DOM.  If
>> we did that we'd end up with a[8] and a[9] in DOM's expression hash
>> table.  That would allow us to replace the loads into _13 and _18 with
>> constants and the rest should just fall out.
>>
>> Care to open a BZ?  If so, go ahead and assign it to me.
> 
> I think Richi is on working on adding fre3 now.
Yea, I saw that later.  I think Richi's message indicated he wanted a
late fre pass, so even if DOM was to capture this, it may not eliminate
the desire for a late fre pass.

jeff


Re: [RFC PATCH, i386]: Autovectorize 8-byte vectors

2019-06-27 Thread Jakub Jelinek
On Thu, Jun 27, 2019 at 09:24:58AM -0600, Jeff Law wrote:
> On 6/27/19 12:05 AM, Jakub Jelinek wrote:
> > On Wed, Jun 26, 2019 at 12:19:28PM +0200, Uros Bizjak wrote:
> >> Yes, the patch works OK. I'll regression test it and push it later today.
> > 
> > I think it caused
> > +FAIL: gcc.dg/tree-ssa/pr84512.c scan-tree-dump optimized "return 285;"
> > which admittedly already is xfailed on various targets.
> > We now newly vectorize those loops and there is no FRE or similar pass
> > after vectorization to clean it up, in particular optimize the
> > a[8] and a[9] loads given the MEM  [(int *) + 32B]
> > store:
> >   MEM  [(int *) + 32B] = { 64, 81 };
> >   _13 = a[8];
> >   res_6 = _13 + 140;
> >   _18 = a[9];
> >   res_15 = res_6 + _18;
> >   a ={v} {CLOBBER};
> >   return res_15;
> > 
> > Shall we xfail it, or is there a plan to enable FRE after vectorization,
> > or similar pass that would be able to do similar memory optimizations?
> > Note, the RTL passes are able to optimize it in the end in this testcase.
> I wonder if we could logically break up the vector store within DOM.  If
> we did that we'd end up with a[8] and a[9] in DOM's expression hash
> table.  That would allow us to replace the loads into _13 and _18 with
> constants and the rest should just fall out.
> 
> Care to open a BZ?  If so, go ahead and assign it to me.

I think Richi is on working on adding fre3 now.

Jakub


Re: [RFC PATCH, i386]: Autovectorize 8-byte vectors

2019-06-27 Thread Jeff Law
On 6/27/19 12:05 AM, Jakub Jelinek wrote:
> On Wed, Jun 26, 2019 at 12:19:28PM +0200, Uros Bizjak wrote:
>> Yes, the patch works OK. I'll regression test it and push it later today.
> 
> I think it caused
> +FAIL: gcc.dg/tree-ssa/pr84512.c scan-tree-dump optimized "return 285;"
> which admittedly already is xfailed on various targets.
> We now newly vectorize those loops and there is no FRE or similar pass
> after vectorization to clean it up, in particular optimize the
> a[8] and a[9] loads given the MEM  [(int *) + 32B]
> store:
>   MEM  [(int *) + 32B] = { 64, 81 };
>   _13 = a[8];
>   res_6 = _13 + 140;
>   _18 = a[9];
>   res_15 = res_6 + _18;
>   a ={v} {CLOBBER};
>   return res_15;
> 
> Shall we xfail it, or is there a plan to enable FRE after vectorization,
> or similar pass that would be able to do similar memory optimizations?
> Note, the RTL passes are able to optimize it in the end in this testcase.
I wonder if we could logically break up the vector store within DOM.  If
we did that we'd end up with a[8] and a[9] in DOM's expression hash
table.  That would allow us to replace the loads into _13 and _18 with
constants and the rest should just fall out.

Care to open a BZ?  If so, go ahead and assign it to me.

jeff


Re: [PATCH] Fix 2 clang warnings.

2019-06-27 Thread Martin Sebor

On 6/27/19 8:03 AM, Martin Liška wrote:

Hi.

This reduces 2 warnings reported by clang.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-06-27  Martin Liska  

* edit-context.c (test_applying_fixits_unreadable_file): Do not
use () for a constructor call.
(test_applying_fixits_line_out_of_range): Likewise.
* ggc-page.c (free_page): Use (char *) for %p printf format
argument.
---
  gcc/edit-context.c | 4 ++--
  gcc/ggc-page.c | 2 +-
  2 files changed, 3 insertions(+), 3 deletions(-)


Just as a side note:

diff --git a/gcc/edit-context.c b/gcc/edit-context.c
index d3246ab0334..93d10664ae9 100644
--- a/gcc/edit-context.c
+++ b/gcc/edit-context.c
@@ -1639,7 +1639,7 @@ static void
 test_applying_fixits_unreadable_file ()
 {
   const char *filename = "this-does-not-exist.txt";
-  line_table_test ltt ();
+  line_table_test ltt;
   linemap_add (line_table, LC_ENTER, false, filename, 1);

   location_t loc = linemap_position_for_column (line_table, 1);

I'm guessing the warning above is about declaring a function
returning line_table_test rather than declaring an object of
the type.  (It would be helpful to show those warnings when
fixing them.  A small test case confirms it's -Wvexing-parse:
warning: empty parentheses interpreted as a function declaration.)

This is a common mistake to make so it would be a useful
warning to add to GCC as well.  And in fact, I see at least
one request for it in Bugzilla: 25814 (with the related
pr86564).

Martin


Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Jeff Law
On 6/27/19 9:00 AM, Jeff Law wrote:
> On 6/26/19 8:40 PM, Martin Sebor wrote:
>> On 6/26/19 4:31 PM, Jeff Law wrote:
>>> On 6/25/19 5:03 PM, Martin Sebor wrote:
>>>

 The caller ensures that handle_char_store is only called for stores
 to arrays (MEM_REF) or single elements as wide as char.
>>> Where?  I don't see it, even after fixing the formatting in
>>> strlen_check_and_optimize_stmt :-)
>>>
    gimple *stmt = gsi_stmt (*gsi);

    if (is_gimple_call (stmt))
>>>
>>> I think we can agree that we don't have a call, so this is false.
>>>
   else if (is_gimple_assign (stmt) && !gimple_clobber_p (stmt))
  {
    tree lhs = gimple_assign_lhs (stmt);
>>> This should be true IIUC, so we'll go into its THEN block.
>>>
>>>
    if (TREE_CODE (lhs) == SSA_NAME && POINTER_TYPE_P (TREE_TYPE
 (lhs)))
>>> Should be false.
>>>
    else if (TREE_CODE (lhs) == SSA_NAME && INTEGRAL_TYPE_P
 (TREE_TYPE (lhs)))
>>>
>>> Should also be false.
>>>
    else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))
>>> Should be true since LHS is a MEM_REF.
>>>
>>>
     {
    tree type = TREE_TYPE (lhs);
    if (TREE_CODE (type) == ARRAY_TYPE)
  type = TREE_TYPE (type);
    if (TREE_CODE (type) == INTEGER_TYPE
    && TYPE_MODE (type) == TYPE_MODE (char_type_node)
    && TYPE_PRECISION (type) == TYPE_PRECISION
 (char_type_node))
  {
    if (! handle_char_store (gsi))
  return false;
  }
  }
>>> If TREE_TYPE (type) is an ARRAY_TYPE, we strip the ARRAY_TYPE.  We then
>>> verify that TYPE is a single character type.  _But_ we stripped away the
>>> ARRAY_TYPE.  So ISTM that we allow either an array of chars or a single
>>> char on the LHS.
>>>
>>> So how does this avoid multiple character stores?!?  We could have had
>>> an ARRAY_REF on the LHS and we never check the number of elements in the
>>> array.  There's no check on the RHS either.  SO I don't see how we
>>> guarantee that we're dealing with a single character store.
>>>
>>> What am I missing here?
>>
>> Can you show me a test case where it breaks?  If not, I don't know
>> what you want me to do.  I'll just move on to something else.
> THe thing to do is research what gimple accepts and what other passes
> may do.  Given this is a code generation bug, "just moving on" isn't
> really a good option unless we're going to rip out that little bit of code.
> 
> As I was thinking about this last night, the pass we'd want to look at
> would be store merging.  Let's do that on the call today.
Actually it was trivial to create with store merging.

char x[20];
foo()
{
  x[0] = 0x41;
  x[1] = 0x42;
}

  MEM  [(char *)] = 16961;

So clearly we can get this in gimple.  So we need a check of some kind,
either on the LHS or the RHS to ensure that we actually have a single
character store as opposed to something like the above.

jeff


Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-27 Thread Jeff Law
On 6/26/19 8:40 PM, Martin Sebor wrote:
> On 6/26/19 4:31 PM, Jeff Law wrote:
>> On 6/25/19 5:03 PM, Martin Sebor wrote:
>>
>>>
>>> The caller ensures that handle_char_store is only called for stores
>>> to arrays (MEM_REF) or single elements as wide as char.
>> Where?  I don't see it, even after fixing the formatting in
>> strlen_check_and_optimize_stmt :-)
>>
>>>    gimple *stmt = gsi_stmt (*gsi);
>>>
>>>    if (is_gimple_call (stmt))
>>
>> I think we can agree that we don't have a call, so this is false.
>>
>>>   else if (is_gimple_assign (stmt) && !gimple_clobber_p (stmt))
>>>  {
>>>    tree lhs = gimple_assign_lhs (stmt);
>> This should be true IIUC, so we'll go into its THEN block.
>>
>>
>>>    if (TREE_CODE (lhs) == SSA_NAME && POINTER_TYPE_P (TREE_TYPE
>>> (lhs)))
>> Should be false.
>>
>>>    else if (TREE_CODE (lhs) == SSA_NAME && INTEGRAL_TYPE_P
>>> (TREE_TYPE (lhs)))
>>
>> Should also be false.
>>
>>>    else if (TREE_CODE (lhs) != SSA_NAME && !TREE_SIDE_EFFECTS (lhs))
>> Should be true since LHS is a MEM_REF.
>>
>>
>>>     {
>>>    tree type = TREE_TYPE (lhs);
>>>    if (TREE_CODE (type) == ARRAY_TYPE)
>>>  type = TREE_TYPE (type);
>>>    if (TREE_CODE (type) == INTEGER_TYPE
>>>    && TYPE_MODE (type) == TYPE_MODE (char_type_node)
>>>    && TYPE_PRECISION (type) == TYPE_PRECISION
>>> (char_type_node))
>>>  {
>>>    if (! handle_char_store (gsi))
>>>  return false;
>>>  }
>>>  }
>> If TREE_TYPE (type) is an ARRAY_TYPE, we strip the ARRAY_TYPE.  We then
>> verify that TYPE is a single character type.  _But_ we stripped away the
>> ARRAY_TYPE.  So ISTM that we allow either an array of chars or a single
>> char on the LHS.
>>
>> So how does this avoid multiple character stores?!?  We could have had
>> an ARRAY_REF on the LHS and we never check the number of elements in the
>> array.  There's no check on the RHS either.  SO I don't see how we
>> guarantee that we're dealing with a single character store.
>>
>> What am I missing here?
> 
> Can you show me a test case where it breaks?  If not, I don't know
> what you want me to do.  I'll just move on to something else.
THe thing to do is research what gimple accepts and what other passes
may do.  Given this is a code generation bug, "just moving on" isn't
really a good option unless we're going to rip out that little bit of code.

As I was thinking about this last night, the pass we'd want to look at
would be store merging.  Let's do that on the call today.

jeff


Re: [PATCH] Fix 2 clang warnings.

2019-06-27 Thread Martin Sebor

On 6/27/19 8:21 AM, Martin Liška wrote:

On 6/27/19 4:11 PM, Jakub Jelinek wrote:

On Thu, Jun 27, 2019 at 04:03:06PM +0200, Martin Liška wrote:

* ggc-page.c (free_page): Use (char *) for %p printf format
argument.



--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -977,7 +977,7 @@ free_page (page_entry *entry)
if (GGC_DEBUG_LEVEL >= 2)
  fprintf (G.debug_file,
 "Deallocating page at %p, data %p-%p\n", (void *) entry,
-entry->page, entry->page + entry->bytes - 1);
+(char *)entry->page, (char *)entry->page + entry->bytes - 1);
  
/* Mark the page as inaccessible.  Discard the handle to avoid handle

   leak.  */


Can you explain this?  It makes no sense to me.  What is the warning?
entry->page already has char * type, so why are any casts needed?
If you want to be pedantic, C says that %p argument should be pointer to
void, so I'd understand more
 (void *) entry->page, (void *) (entry->page + entry->bytes - 1));
with that formatting.


You are right, it should be the other way around:

/home/marxin/Programming/gcc/gcc/ggc-page.c:946:60: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]


I'm normally in favor of cleaning up warnings but this one looks
worse than useless to me: heeding it makes what's clean, easy t
read, and perfectly correct and safe code harder to read purely
for the sake of pedantry.  There is no difference between a char*
and void*.  If we want to make an improvement let's propose to
relax the pointless C requirement that the %p argument be a void*
and allow it to be any object pointer.

Martin


  (void *) entry, (unsigned long) OBJECT_SIZE (order), page,
   ^~~~
/home/marxin/Programming/gcc/gcc/ggc-page.c:947:7: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
  page + entry_size - 1);
  ^
/home/marxin/Programming/gcc/gcc/ggc-page.c:980:7: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
  entry->page, entry->page + entry->bytes - 1);
  ^~~
/home/marxin/Programming/gcc/gcc/ggc-page.c:980:20: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
  entry->page, entry->page + entry->bytes - 1);
   ^~

Martin



Jakub







Re: [PATCH] Fix 2 clang warnings.

2019-06-27 Thread Jakub Jelinek
On Thu, Jun 27, 2019 at 04:21:10PM +0200, Martin Liška wrote:
> 2019-06-27  Martin Liska  
> 
>   * edit-context.c (test_applying_fixits_unreadable_file): Do not
>   use () for a constructor call.
>   (test_applying_fixits_line_out_of_range): Likewise.
>   * ggc-page.c (alloc_page): Use (void *) for %p printf format
>   argument.
>   (free_page): Likewise.

LGTM.

Jakub


Re: [PATCH] Deprecate -frepo option.

2019-06-27 Thread Martin Liška
On 6/27/19 2:58 PM, Jonathan Wakely wrote:
> On Thu, 27 Jun 2019 at 13:30, Martin Liška  wrote:
>>
>> On 6/21/19 4:28 PM, Richard Biener wrote:
>>> On Fri, Jun 21, 2019 at 4:13 PM Jakub Jelinek  wrote:

 On Fri, Jun 21, 2019 at 04:04:00PM +0200, Martin Liška wrote:
> On 6/21/19 1:58 PM, Jakub Jelinek wrote:
>> On Fri, Jun 21, 2019 at 01:52:09PM +0200, Martin Liška wrote:
>>> On 6/21/19 1:47 PM, Jonathan Wakely wrote:
 On Fri, 21 Jun 2019 at 11:40, Martin Liška wrote:
> Yes, I would be fine to deprecate that for GCC 10.1

 Would it be appropriate to issue a warning in GCC 10.x if the option 
 is used?
>>>
>>> Sure. With the patch attached one will see:
>>>
>>> $ gcc -frepo /tmp/main.cc -c
>>> gcc: warning: switch ‘-frepo’ is no longer supported
>>>
>>> I'm sending patch that also removes -frepo tests from test-suite.
>>> I've been testing the patch.
>>
>> IMHO for just deprecation of an option you don't want to remove it from 
>> the
>> testsuite, just match the warning it will generate in those tests, and
>> I'm not convinced you want to remove it from the documentation (rather 
>> than
>> just saying in the documentation that the option is deprecated and might 
>> be
>> removed in a later GCC version).
>
> Agree with you. I'm sending updated version of the patch.
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

 I'm also not convinced about the Deprecated flag, seems like that is a flag
 that we use for options that have been already removed.
 So, instead there should be some proper warning in the C++ FE for it,
 or just Warn.
>>>
>>> In principle -frepo is a nice idea - does it live up to its promises?  That 
>>> is,
>>> does it actually work, for example when throwing it on the libstdc++
>>> testsuite or a larger C++ project?
>>
>> @Jonathan, Jason: Do we know whether it really work?
> 
> I don't know. It's nearly 20 years since I've tried it, but apparently
> a few people try using it:
> https://stackoverflow.com/search?q=frepo+c%2B%2B
> 
> The first result was answered by me in 2012 saying nobody uses it:
> https://stackoverflow.com/a/11832613/981959
> 

Looks at this, it seems to me very legacy and I would recommend to remove it.

Martin


Re: [PATCH] Fix 2 clang warnings.

2019-06-27 Thread Martin Liška
On 6/27/19 4:11 PM, Jakub Jelinek wrote:
> On Thu, Jun 27, 2019 at 04:03:06PM +0200, Martin Liška wrote:
>>  * ggc-page.c (free_page): Use (char *) for %p printf format
>>  argument.
> 
>> --- a/gcc/ggc-page.c
>> +++ b/gcc/ggc-page.c
>> @@ -977,7 +977,7 @@ free_page (page_entry *entry)
>>if (GGC_DEBUG_LEVEL >= 2)
>>  fprintf (G.debug_file,
>>   "Deallocating page at %p, data %p-%p\n", (void *) entry,
>> - entry->page, entry->page + entry->bytes - 1);
>> + (char *)entry->page, (char *)entry->page + entry->bytes - 1);
>>  
>>/* Mark the page as inaccessible.  Discard the handle to avoid handle
>>   leak.  */
> 
> Can you explain this?  It makes no sense to me.  What is the warning?
> entry->page already has char * type, so why are any casts needed?
> If you want to be pedantic, C says that %p argument should be pointer to
> void, so I'd understand more
>(void *) entry->page, (void *) (entry->page + entry->bytes - 1));
> with that formatting.

You are right, it should be the other way around:

/home/marxin/Programming/gcc/gcc/ggc-page.c:946:60: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
 (void *) entry, (unsigned long) OBJECT_SIZE (order), page,
  ^~~~
/home/marxin/Programming/gcc/gcc/ggc-page.c:947:7: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
 page + entry_size - 1);
 ^
/home/marxin/Programming/gcc/gcc/ggc-page.c:980:7: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
 entry->page, entry->page + entry->bytes - 1);
 ^~~
/home/marxin/Programming/gcc/gcc/ggc-page.c:980:20: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
 entry->page, entry->page + entry->bytes - 1);
  ^~

Martin

> 
>   Jakub
> 

>From 648388767ade0f7de683e79dca3fdcda3f2acda6 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 27 Jun 2019 14:09:50 +0200
Subject: [PATCH] Fix 2 clang warnings.

gcc/ChangeLog:

2019-06-27  Martin Liska  

	* edit-context.c (test_applying_fixits_unreadable_file): Do not
	use () for a constructor call.
	(test_applying_fixits_line_out_of_range): Likewise.
	* ggc-page.c (alloc_page): Use (void *) for %p printf format
	argument.
	(free_page): Likewise.
---
 gcc/edit-context.c | 4 ++--
 gcc/ggc-page.c | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/edit-context.c b/gcc/edit-context.c
index d3246ab0334..93d10664ae9 100644
--- a/gcc/edit-context.c
+++ b/gcc/edit-context.c
@@ -1639,7 +1639,7 @@ static void
 test_applying_fixits_unreadable_file ()
 {
   const char *filename = "this-does-not-exist.txt";
-  line_table_test ltt ();
+  line_table_test ltt;
   linemap_add (line_table, LC_ENTER, false, filename, 1);
 
   location_t loc = linemap_position_for_column (line_table, 1);
@@ -1670,7 +1670,7 @@ test_applying_fixits_line_out_of_range ()
   const char *old_content = "One-liner file\n";
   temp_source_file tmp (SELFTEST_LOCATION, ".txt", old_content);
   const char *filename = tmp.get_filename ();
-  line_table_test ltt ();
+  line_table_test ltt;
   linemap_add (line_table, LC_ENTER, false, filename, 2);
 
   /* Try to insert a string in line 2.  */
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 7066ef2c488..a95ff466704 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -943,8 +943,8 @@ alloc_page (unsigned order)
   if (GGC_DEBUG_LEVEL >= 2)
 fprintf (G.debug_file,
 	 "Allocating page at %p, object size=%lu, data %p-%p\n",
-	 (void *) entry, (unsigned long) OBJECT_SIZE (order), page,
-	 page + entry_size - 1);
+	 (void *) entry, (unsigned long) OBJECT_SIZE (order),
+	 (void *) page, (void *) (page + entry_size - 1));
 
   return entry;
 }
@@ -977,7 +977,7 @@ free_page (page_entry *entry)
   if (GGC_DEBUG_LEVEL >= 2)
 fprintf (G.debug_file,
 	 "Deallocating page at %p, data %p-%p\n", (void *) entry,
-	 entry->page, entry->page + entry->bytes - 1);
+	 (void *) entry->page, (void *) (entry->page + entry->bytes - 1));
 
   /* Mark the page as inaccessible.  Discard the handle to avoid handle
  leak.  */
-- 
2.21.0



Re: [PATCH 21/30] Changes to pdp11

2019-06-27 Thread Paul Koning



> On Jun 25, 2019, at 4:22 PM, acsaw...@linux.ibm.com wrote:
> 
> From: Aaron Sawdey 
> 
>   * config/pdp11/pdp11.md (movmemhi, movmemhi1,
>   movmemhi_nocc, UNSPEC_MOVMEM): Change movmem to cpymem.

Ok, thanks.

paul




Re: [PATCH] Fix 2 clang warnings.

2019-06-27 Thread Jakub Jelinek
On Thu, Jun 27, 2019 at 04:03:06PM +0200, Martin Liška wrote:
>   * ggc-page.c (free_page): Use (char *) for %p printf format
>   argument.

> --- a/gcc/ggc-page.c
> +++ b/gcc/ggc-page.c
> @@ -977,7 +977,7 @@ free_page (page_entry *entry)
>if (GGC_DEBUG_LEVEL >= 2)
>  fprintf (G.debug_file,
>"Deallocating page at %p, data %p-%p\n", (void *) entry,
> -  entry->page, entry->page + entry->bytes - 1);
> +  (char *)entry->page, (char *)entry->page + entry->bytes - 1);
>  
>/* Mark the page as inaccessible.  Discard the handle to avoid handle
>   leak.  */

Can you explain this?  It makes no sense to me.  What is the warning?
entry->page already has char * type, so why are any casts needed?
If you want to be pedantic, C says that %p argument should be pointer to
void, so I'd understand more
 (void *) entry->page, (void *) (entry->page + entry->bytes - 1));
with that formatting.

Jakub


[PATCH] Remove another bunch of dead assignment.

2019-06-27 Thread Martin Liška
Hi.

The patch continues to remove quite obvious dead assignments
that are not used.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-06-27  Martin Liska  

* config/i386/i386-expand.c (ix86_expand_sse2_mulvxdi3): Remove
dead assignemts.
* lra-eliminations.c (eliminate_regs_in_insn): Likewise.
* reg-stack.c (check_asm_stack_operands): Likewise.
* tree-ssa-structalias.c (create_function_info_for): Likewise.
* tree-vect-generic.c (expand_vector_operations_1): Likewise.

gcc/c-family/ChangeLog:

2019-06-27  Martin Liska  

* c-common.c (try_to_locate_new_include_insertion_point): Remove
dead assignemts.

gcc/cp/ChangeLog:

2019-06-27  Martin Liska  

* call.c (build_new_op_1): Remove
dead assignemts.
* typeck.c (cp_build_binary_op): Likewise.

gcc/fortran/ChangeLog:

2019-06-27  Martin Liska  

* check.c (gfc_check_c_funloc): Remove
dead assignemts.
* decl.c (variable_decl): Likewise.
* resolve.c (resolve_typebound_function): Likewise.
* simplify.c (gfc_simplify_matmul): Likewise.
(gfc_simplify_scan): Likewise.
* trans-array.c (gfc_could_be_alias): Likewise.
* trans-common.c (add_equivalences): Likewise.
* trans-expr.c (trans_class_vptr_len_assignment): Likewise.
(gfc_trans_array_constructor_copy): Likewise.
(gfc_trans_assignment_1): Likewise.
* trans-intrinsic.c (conv_intrinsic_atomic_op): Likewise.
* trans-openmp.c (gfc_omp_finish_clause): Likewise.
* trans-types.c (gfc_get_array_descriptor_base): Likewise.
* trans.c (gfc_build_final_call): Likewise.

libcpp/ChangeLog:

2019-06-27  Martin Liska  

* line-map.c (linemap_get_expansion_filename): Remove
dead assignemts.
* mkdeps.c (make_write): Likewise.
---
 gcc/c-family/c-common.c   |  4 ++--
 gcc/config/i386/i386-expand.c |  3 +--
 gcc/cp/call.c |  2 +-
 gcc/cp/typeck.c   |  1 -
 gcc/fortran/check.c   | 18 +++---
 gcc/fortran/decl.c|  1 -
 gcc/fortran/resolve.c |  1 -
 gcc/fortran/simplify.c| 27 ---
 gcc/fortran/trans-array.c |  2 --
 gcc/fortran/trans-common.c|  6 ++
 gcc/fortran/trans-expr.c  |  6 --
 gcc/fortran/trans-intrinsic.c |  1 -
 gcc/fortran/trans-openmp.c|  1 -
 gcc/fortran/trans-types.c | 10 +-
 gcc/fortran/trans.c   |  3 ---
 gcc/lra-eliminations.c|  2 +-
 gcc/reg-stack.c   |  1 -
 gcc/tree-ssa-structalias.c|  1 -
 gcc/tree-vect-generic.c   |  2 --
 libcpp/line-map.c |  3 +--
 libcpp/mkdeps.c   |  2 +-
 21 files changed, 33 insertions(+), 64 deletions(-)


diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index da4aadbc590..cb92710f2bc 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -8601,8 +8601,8 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
 
   /*  Get ordinary map containing LOC (or its expansion).  */
   const line_map_ordinary *ord_map_for_loc = NULL;
-  loc = linemap_resolve_location (line_table, loc, LRK_MACRO_EXPANSION_POINT,
-  _map_for_loc);
+  linemap_resolve_location (line_table, loc, LRK_MACRO_EXPANSION_POINT,
+			_map_for_loc);
   gcc_assert (ord_map_for_loc);
 
   for (unsigned int i = 0; i < LINEMAPS_ORDINARY_USED (line_table); i++)
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index d50b811d863..1bd251ea8e2 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -19780,8 +19780,7 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
   emit_insn (gen_vec_widen_umult_even_v4si (t5, 
 	gen_lowpart (V4SImode, op1),
 	gen_lowpart (V4SImode, op2)));
-  op0 = expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT);
-
+  expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT);
 }
   else
 {
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index e4923f4ccbf..07093255505 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6167,7 +6167,7 @@ build_new_op_1 (const op_location_t , enum tree_code code, int flags,
 	  conv = cand->convs[2];
 	  if (conv->kind == ck_ref_bind)
 		conv = next_conversion (conv);
-	  arg3 = convert_like (conv, arg3, complain);
+	  convert_like (conv, arg3, complain);
 	}
 
 	}
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index dd76ebe3dbf..77095953134 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -5218,7 +5218,6 @@ cp_build_binary_op (const op_location_t ,
 	}
 	  result_type = build_opaque_vector_type (intt,
 		  TYPE_VECTOR_SUBPARTS (type0));
-	  converted = 1;
 	  return build_vec_cmp (resultcode, result_type, op0, op1);
 	}
   build_type = boolean_type_node;
diff --git a/gcc/fortran/check.c 

[PATCH] Fix 2 clang warnings.

2019-06-27 Thread Martin Liška
Hi.

This reduces 2 warnings reported by clang.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-06-27  Martin Liska  

* edit-context.c (test_applying_fixits_unreadable_file): Do not
use () for a constructor call.
(test_applying_fixits_line_out_of_range): Likewise.
* ggc-page.c (free_page): Use (char *) for %p printf format
argument.
---
 gcc/edit-context.c | 4 ++--
 gcc/ggc-page.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)


diff --git a/gcc/edit-context.c b/gcc/edit-context.c
index d3246ab0334..93d10664ae9 100644
--- a/gcc/edit-context.c
+++ b/gcc/edit-context.c
@@ -1639,7 +1639,7 @@ static void
 test_applying_fixits_unreadable_file ()
 {
   const char *filename = "this-does-not-exist.txt";
-  line_table_test ltt ();
+  line_table_test ltt;
   linemap_add (line_table, LC_ENTER, false, filename, 1);
 
   location_t loc = linemap_position_for_column (line_table, 1);
@@ -1670,7 +1670,7 @@ test_applying_fixits_line_out_of_range ()
   const char *old_content = "One-liner file\n";
   temp_source_file tmp (SELFTEST_LOCATION, ".txt", old_content);
   const char *filename = tmp.get_filename ();
-  line_table_test ltt ();
+  line_table_test ltt;
   linemap_add (line_table, LC_ENTER, false, filename, 2);
 
   /* Try to insert a string in line 2.  */
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 7066ef2c488..602dd7c61b7 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -977,7 +977,7 @@ free_page (page_entry *entry)
   if (GGC_DEBUG_LEVEL >= 2)
 fprintf (G.debug_file,
 	 "Deallocating page at %p, data %p-%p\n", (void *) entry,
-	 entry->page, entry->page + entry->bytes - 1);
+	 (char *)entry->page, (char *)entry->page + entry->bytes - 1);
 
   /* Mark the page as inaccessible.  Discard the handle to avoid handle
  leak.  */



Re: [PATCH][gcc] libgccjit: add bitfield support

2019-06-27 Thread Andrea Corallo
Hi Dave,
last version for this patch addressing the suggestion about the
JIT_BIT_FIELD macros comment description.

Thank you for all the suggestions.

Regarding the write access please see my previous answer into the binary
op patch thread.

Bests
  Andrea


2019-06-20  Andrea Corallo andrea.cora...@arm.com

* docs/topics/compatibility.rst (LIBGCCJIT_ABI_12): New ABI tag.
* docs/topics/types.rst: Add gcc_jit_context_new_bitfield.
* jit-common.h (namespace recording): Add class bitfield.
* jit-playback.c:
(DECL_JIT_BIT_FIELD, SET_DECL_JIT_BIT_FIELD): Add macros.
(playback::context::new_bitfield): New method.
(playback::compound_type::set_fields): Add bitfield support.
(playback::lvalue::mark_addressable): Was jit_mark_addressable make this
a method of lvalue plus return a bool to communicate success.
(playback::lvalue::get_address): Check for jit_mark_addressable return
value.
* jit-playback.h (new_bitfield): New method.
(class bitfield): New class.
(class lvalue): Add jit_mark_addressable method.
* jit-recording.c (recording::context::new_bitfield): New method.
(recording::bitfield::replay_into): New method.
(recording::bitfield::write_to_dump): Likewise.
(recording::bitfield::make_debug_string): Likewise.
(recording::bitfield::write_reproducer): Likewise.
* jit-recording.h (class context): Add new_bitfield method.
(class field): Make it derivable by class bitfield.
(class bitfield): Add new class.
* libgccjit++.h (class context): Add new_bitfield method.
* libgccjit.c (struct gcc_jit_bitfield): New structure.
(gcc_jit_context_new_bitfield): New function.
* libgccjit.h
(LIBGCCJIT_HAVE_gcc_jit_context_new_bitfield) New macro.
(gcc_jit_context_new_bitfield): New function.
* libgccjit.map (LIBGCCJIT_ABI_12) New ABI tag.


2019-06-20  Andrea Corallo andrea.cora...@arm.com

* jit.dg/all-non-failing-tests.h: Add test-accessing-bitfield.c.
* jit.dg/test-accessing-bitfield.c: New testcase.
* jit.dg/test-error-gcc_jit_context_new_bitfield-invalid-type.c:
Likewise.
* jit.dg/test-error-gcc_jit_context_new_bitfield-invalid-width.c:
Likewise.
* jit.dg/test-error-gcc_jit_lvalue_get_address-bitfield.c:
Likewise.
diff --git a/gcc/jit/docs/topics/compatibility.rst b/gcc/jit/docs/topics/compatibility.rst
index abefa56..da64920 100644
--- a/gcc/jit/docs/topics/compatibility.rst
+++ b/gcc/jit/docs/topics/compatibility.rst
@@ -177,3 +177,8 @@ entrypoints:
 
 ``LIBGCCJIT_ABI_11`` covers the addition of
 :func:`gcc_jit_context_add_driver_option`
+
+``LIBGCCJIT_ABI_12``
+
+``LIBGCCJIT_ABI_12`` covers the addition of
+:func:`gcc_jit_context_new_bitfield`
diff --git a/gcc/jit/docs/topics/types.rst b/gcc/jit/docs/topics/types.rst
index 1d2dcd4..37d9d01 100644
--- a/gcc/jit/docs/topics/types.rst
+++ b/gcc/jit/docs/topics/types.rst
@@ -247,6 +247,30 @@ You can model C `struct` types by creating :c:type:`gcc_jit_struct *` and
underlying string, so it is valid to pass in a pointer to an on-stack
buffer.
 
+.. function:: gcc_jit_field *\
+  gcc_jit_context_new_bitfield (gcc_jit_context *ctxt,\
+gcc_jit_location *loc,\
+gcc_jit_type *type,\
+int width,\
+const char *name)
+
+   Construct a new bit field, with the given type width and name.
+
+   The parameter ``name`` must be non-NULL.  The call takes a copy of the
+   underlying string, so it is valid to pass in a pointer to an on-stack
+   buffer.
+
+   The parameter ``type`` must be an integer type.
+
+   The parameter ``width`` must be a positive integer that does not exceed the
+   size of ``type``.
+
+   This API entrypoint was added in :ref:`LIBGCCJIT_ABI_12`; you can test
+   for its presence using
+   .. code-block:: c
+
+  #ifdef LIBGCCJIT_HAVE_gcc_jit_context_new_bitfield
+
 .. function:: gcc_jit_object *\
   gcc_jit_field_as_object (gcc_jit_field *field)
 
diff --git a/gcc/jit/jit-common.h b/gcc/jit/jit-common.h
index 1d96cc3..e747d96 100644
--- a/gcc/jit/jit-common.h
+++ b/gcc/jit/jit-common.h
@@ -119,6 +119,7 @@ namespace recording {
 	class union_;
   class vector_type;
 class field;
+  class bitfield;
 class fields;
 class function;
 class block;
diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
index bc4de9c..d4b148e 100644
--- a/gcc/jit/jit-playback.h
+++ b/gcc/jit/jit-playback.h
@@ -75,6 +75,12 @@ public:
 	 type *type,
 	 const char *name);
 
+  field *
+  new_bitfield (location *loc,
+		type *type,
+		int width,
+		const char *name);
+
   compound_type *
   new_compound_type (location *loc,
 		 const char *name,
@@ -426,6 +432,8 @@ private:
   tree m_inner;
 };
 
+class bitfield : public field {};
+
 class function : public wrapper
 {
 public:
@@ -614,6 +622,8 @@ public:
   rvalue *
   get_address (location *loc);
 
+private:
+  bool mark_addressable (location *loc);
 };
 

RE: Use ODR for canonical types construction in LTO

2019-06-27 Thread JiangNing OS
No. Since this is LTO, it's very hard to simplify the big application. Sorry 
for that.

I think Christophe is mentioning the case from g++.dg is reporting the similar 
issue like " type variant differs by TYPE_CXX_ODR_P  ", right?

Thanks,
-Jiangning

> -Original Message-
> From: Jan Hubicka 
> Sent: Thursday, June 27, 2019 2:29 PM
> To: JiangNing OS 
> Cc: Eric Botcazou ; Christophe Lyon
> ; gcc Patches ;
> Richard Biener ; d...@dcepelik.cz; Martin Liška
> 
> Subject: Re: Use ODR for canonical types construction in LTO
> 
> > Hi,
> >
> > This commit
> > https://gcc.gnu.org/viewcvs/gcc?view=revision=272628 is
> > breaking trunk LTO on some real benchmarks, so can it be fixed or
> > reverted? For example,
> 
> Do you have a testcase?
> Honza
> >
> > lto1: error: type variant differs by TYPE_CXX_ODR_P   > 0x99943d08 __va_list BLK
> > size  bitsizetype> constant 256>
> > unit-size  sizetype> constant 32>
> > align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
> 0x99943d08
> > fields  > type  > void>
> > public unsigned DI
> > size 
> > unit-size 
> > align:64 warn_if_not_align:0 symtab:0 alias-set 7 
> > structural-equality
> > pointer_to_this >
> > unsigned DI :0:0 size  
> > unit-size
> 
> > align:64 warn_if_not_align:0 offset_align 128
> > offset 
> > bit-offset  context
> 
> > chain  0x99940fc0>
> > unsigned DI :0:0 size  
> > unit-
> size 
> > align:64 warn_if_not_align:0 offset_align 128 offset 
> >  0x99890c30 0> bit-offset  context
>  chain  __vr_top>>>
> > reference_to_this  chain  > 0x99960098 __va_list>>   BLK
> > size  bitsizetype> constant 256>
> > unit-size  sizetype> constant 32>
> > align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
> 0x99943d08
> > fields  > type  > void>
> > public unsigned DI
> > size 
> > unit-size 
> > align:64 warn_if_not_align:0 symtab:0 alias-set 7 
> > structural-equality
> > pointer_to_this >
> > unsigned DI :0:0 size  
> > unit-size
> 
> > align:64 warn_if_not_align:0 offset_align 128
> > offset 
> > bit-offset  context
> 
> > chain  0x99940fc0>
> > unsigned DI :0:0 size  
> > unit-
> size 
> > align:64 warn_if_not_align:0 offset_align 128 offset 
> >  0x99890c30 0> bit-offset  context
>  chain  __vr_top>>>
> > pointer_to_this  reference_to_this
> > >
> > lto1: internal compiler error: 'verify_type' failed
> > 0xe33e93 verify_type(tree_node const*)
> > ../../gcc/gcc/tree.c:14655
> > 0x5efc4b lto_fixup_state
> > ../../gcc/gcc/lto/lto-common.c:2429
> > 0x5fc01b lto_fixup_decls
> > ../../gcc/gcc/lto/lto-common.c:2460
> > 0x5fc01b read_cgraph_and_symbols(unsigned int, char const**)
> > ../../gcc/gcc/lto/lto-common.c:2693
> > 0x5ded23 lto_main()
> > ../../gcc/gcc/lto/lto.c:616
> > Please submit a full bug report,
> > with preprocessed source if appropriate.
> > Please include the complete backtrace with any bug report.
> > See  for instructions.
> > lto-wrapper: fatal error: /home/amptest/gcc/install_last//bin/g++
> > returned 1 exit status compilation terminated.
> > /usr/bin/ld: error: lto-wrapper failed
> > collect2: error: ld returned 1 exit status
> >
> > Thanks,
> > -Jiangning
> >
> > > -Original Message-
> > > From: gcc-patches-ow...@gcc.gnu.org 
> > > On Behalf Of Christophe Lyon
> > > Sent: Tuesday, June 25, 2019 8:30 PM
> > > To: Jan Hubicka 
> > > Cc: Eric Botcazou ; gcc Patches  > > patc...@gcc.gnu.org>; Richard Biener ;
> > > d...@dcepelik.cz; Martin Liška 
> > > Subject: Re: Use ODR for canonical types construction in LTO
> > >
> > > Hi,
> > >
> > >
> > > On Tue, 25 Jun 2019 at 10:20, Jan Hubicka  wrote:
> > > >
> > > > > > * gcc-interface/decl.c (gnat_to_gnu_entity): Check that
> > > > > > type is array or integer prior checking string flag.
> > > > >
> > > > > The test for array is superfluous here.
> > > > >
> > > > > > * gcc-interface/gigi.h (gnat_signed_type_for,
> > > > > > maybe_character_value): Likewise.
> > > > >
> > > > > Wrong ChangeLog, the first modified function is
> maybe_character_type.
> > > > >
> > > > > I have installed the attached patchlet after testing it on 
> > > > > x86-64/Linux.
> > > > >
> > > > >
> > > > >   * gcc-interface/decl.c (gnat_to_gnu_entity): Remove
> > > > > superfluous test
> > > in
> > > > >   previous change.
> > > > >   * gcc-interface/gigi.h (maybe_character_type): Fix formatting.
> > > > >   (maybe_character_value): Likewise.
> > > >
> > > > Thanks a lot. I was not quite sure if ARRAY_TYPEs can happen there
> > > > and I should have added you to the CC.
> > > >
> > >
> > > After the main commit (r272628), I have noticed regressions on arm

Re: [PATCH] Handle '\0' in strcmp in RTL expansion (PR tree-optimization/90892).

2019-06-27 Thread Martin Liška
On 6/18/19 12:16 PM, Jakub Jelinek wrote:
> I think any such length changes should be moved after the two punt checks.
> Move also the len3 setting before the new checks (of course conditional on
> is_ncmp).

Ok, I'm sending updated version of the patch that addresses this.
I've been testing the patch.

Ready to be installed then?
Thanks,
Martin
>From 35043ab431c0dc1e8dcda484725a1f8875a4b95b Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 17 Jun 2019 10:39:15 +0200
Subject: [PATCH] Handle '\0' in strcmp in RTL expansion (PR
 tree-optimization/90892).

gcc/ChangeLog:

2019-06-17  Martin Liska  

	PR tree-optimization/90892
	* builtins.c (inline_expand_builtin_string_cmp): Handle '\0'
	in string constants.

gcc/testsuite/ChangeLog:

2019-06-17  Martin Liska  

	PR tree-optimization/90892
	* gcc.dg/pr90892.c: New test.
---
 gcc/builtins.c | 17 ++---
 gcc/testsuite/gcc.dg/pr90892.c | 14 ++
 2 files changed, 28 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr90892.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index c53afe8b033..db7939cfde8 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7118,8 +7118,19 @@ inline_expand_builtin_string_cmp (tree exp, rtx target)
 return NULL_RTX;
 
   /* For strncmp, if the length is not a const, not qualify.  */
-  if (is_ncmp && !tree_fits_uhwi_p (len3_tree))
-return NULL_RTX;
+  if (is_ncmp)
+{
+  if (!tree_fits_uhwi_p (len3_tree))
+	return NULL_RTX;
+  else
+	len3 = tree_to_uhwi (len3_tree);
+}
+
+  if (src_str1 != NULL)
+len1 = strnlen (src_str1, len1) + 1;
+
+  if (src_str2 != NULL)
+len2 = strnlen (src_str2, len2) + 1;
 
   int const_str_n = 0;
   if (!len1)
@@ -7134,7 +7145,7 @@ inline_expand_builtin_string_cmp (tree exp, rtx target)
   gcc_checking_assert (const_str_n > 0);
   length = (const_str_n == 1) ? len1 : len2;
 
-  if (is_ncmp && (len3 = tree_to_uhwi (len3_tree)) < length)
+  if (is_ncmp && len3 < length)
 length = len3;
 
   /* If the length of the comparision is larger than the threshold,
diff --git a/gcc/testsuite/gcc.dg/pr90892.c b/gcc/testsuite/gcc.dg/pr90892.c
new file mode 100644
index 000..e4b5310807a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr90892.c
@@ -0,0 +1,14 @@
+/* PR tree-optimization/90892 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+const char *a = "A\0b";
+
+int
+main()
+{
+  if (__builtin_strncmp(a, "A\0", 2) != 0)
+__builtin_abort ();
+
+  return 0;
+}
-- 
2.21.0



Re: [PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread Segher Boessenkool
On Thu, Jun 27, 2019 at 05:46:00AM -0600, Jan Beulich wrote:
> While maybe not explicitly applicable here, the intrinsics aren't
> (afaict) providing full flexibility. In particular (just as example)
> I haven't found a way to use embedded broadcast with the
> intrinsics, but I can easily do so with asm().
> 
> Furthermore there are other reasons to use asm() - things like
> the Linux kernel are full of it for a reason.

Some of it for a (good) reason.  Yes.

> And once one has
> to use asm(), the resulting code typically is easier to follow if
> one doesn't further intermix it with uses of builtins.

If you have to write more than a few lines of assembler, you are usually
*much* better off just writing it in an assembler source file (a .s).

> And finally, if asm() was indeed meant to be deprecated, how
> come it pretty recently got extended to allow for "inline"?

That was just a simple extension for a very specific purpose.  This does
not mean you should use inline assembler if there are better alternatives.
I'm not sure Linux using such extremely huge inline assembler statements
is a good idea at all, or if there are better ways to do what they want
(not that I see any fwiw); but they already did, and this takes the pain
away a bit for them, so why not.

You should not use inline assembler when there are better way of
expressing what you want.  Like with builtins, for example.  Inline
assembler is hard to write correctly, and hard to *keep* correct.  It is
mixing abstractions (that is its *purpose*), which of course makes it a
royal pain to deal with, esp. if overused.


Segher


Re: [PATCH] Deprecate -frepo option.

2019-06-27 Thread Jonathan Wakely
On Thu, 27 Jun 2019 at 13:30, Martin Liška  wrote:
>
> On 6/21/19 4:28 PM, Richard Biener wrote:
> > On Fri, Jun 21, 2019 at 4:13 PM Jakub Jelinek  wrote:
> >>
> >> On Fri, Jun 21, 2019 at 04:04:00PM +0200, Martin Liška wrote:
> >>> On 6/21/19 1:58 PM, Jakub Jelinek wrote:
>  On Fri, Jun 21, 2019 at 01:52:09PM +0200, Martin Liška wrote:
> > On 6/21/19 1:47 PM, Jonathan Wakely wrote:
> >> On Fri, 21 Jun 2019 at 11:40, Martin Liška wrote:
> >>> Yes, I would be fine to deprecate that for GCC 10.1
> >>
> >> Would it be appropriate to issue a warning in GCC 10.x if the option 
> >> is used?
> >
> > Sure. With the patch attached one will see:
> >
> > $ gcc -frepo /tmp/main.cc -c
> > gcc: warning: switch ‘-frepo’ is no longer supported
> >
> > I'm sending patch that also removes -frepo tests from test-suite.
> > I've been testing the patch.
> 
>  IMHO for just deprecation of an option you don't want to remove it from 
>  the
>  testsuite, just match the warning it will generate in those tests, and
>  I'm not convinced you want to remove it from the documentation (rather 
>  than
>  just saying in the documentation that the option is deprecated and might 
>  be
>  removed in a later GCC version).
> >>>
> >>> Agree with you. I'm sending updated version of the patch.
> >>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> I'm also not convinced about the Deprecated flag, seems like that is a flag
> >> that we use for options that have been already removed.
> >> So, instead there should be some proper warning in the C++ FE for it,
> >> or just Warn.
> >
> > In principle -frepo is a nice idea - does it live up to its promises?  That 
> > is,
> > does it actually work, for example when throwing it on the libstdc++
> > testsuite or a larger C++ project?
>
> @Jonathan, Jason: Do we know whether it really work?

I don't know. It's nearly 20 years since I've tried it, but apparently
a few people try using it:
https://stackoverflow.com/search?q=frepo+c%2B%2B

The first result was answered by me in 2012 saying nobody uses it:
https://stackoverflow.com/a/11832613/981959


Re: [PATCH] Deprecate -frepo option.

2019-06-27 Thread Martin Liška
On 6/21/19 4:28 PM, Richard Biener wrote:
> On Fri, Jun 21, 2019 at 4:13 PM Jakub Jelinek  wrote:
>>
>> On Fri, Jun 21, 2019 at 04:04:00PM +0200, Martin Liška wrote:
>>> On 6/21/19 1:58 PM, Jakub Jelinek wrote:
 On Fri, Jun 21, 2019 at 01:52:09PM +0200, Martin Liška wrote:
> On 6/21/19 1:47 PM, Jonathan Wakely wrote:
>> On Fri, 21 Jun 2019 at 11:40, Martin Liška wrote:
>>> Yes, I would be fine to deprecate that for GCC 10.1
>>
>> Would it be appropriate to issue a warning in GCC 10.x if the option is 
>> used?
>
> Sure. With the patch attached one will see:
>
> $ gcc -frepo /tmp/main.cc -c
> gcc: warning: switch ‘-frepo’ is no longer supported
>
> I'm sending patch that also removes -frepo tests from test-suite.
> I've been testing the patch.

 IMHO for just deprecation of an option you don't want to remove it from the
 testsuite, just match the warning it will generate in those tests, and
 I'm not convinced you want to remove it from the documentation (rather than
 just saying in the documentation that the option is deprecated and might be
 removed in a later GCC version).
>>>
>>> Agree with you. I'm sending updated version of the patch.
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> I'm also not convinced about the Deprecated flag, seems like that is a flag
>> that we use for options that have been already removed.
>> So, instead there should be some proper warning in the C++ FE for it,
>> or just Warn.
> 
> In principle -frepo is a nice idea - does it live up to its promises?  That 
> is,
> does it actually work, for example when throwing it on the libstdc++
> testsuite or a larger C++ project? 

@Jonathan, Jason: Do we know whether it really work?

> The option doesn't document
> optimization issues but I assume template bodies are not available
> for IPA optimizations unless -frepo is combined with LTO where the
> template CU[s] then bring them in.
> 
> So I'm not sure - do we really want to remove this feature?
> 
> Richard.
> 
>> Jakub



[PATCH] Add .gnu.lto_.lto section.

2019-06-27 Thread Martin Liška
Hi.

So this is a patch candidate for the .gnu.lto_.lto ELF section.
As mentioned, the section contains information about bytecode version,
compression algorithm used and slim LTO object flag.

One minor issues is that I need to append random suffix to the section:
  [ 6] .gnu.lto_.lto.36e74acfb145b04b PROGBITS 86 
08 00   E  0   0  1

That's because of in WPA we load more object files and the hash is used for 
mapping.

Is the idea of the patch right?
Martin
>From ec550463df26855982b7b70393766cc8e7c6cc8e Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 21 Jun 2019 12:14:04 +0200
Subject: [PATCH] Add .gnu.lto_.lto section.

---
 gcc/lto-section-in.c   |  9 +++--
 gcc/lto-section-out.c  |  2 --
 gcc/lto-streamer-out.c | 40 +---
 gcc/lto-streamer.h | 25 +
 gcc/lto/lto-common.c   | 15 +++
 5 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index 4cfc0cad4be..4e7d1181f23 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -52,10 +52,10 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "icf",
   "offload_table",
   "mode_table",
-  "hsa"
+  "hsa",
+  "lto"
 };
 
-
 /* Hooks so that the ipa passes can call into the lto front end to get
sections.  */
 
@@ -146,7 +146,7 @@ lto_get_section_data (struct lto_file_decl_data *file_data,
   /* WPA->ltrans streams are not compressed with exception of function bodies
  and variable initializers that has been verbatim copied from earlier
  compilations.  */
-  if (!flag_ltrans || decompress)
+  if ((!flag_ltrans || decompress) && section_type != LTO_section_lto)
 {
   /* Create a mapping header containing the underlying data and length,
 	 and prepend this to the uncompression buffer.  The uncompressed data
@@ -167,9 +167,6 @@ lto_get_section_data (struct lto_file_decl_data *file_data,
   data = buffer.data + header_length;
 }
 
-  lto_check_version (((const lto_header *)data)->major_version,
-		 ((const lto_header *)data)->minor_version,
-		 file_data->file_name);
   return data;
 }
 
diff --git a/gcc/lto-section-out.c b/gcc/lto-section-out.c
index c91e58f0465..7ae102164ef 100644
--- a/gcc/lto-section-out.c
+++ b/gcc/lto-section-out.c
@@ -285,8 +285,6 @@ lto_destroy_simple_output_block (struct lto_simple_output_block *ob)
   /* Write the header which says how to decode the pieces of the
  t.  */
   memset (, 0, sizeof (struct lto_simple_header));
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
   header.main_size = ob->main_stream->total_size;
   lto_write_data (, sizeof header);
 
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index dc68429303c..7dee770aa11 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1974,10 +1974,6 @@ produce_asm (struct output_block *ob, tree fn)
   /* The entire header is stream computed here.  */
   memset (, 0, sizeof (struct lto_function_header));
 
-  /* Write the header.  */
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
-
   if (section_type == LTO_section_function_body)
 header.cfg_size = ob->cfg_stream->total_size;
   header.main_size = ob->main_stream->total_size;
@@ -2270,10 +2266,6 @@ lto_output_toplevel_asms (void)
   /* The entire header stream is computed here.  */
   memset (, 0, sizeof (header));
 
-  /* Write the header.  */
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
-
   header.main_size = ob->main_stream->total_size;
   header.string_size = ob->string_stream->total_size;
   lto_write_data (, sizeof header);
@@ -2390,6 +2382,29 @@ prune_offload_funcs (void)
 DECL_PRESERVE_P (fn_decl) = 1;
 }
 
+/* Produce LTO section that contains global information
+   about LTO bytecode.  */
+
+static void
+produce_lto_section ()
+{
+  /* Stream LTO meta section.  */
+  output_block *ob = create_output_block (LTO_section_lto);
+
+  char * section_name = lto_get_section_name (LTO_section_lto, NULL, NULL);
+  lto_begin_section (section_name, false);
+  free (section_name);
+
+  lto_compression compression = ZLIB;
+
+  bool slim_object = flag_generate_lto && !flag_fat_lto_objects;
+  lto_section s
+= { LTO_major_version, LTO_minor_version, slim_object, compression, 0 };
+  lto_write_data (, sizeof s);
+  lto_end_section ();
+  destroy_output_block (ob);
+}
+
 /* Main entry point from the pass manager.  */
 
 void
@@ -2412,6 +2427,8 @@ lto_output (void)
   /* Initialize the streamer.  */
   lto_streamer_init ();
 
+  produce_lto_section ();
+
   n_nodes = lto_symtab_encoder_size (encoder);
   /* Process only the functions with bodies.  */
   for (i = 0; i < n_nodes; i++)
@@ -2827,10 +2844,6 @@ lto_write_mode_table (void)
   struct lto_simple_header_with_strings header;
   memset (, 0, sizeof (header));
 
-  /* Write the header.  */
-  

Re: [PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2019 at 2:23 PM Jan Beulich  wrote:
>
> >>> On 27.06.19 at 14:00,  wrote:
> > On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich  wrote:
> >>
> >> >>> On 27.06.19 at 13:09,  wrote:
> >> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich  wrote:
> >> >>
> >> >> Without these constraints asm() can't make use of mask registers.
> >> >
> >> > asm should be deprecated. We have intrinsics for this purpose.
> >>
> >> While maybe not explicitly applicable here, the intrinsics aren't
> >> (afaict) providing full flexibility. In particular (just as example)
> >> I haven't found a way to use embedded broadcast with the
> >> intrinsics, but I can easily do so with asm().
> >>
> >> Furthermore there are other reasons to use asm() - things like
> >> the Linux kernel are full of it for a reason. And once one has
> >> to use asm(), the resulting code typically is easier to follow if
> >> one doesn't further intermix it with uses of builtins.
> >>
> >> And finally, if asm() was indeed meant to be deprecated, how
> >> come it pretty recently got extended to allow for "inline"?
> >
> > I didn't mean that asm() in general should be deprecated, but for SSE
> > and other vector extensions, where intrinsics are available,
> > intrinsics should be used instead. There was exactly zero requests to
> > use new asm constraints, it looks that people are satisfied with
> > intrinsics approach (which is also future-proof, etc).
>
> So what about my embedded broadcast example then? "Zero
> requests" is clearly not exactly right. It simply didn't occur to me
> (until I noticed the @internal here) that I should raise such a
> request, rather than just using asm(). Subsequently I did then
> notice "Yh" going away, complicating things further ...

There was some work by HJ a while ago that implemented automatic
generation of embedded broadcasts. Perhaps he can answer the question.

Uros.

> I'd also like to note that the choice of types on some of the builtins
> makes it rather cumbersome to use them. Especially for scalar
> operations I've found myself better resorting to asm(). See
> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/tests/x86_emulator/simd.c
> (most of the changes submitted (not so) recently have been
> coming from the work of putting together this and its sibling
> tests for the Xen Project instruction emulator).
>
> Jan
>
>


Re: [PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread Jan Beulich
>>> On 27.06.19 at 14:00,  wrote:
> On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich  wrote:
>>
>> >>> On 27.06.19 at 13:09,  wrote:
>> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich  wrote:
>> >>
>> >> Without these constraints asm() can't make use of mask registers.
>> >
>> > asm should be deprecated. We have intrinsics for this purpose.
>>
>> While maybe not explicitly applicable here, the intrinsics aren't
>> (afaict) providing full flexibility. In particular (just as example)
>> I haven't found a way to use embedded broadcast with the
>> intrinsics, but I can easily do so with asm().
>>
>> Furthermore there are other reasons to use asm() - things like
>> the Linux kernel are full of it for a reason. And once one has
>> to use asm(), the resulting code typically is easier to follow if
>> one doesn't further intermix it with uses of builtins.
>>
>> And finally, if asm() was indeed meant to be deprecated, how
>> come it pretty recently got extended to allow for "inline"?
> 
> I didn't mean that asm() in general should be deprecated, but for SSE
> and other vector extensions, where intrinsics are available,
> intrinsics should be used instead. There was exactly zero requests to
> use new asm constraints, it looks that people are satisfied with
> intrinsics approach (which is also future-proof, etc).

So what about my embedded broadcast example then? "Zero
requests" is clearly not exactly right. It simply didn't occur to me
(until I noticed the @internal here) that I should raise such a
request, rather than just using asm(). Subsequently I did then
notice "Yh" going away, complicating things further ...

I'd also like to note that the choice of types on some of the builtins
makes it rather cumbersome to use them. Especially for scalar
operations I've found myself better resorting to asm(). See
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/tests/x86_emulator/simd.c
(most of the changes submitted (not so) recently have been
coming from the work of putting together this and its sibling
tests for the Xen Project instruction emulator).

Jan




Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default

2019-06-27 Thread Bill Schmidt
On 6/27/19 6:45 AM, Segher Boessenkool wrote:
> On Thu, Jun 27, 2019 at 11:33:45AM +0200, Richard Biener wrote:
>> On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt  wrote:
>>> We've done some experimenting and realized that the subject option almost
>>> always provide improved performance for Power when the loop unroller is
>>> enabled.  So this patch turns that flag on by default for us.
>> I guess it creates more freedom for combine (more single-uses) and register
>> allocation.  I wonder in which cases this might pessimize things?  I guess
>> the pre-RA scheduler might make RAs life harder with creating overlapping
>> life-ranges.
>>
>> I guess you didn't actually investigate the nature of the improvements you 
>> saw?
> It breaks the length of dependency chains by a factor equal to the unroll
> factor.  I do not know why this doesn't help a lot everywhere.  It of
> course raises register pressure, maybe that is just it?

Right, it's all about breaking dependencies to more efficiently exploit
the microarchitecture.  By default, variable expansion in GCC is quite
conservative, creating only two reduction streams out of one, so it's
pretty rare for it to cause spill.  This can be adjusted upwards with
--param max-variable-expansions-in-unroller=n.  Our experiments show
that raising n to 4 starts to cause some minor degradations, which are
almost certainly due to pressure, so the default setting looks appropriate.
>
>> Do we want to adjust the flags documentation, saying whether this is enabled
>> by default depends on the target (or even list them)?
> Good idea, thanks.

OK, I'll update the docs and make the change that Segher requested. 
Thanks for the reviews!

Bill
>
>
> Segher



Re: [PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich  wrote:
>
> >>> On 27.06.19 at 13:09,  wrote:
> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich  wrote:
> >>
> >> Without these constraints asm() can't make use of mask registers.
> >
> > asm should be deprecated. We have intrinsics for this purpose.
>
> While maybe not explicitly applicable here, the intrinsics aren't
> (afaict) providing full flexibility. In particular (just as example)
> I haven't found a way to use embedded broadcast with the
> intrinsics, but I can easily do so with asm().
>
> Furthermore there are other reasons to use asm() - things like
> the Linux kernel are full of it for a reason. And once one has
> to use asm(), the resulting code typically is easier to follow if
> one doesn't further intermix it with uses of builtins.
>
> And finally, if asm() was indeed meant to be deprecated, how
> come it pretty recently got extended to allow for "inline"?

I didn't mean that asm() in general should be deprecated, but for SSE
and other vector extensions, where intrinsics are available,
intrinsics should be used instead. There was exactly zero requests to
use new asm constraints, it looks that people are satisfied with
intrinsics approach (which is also future-proof, etc).

Uros.


Re: [PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread Jan Beulich
>>> On 27.06.19 at 13:09,  wrote:
> On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich  wrote:
>>
>> Without these constraints asm() can't make use of mask registers.
> 
> asm should be deprecated. We have intrinsics for this purpose.

While maybe not explicitly applicable here, the intrinsics aren't
(afaict) providing full flexibility. In particular (just as example)
I haven't found a way to use embedded broadcast with the
intrinsics, but I can easily do so with asm().

Furthermore there are other reasons to use asm() - things like
the Linux kernel are full of it for a reason. And once one has
to use asm(), the resulting code typically is easier to follow if
one doesn't further intermix it with uses of builtins.

And finally, if asm() was indeed meant to be deprecated, how
come it pretty recently got extended to allow for "inline"?

Jan




Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default

2019-06-27 Thread Segher Boessenkool
On Thu, Jun 27, 2019 at 11:33:45AM +0200, Richard Biener wrote:
> On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt  wrote:
> > We've done some experimenting and realized that the subject option almost
> > always provide improved performance for Power when the loop unroller is
> > enabled.  So this patch turns that flag on by default for us.
> 
> I guess it creates more freedom for combine (more single-uses) and register
> allocation.  I wonder in which cases this might pessimize things?  I guess
> the pre-RA scheduler might make RAs life harder with creating overlapping
> life-ranges.
> 
> I guess you didn't actually investigate the nature of the improvements you 
> saw?

It breaks the length of dependency chains by a factor equal to the unroll
factor.  I do not know why this doesn't help a lot everywhere.  It of
course raises register pressure, maybe that is just it?

> Do we want to adjust the flags documentation, saying whether this is enabled
> by default depends on the target (or even list them)?

Good idea, thanks.


Segher


Re: [PATCH] x86: fix/improve vgf2p8affine*qb insns

2019-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2019 at 1:39 PM Jan Beulich  wrote:
>
> >>> On 27.06.19 at 12:58,  wrote:
> > On Thu, Jun 27, 2019 at 12:49 PM Jan Beulich  wrote:
> >>
> >> >>> On 27.06.19 at 12:20,  wrote:
> >> > On Thu, Jun 27, 2019 at 10:57 AM Jan Beulich  wrote:
> >> >>
> >> >> - the affine transformations are not commutative (the two source
> >> >>   operands have entirely different meaning)
> >> >> - there's no need for three alternatives
> >> >> - the nonimmediate_operand/Bm combination can better be vector_operand/m
> >> >>
> >> >> gcc/
> >> >> 2019-06-27  Jan Beulich  
> >> >>
> >> >> * config/i386/sse.md (vgf2p8affineinvqb_,
> >> >> vgf2p8affineqb_): Drop % constraint modifier.
> >> >> Eliminate redundant alternative.  Use vector_operand plus "m"
> >> >> constraint.
> >> >
> >> > Please just drop % modifier and use vector_operand here. IIRC,
> >> > register allocator operates on constraints, it doesn't care for
> >> > predicates. But predicates shouldn't be more constrained than
> >> > constraints. So, having "m" instead of "Bm" is a bad idea with
> >> > vector_operand.
> >>
> >> Well, putting back the Bm is easy (if that's really needed). But do
> >> you also mean me to put back to redundant 3rd alternative?
> >
> > It is not redundant, "x" and "v" are different register constraints.
>
> Well, yes, "v" is covering a wider set than "x". But only if AVX512F
> is enabled. The original combinations ("x", "avx") and ("v", "avx512f")
> are thus effectively the same as the new ("v", "avx"). But yes - I
> guess I'll split this from the actual bug fix.

No, you are correct. Please use "v", and remove the redundant alternative.

Uros.


Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Jan Hubicka
> On Thu, 27 Jun 2019, Jan Hubicka wrote:
> 
> > Hi,
> > here is update patch I am re-testing. Ok if it suceeds?
> 
> + if (!type_in_anonymous_namespace_p (t))
> +   hash = htab_hash_string (IDENTIFIER_POINTER
> +  (DECL_ASSEMBLER_NAME
> +  (TYPE_NAME (t;
> + else
> +   hash = TYPE_UID (t);
> + num_canonical_type_hash_entries++;
> + bool existed_p = canonical_type_hash_cache->put (prevail, hash);
> 
> but you still cache the hash for prevail rather than t while we
> are eventually asking for the hash of t again?  I suppose not since
> we just adjusted TYPE_CANONICAL and we're then supposed to pick
> the hash from the canonical type?  I think this deserves a comment.
> And it should use 'prevail' then everywhere here for consistency,
> otherwise it really looks odd.

Yes, all ODR variants will get prevail as their TYPE_CANONICAL and
therefore we need to cache its hash.
I will add comment.
> 
> > @@ -357,7 +367,7 @@ iterative_hash_canonical_type (tree type  
> >optimal order.  To avoid quadratic behavior also register the
> >type here.  */
> >v = hash_canonical_type (type);
> > -  gimple_register_canonical_type_1 (type, v);
> > +  v = gimple_register_canonical_type_1 (type, v);
> >  }
> >hstate.add_int (v);
> >  }
> 
> Just noticed this should be
> 
> hstate.merge_hash (v);
> 
> results in the very same effect but is clearer IMHO.

OK, will fix that too, re-test and commit.
Thanks!

Honza


Re: [PATCH] x86: fix/improve vgf2p8affine*qb insns

2019-06-27 Thread Jan Beulich
>>> On 27.06.19 at 12:58,  wrote:
> On Thu, Jun 27, 2019 at 12:49 PM Jan Beulich  wrote:
>>
>> >>> On 27.06.19 at 12:20,  wrote:
>> > On Thu, Jun 27, 2019 at 10:57 AM Jan Beulich  wrote:
>> >>
>> >> - the affine transformations are not commutative (the two source
>> >>   operands have entirely different meaning)
>> >> - there's no need for three alternatives
>> >> - the nonimmediate_operand/Bm combination can better be vector_operand/m
>> >>
>> >> gcc/
>> >> 2019-06-27  Jan Beulich  
>> >>
>> >> * config/i386/sse.md (vgf2p8affineinvqb_,
>> >> vgf2p8affineqb_): Drop % constraint modifier.
>> >> Eliminate redundant alternative.  Use vector_operand plus "m"
>> >> constraint.
>> >
>> > Please just drop % modifier and use vector_operand here. IIRC,
>> > register allocator operates on constraints, it doesn't care for
>> > predicates. But predicates shouldn't be more constrained than
>> > constraints. So, having "m" instead of "Bm" is a bad idea with
>> > vector_operand.
>>
>> Well, putting back the Bm is easy (if that's really needed). But do
>> you also mean me to put back to redundant 3rd alternative?
> 
> It is not redundant, "x" and "v" are different register constraints.

Well, yes, "v" is covering a wider set than "x". But only if AVX512F
is enabled. The original combinations ("x", "avx") and ("v", "avx512f")
are thus effectively the same as the new ("v", "avx"). But yes - I
guess I'll split this from the actual bug fix.

Jan




[PATCH] PR libstdc++/85494 use rand_s in std::random_device

2019-06-27 Thread Jonathan Wakely

This is a minimal fix for the use of a deterministic RNG on mingw-w64,
simply using rand_s unconditionally. The rest of the r271740 changes on
trunk are not included. That means that RDSEED and RDRAND are not
available for mingw-w64 and the token passed to the constructor is
ignored completely.

PR libstdc++/85494 use rand_s in std::random_device
* config/os/mingw32-w64/os_defines.h (_GLIBCXX_USE_CRT_RAND_S): Define.
* src/c++11/cow-string-inst.cc (random_device::_M_init_pretr1)
[_GLIBCXX_USE_CRT_RAND_S]: Do nothing if rand_s will be used.
* src/c++11/random.cc [_GLIBCXX_USE_CRT_RAND_S] (__winxp_rand_s):
Define new function.
(random_device::_M_init_pretr1) [_GLIBCXX_USE_CRT_RAND_S]: Do nothing
if rand_s will be used.
(random_device::_M_getval_pretr1) [_GLIBCXX_USE_CRT_RAND_S]: Use
__winxp_rand_s().
* testsuite/26_numerics/random/random_device/85494.cc: New test.


Tested x86_64-pc-linux-gnu and x86_64-w64-mingw32, committed to
gcc-9-branch.

commit f055311955f357f567cd38d2fe8da2af16c1c3ae
Author: redi 
Date:   Thu Jun 27 11:31:02 2019 +

PR libstdc++/85494 use rand_s in std::random_device

This is a minimal fix for the use of a deterministic RNG on mingw-w64,
simply using rand_s unconditionally. The rest of the r271740 changes on
trunk are not included. That means that RDSEED and RDRAND are not
available for mingw-w64 and the token passed to the constructor is
ignored completely.

PR libstdc++/85494 use rand_s in std::random_device
* config/os/mingw32-w64/os_defines.h (_GLIBCXX_USE_CRT_RAND_S): 
Define.
* src/c++11/cow-string-inst.cc (random_device::_M_init_pretr1)
[_GLIBCXX_USE_CRT_RAND_S]: Do nothing if rand_s will be used.
* src/c++11/random.cc [_GLIBCXX_USE_CRT_RAND_S] (__winxp_rand_s):
Define new function.
(random_device::_M_init_pretr1) [_GLIBCXX_USE_CRT_RAND_S]: Do 
nothing
if rand_s will be used.
(random_device::_M_getval_pretr1) [_GLIBCXX_USE_CRT_RAND_S]: Use
__winxp_rand_s().
* testsuite/26_numerics/random/random_device/85494.cc: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@272748 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/config/os/mingw32-w64/os_defines.h 
b/libstdc++-v3/config/os/mingw32-w64/os_defines.h
index 6c19d3953a6..418c6f569df 100644
--- a/libstdc++-v3/config/os/mingw32-w64/os_defines.h
+++ b/libstdc++-v3/config/os/mingw32-w64/os_defines.h
@@ -88,4 +88,6 @@
 // See libstdc++/59807
 #define _GTHREAD_USE_MUTEX_INIT_FUNC 1
 
+#define _GLIBCXX_USE_CRT_RAND_S 1
+
 #endif
diff --git a/libstdc++-v3/src/c++11/cow-string-inst.cc 
b/libstdc++-v3/src/c++11/cow-string-inst.cc
index c36f297438e..a5e4ad29dc3 100644
--- a/libstdc++-v3/src/c++11/cow-string-inst.cc
+++ b/libstdc++-v3/src/c++11/cow-string-inst.cc
@@ -77,8 +77,9 @@ namespace std _GLIBCXX_VISIBILITY(default)
   }
 
   void
-  random_device::_M_init_pretr1(const std::string& token)
+  random_device::_M_init_pretr1(const std::string& token [[gnu::unused]])
   {
+#ifndef _GLIBCXX_USE_CRT_RAND_S
 unsigned long __seed = 5489UL;
 if (token != "mt19937")
   {
@@ -90,6 +91,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
 "(const std::string&)"));
   }
 _M_mt.seed(__seed);
+#endif
   }
 } // namespace
 #endif
diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc
index 1146d21c3f9..0bb6b6bf5b8 100644
--- a/libstdc++-v3/src/c++11/random.cc
+++ b/libstdc++-v3/src/c++11/random.cc
@@ -23,6 +23,8 @@
 // .
 
 #define _GLIBCXX_USE_CXX11_ABI 1
+#define _CRT_RAND_S // define this before including  to get rand_s
+
 #include 
 
 #ifdef  _GLIBCXX_USE_C99_STDINT_TR1
@@ -50,6 +52,10 @@
 # include 
 #endif
 
+#ifdef _GLIBCXX_USE_CRT_RAND_S
+# include 
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
   namespace
@@ -85,6 +91,18 @@ namespace std _GLIBCXX_VISIBILITY(default)
   return val;
 }
 #endif
+
+#ifdef _GLIBCXX_USE_CRT_RAND_S
+# pragma GCC poison _M_mt
+unsigned int
+__winxp_rand_s()
+{
+  unsigned int val;
+  if (::rand_s() != 0)
+   std::__throw_runtime_error(__N("random_device: rand_s failed"));
+  return val;
+}
+#endif
   }
 
   void
@@ -122,9 +140,11 @@ namespace std _GLIBCXX_VISIBILITY(default)
   }
 
   void
-  random_device::_M_init_pretr1(const std::string& token)
+  random_device::_M_init_pretr1(const std::string& token [[gnu::unused]])
   {
+#ifndef _GLIBCXX_USE_CRT_RAND_S
 _M_mt.seed(_M_strtoul(token));
+#endif
   }
 
   void
@@ -170,7 +190,11 @@ namespace std _GLIBCXX_VISIBILITY(default)
   random_device::result_type
   random_device::_M_getval_pretr1()
   {
+#ifdef _GLIBCXX_USE_CRT_RAND_S
+return __winxp_rand_s();
+#else
 return _M_mt();
+#endif
   }
 
   

Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default

2019-06-27 Thread Segher Boessenkool
Hi Bill,

On Wed, Jun 26, 2019 at 10:23:06PM -0500, Bill Schmidt wrote:
> 2019-06-27  Bill Schmidt  
> 
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Enable
>   -fvariable-expansion-in-unroller by default.
> 
> 
> --- gcc/config/rs6000/rs6000.c(revision 272719)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -3616,6 +3616,11 @@ rs6000_option_override_internal (bool global_init_
>&& !global_options_set.x_flag_asynchronous_unwind_tables)
>  flag_asynchronous_unwind_tables = 1;
>  
> +  /* -fvariable-expansion-in-unroller is a win for POWER whenever the
> + loop unroller is active.  It is only checked during unrolling, so
> + we can just set it on by default.  */
> +  flag_variable_expansion_in_unroller = 1;

You should test global_options_set like the asynch unwind thing above,
so that a user can still turn off variable expansion if he/she so desires.


Okay with that change.  Thanks!


Segher


[PATCH] Add late non-iterating FRE with optimize > 1

2019-06-27 Thread Richard Biener


This fixes FREs handling of TARGET_MEM_REF (it didn't consider
_MEM_REF) and adds a late FRE pass which has iteration
disabled and runs only at -O[2s]+ to limit the compile-time
impact.

This helps cases where unrolling and vectorization exposes
"piecewise" redundancies DOM cannot handle.  Thus

 (vector *) = { 1, 2, 3, 4 };
 .. = a[2];

there's still the opposite case not handled (PR83518) but
I will see whether I can make it work without too much cost:

 a[0] = 1;
 a[1] = 2;
 a[2] = 3;
 a[3] = 4;
 ... = (vector *)

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

I'll commit the TARGET_MEM_REF fixing indepenently.

Any comments?  I'm not sure I like globbing the iteration
parameter and the optimize > 1 check; maybe I should simply
rename it to 'late' ...

The compile-time impact might be non-trivial for those testcases
that run into a large overhead from the alias-stmt walking but
I didn't do any measurements yet.

Thanks,
Richard.

2019-06-27  Richard Biener  

* tree-ssa-sccvn.c (class pass_fre): Add may_iterate
pass parameter.
(pass_fre::execute): Honor it.
* passes.def: Adjust pass_fre invocations to allow iterating,
add non-iterating pass_fre before late threading/dom.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 272742)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -791,39 +791,6 @@ vn_reference_eq (const_vn_reference_t co
 static void
 copy_reference_ops_from_ref (tree ref, vec *result)
 {
-  if (TREE_CODE (ref) == TARGET_MEM_REF)
-{
-  vn_reference_op_s temp;
-
-  result->reserve (3);
-
-  memset (, 0, sizeof (temp));
-  temp.type = TREE_TYPE (ref);
-  temp.opcode = TREE_CODE (ref);
-  temp.op0 = TMR_INDEX (ref);
-  temp.op1 = TMR_STEP (ref);
-  temp.op2 = TMR_OFFSET (ref);
-  temp.off = -1;
-  temp.clique = MR_DEPENDENCE_CLIQUE (ref);
-  temp.base = MR_DEPENDENCE_BASE (ref);
-  result->quick_push (temp);
-
-  memset (, 0, sizeof (temp));
-  temp.type = NULL_TREE;
-  temp.opcode = ERROR_MARK;
-  temp.op0 = TMR_INDEX2 (ref);
-  temp.off = -1;
-  result->quick_push (temp);
-
-  memset (, 0, sizeof (temp));
-  temp.type = NULL_TREE;
-  temp.opcode = TREE_CODE (TMR_BASE (ref));
-  temp.op0 = TMR_BASE (ref);
-  temp.off = -1;
-  result->quick_push (temp);
-  return;
-}
-
   /* For non-calls, store the information that makes up the address.  */
   tree orig = ref;
   while (ref)
@@ -853,6 +820,20 @@ copy_reference_ops_from_ref (tree ref, v
  temp.base = MR_DEPENDENCE_BASE (ref);
  temp.reverse = REF_REVERSE_STORAGE_ORDER (ref);
  break;
+   case TARGET_MEM_REF:
+ /* The base address gets its own vn_reference_op_s structure.  */
+ temp.op0 = TMR_INDEX (ref);
+ temp.op1 = TMR_STEP (ref);
+ temp.op2 = TMR_OFFSET (ref);
+ temp.clique = MR_DEPENDENCE_CLIQUE (ref);
+ temp.base = MR_DEPENDENCE_BASE (ref);
+ result->safe_push (temp);
+ memset (, 0, sizeof (temp));
+ temp.type = NULL_TREE;
+ temp.opcode = ERROR_MARK;
+ temp.op0 = TMR_INDEX2 (ref);
+ temp.off = -1;
+ break;
case BIT_FIELD_REF:
  /* Record bits, position and storage order.  */
  temp.op0 = TREE_OPERAND (ref, 1);
@@ -6872,14 +6853,24 @@ class pass_fre : public gimple_opt_pass
 {
 public:
   pass_fre (gcc::context *ctxt)
-: gimple_opt_pass (pass_data_fre, ctxt)
+: gimple_opt_pass (pass_data_fre, ctxt), may_iterate (true)
   {}
 
   /* opt_pass methods: */
   opt_pass * clone () { return new pass_fre (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_fre != 0; }
+  void set_pass_param (unsigned int n, bool param)
+{
+  gcc_assert (n == 0);
+  may_iterate = param;
+}
+  virtual bool gate (function *)
+{
+  return flag_tree_fre != 0 && (may_iterate || optimize > 1);
+}
   virtual unsigned int execute (function *);
 
+private:
+  bool may_iterate;
 }; // class pass_fre
 
 unsigned int
@@ -6888,15 +6879,16 @@ pass_fre::execute (function *fun)
   unsigned todo = 0;
 
   /* At -O[1g] use the cheap non-iterating mode.  */
+  bool iterate_p = may_iterate && (optimize > 1);
   calculate_dominance_info (CDI_DOMINATORS);
-  if (optimize > 1)
+  if (iterate_p)
 loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
   default_vn_walk_kind = VN_WALKREWRITE;
-  todo = do_rpo_vn (fun, NULL, NULL, optimize > 1, true);
+  todo = do_rpo_vn (fun, NULL, NULL, iterate_p, true);
   free_rpo_vn ();
 
-  if (optimize > 1)
+  if (iterate_p)
 loop_optimizer_finalize ();
 
   return todo;
Index: gcc/passes.def
===
--- gcc/passes.def  (revision 272742)
+++ gcc/passes.def  (working copy)
@@ -83,7 +83,7 @@ along with GCC; see the file COPYING3.
 

[PATCH] Fix various issues seen with clang-static-analyzer.

2019-06-27 Thread Martin Liška
Hi.

The patch addressed couple of issues that are explained and I should
have permission to install the patch.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin

gcc/ChangeLog:

2019-06-27  Martin Liska  

PR tree-optimization/90974
PR rtl-optimization/90975
PR rtl-optimization/90976
PR target/91016
PR tree-optimization/91017
* config/i386/i386-expand.c (ix86_expand_rounddf_32): Remove
unused tmp.
* lra.c (lra_set_insn_recog_data): Remove a leftover from
initial commit of IRA.
* optabs.c (expand_twoval_binop): Use xop0 and xop1 instead
of op0 and op1.
* tree-vect-loop.c (vect_create_epilog_for_reduction):
Remove unused mode1.
* tree-vect-stmts.c (vectorizable_call): Remove dead assignment
to new_stmt_info.
---
 gcc/config/i386/i386-expand.c | 5 ++---
 gcc/lra.c | 8 ++--
 gcc/optabs.c  | 4 ++--
 gcc/tree-vect-loop.c  | 1 -
 gcc/tree-vect-stmts.c | 6 ++
 5 files changed, 8 insertions(+), 16 deletions(-)


diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index d50b811d863..8a4955f87d2 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -16052,14 +16052,13 @@ ix86_expand_rounddf_32 (rtx operand0, rtx operand1)
 			   0, OPTAB_DIRECT);
 
   /* Compensate.  */
-  tmp = gen_reg_rtx (mode);
   /* xa2 = xa2 - (dxa > 0.5 ? 1 : 0) */
   tmp = ix86_expand_sse_compare_mask (UNGT, dxa, half, false);
-  emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, one, tmp)));
+  emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, tmp, one)));
   xa2 = expand_simple_binop (mode, MINUS, xa2, tmp, NULL_RTX, 0, OPTAB_DIRECT);
   /* xa2 = xa2 + (dxa <= -0.5 ? 1 : 0) */
   tmp = ix86_expand_sse_compare_mask (UNGE, mhalf, dxa, false);
-  emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, one, tmp)));
+  emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, tmp, one)));
   xa2 = expand_simple_binop (mode, PLUS, xa2, tmp, NULL_RTX, 0, OPTAB_DIRECT);
 
   /* res = copysign (xa2, operand1) */
diff --git a/gcc/lra.c b/gcc/lra.c
index bef2f676a78..982a3cc630b 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -1029,12 +1029,8 @@ lra_set_insn_recog_data (rtx_insn *insn)
 			   data->operand_loc,
 			   constraints, operand_mode, NULL);
 	  if (nop > 0)
-	{
-	  const char *p =  recog_data.constraints[0];
-
-	  for (p =	constraints[0]; *p; p++)
-		nalt += *p == ',';
-	}
+	for (const char *p =constraints[0]; *p; p++)
+	  nalt += *p == ',';
 	  data->insn_static_data = insn_static_data
 	= get_static_insn_data (-1, nop, 0, nalt);
 	  for (i = 0; i < nop; i++)
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 5a718e7f635..18ca7370917 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -2095,8 +2095,8 @@ expand_twoval_binop (optab binoptab, rtx op0, rtx op1, rtx targ0, rtx targ1,
   xop1 = avoid_expensive_constant (mode1, binoptab, 1, xop1, unsignedp);
 
   create_fixed_operand ([0], targ0);
-  create_convert_operand_from ([1], op0, mode, unsignedp);
-  create_convert_operand_from ([2], op1, mode, unsignedp);
+  create_convert_operand_from ([1], xop0, mode, unsignedp);
+  create_convert_operand_from ([2], xop1, mode, unsignedp);
   create_fixed_operand ([3], targ1);
   if (maybe_expand_insn (icode, 4, ops))
 	return 1;
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index b37bf6f427d..a6b0e5c5fdf 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -5431,7 +5431,6 @@ vect_create_epilog_for_reduction (vec vect_defs,
 dump_printf_loc (MSG_NOTE, vect_location,
 			 "Reduce using vector shifts\n");
 
-	  mode1 = TYPE_MODE (vectype1);
   vec_dest = vect_create_destination_var (scalar_dest, vectype1);
   for (elt_offset = nelements / 2;
elt_offset >= 1;
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 47da2953dc7..415ac0c8679 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -3483,8 +3483,7 @@ vectorizable_call (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 			= gimple_build_call_internal_vec (ifn, vargs);
 		  gimple_call_set_lhs (call, half_res);
 		  gimple_call_set_nothrow (call, true);
-		  new_stmt_info
-			= vect_finish_stmt_generation (stmt_info, call, gsi);
+		  vect_finish_stmt_generation (stmt_info, call, gsi);
 		  if ((i & 1) == 0)
 			{
 			  prev_res = half_res;
@@ -3583,8 +3582,7 @@ vectorizable_call (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 	  gcall *call = gimple_build_call_internal_vec (ifn, vargs);
 	  gimple_call_set_lhs (call, half_res);
 	  gimple_call_set_nothrow (call, true);
-	  new_stmt_info
-		= vect_finish_stmt_generation (stmt_info, call, gsi);
+	  vect_finish_stmt_generation (stmt_info, call, gsi);
 	  if ((j & 1) == 0)
 		{
 		  prev_res = half_res;



Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Richard Biener
On Thu, 27 Jun 2019, Jan Hubicka wrote:

> Hi,
> here is update patch I am re-testing. Ok if it suceeds?

+ if (!type_in_anonymous_namespace_p (t))
+   hash = htab_hash_string (IDENTIFIER_POINTER
+  (DECL_ASSEMBLER_NAME
+  (TYPE_NAME (t;
+ else
+   hash = TYPE_UID (t);
+ num_canonical_type_hash_entries++;
+ bool existed_p = canonical_type_hash_cache->put (prevail, hash);

but you still cache the hash for prevail rather than t while we
are eventually asking for the hash of t again?  I suppose not since
we just adjusted TYPE_CANONICAL and we're then supposed to pick
the hash from the canonical type?  I think this deserves a comment.
And it should use 'prevail' then everywhere here for consistency,
otherwise it really looks odd.

> @@ -357,7 +367,7 @@ iterative_hash_canonical_type (tree type  
>optimal order.  To avoid quadratic behavior also register the
>type here.  */
>v = hash_canonical_type (type);
> -  gimple_register_canonical_type_1 (type, v);
> +  v = gimple_register_canonical_type_1 (type, v);
>  }
>hstate.add_int (v);
>  }

Just noticed this should be

hstate.merge_hash (v);

results in the very same effect but is clearer IMHO.

Richard.


> Orace quary stats finished in meantime.
> 
> Alias oracle query stats:
>   refs_may_alias_p: 39232255 disambiguations, 47436580 queries
>   ref_maybe_used_by_call_p: 59801 disambiguations, 39811399 queries
>   call_may_clobber_ref_p: 5967 disambiguations, 9009 queries
>   nonoverlapping_component_refs_p: 10184 disambiguations, 754231 queries
>   nonoverlapping_component_refs_of_decl_p: 6488 disambiguations, 214611 
> queries
>   aliasing_component_refs_p: 88820 disambiguations, 291741 queries
>   TBAA oracle: 11687911 disambiguations 34188770 queries
>13343414 are in alias set 0
>5417962 queries asked about the same object
>148 queries asked about the same alias set
>0 access volatile
>1672142 are dependent in the DAG
>2067193 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 502249 disambiguations, 7180660 queries
>   pt_solutions_intersect: 357826 disambiguations, 7059723 queries
> 
> So there is no big change in TBAA effectivity compared to previous patch.
> There is however nice improvements in aliasing_component_refs_p 
> 61523 -> 88820 which I think is due to Jason's change
> 
> Honza
> 
> Index: cp/class.c
> ===
> --- cp/class.c(revision 272732)
> +++ cp/class.c(working copy)
> @@ -6395,6 +6395,7 @@ layout_class_type (tree t, tree *virtual
>SET_TYPE_ALIGN (base_t, rli->record_align);
>TYPE_USER_ALIGN (base_t) = TYPE_USER_ALIGN (t);
>TYPE_TYPELESS_STORAGE (base_t) = TYPE_TYPELESS_STORAGE (t);
> +  TYPE_CXX_ODR_P (base_t) = TYPE_CXX_ODR_P (t);
>  
>/* Copy the non-static data members of T. This will include its
>direct non-virtual bases & vtable.  */
> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c  (revision 272732)
> +++ ipa-devirt.c  (working copy)
> @@ -213,6 +213,8 @@ struct GTY(()) odr_type_d
>bool odr_violated;
>/* Set when virtual table without RTTI previaled table with.  */
>bool rtti_broken;
> +  /* Set when the canonical type is determined using the type name.  */
> +  bool tbaa_enabled;
>  };
>  
>  /* Return TRUE if all derived types of T are known and thus
> @@ -1610,6 +1612,9 @@ add_type_duplicate (odr_type val, tree t
>  
>val->types_set->add (type);
>  
> +  if (!odr_hash)
> +return NULL;
> +
>gcc_checking_assert (can_be_name_hashed_p (type)
>  && can_be_name_hashed_p (val->type));
>  
> @@ -1989,6 +1994,46 @@ prevailing_odr_type (tree type)
>return t->type;
>  }
>  
> +/* Set tbaa_enabled flag for TYPE.  */
> +
> +void
> +enable_odr_based_tbaa (tree type)
> +{
> +  odr_type t = get_odr_type (type, true);
> +  t->tbaa_enabled = true;
> +}
> +
> +/* True if canonical type of TYPE is determined using ODR name.  */
> +
> +bool
> +odr_based_tbaa_p (const_tree type)
> +{
> +  if (!RECORD_OR_UNION_TYPE_P (type))
> +return false;
> +  odr_type t = get_odr_type (const_cast  (type), false);
> +  if (!t || !t->tbaa_enabled)
> +return false;
> +  return true;
> +}
> +
> +/* Set TYPE_CANONICAL of type and all its variants and duplicates
> +   to CANONICAL.  */
> +
> +void
> +set_type_canonical_for_odr_type (tree type, tree canonical)
> +{
> +  odr_type t = get_odr_type (type, false);
> +  unsigned int i;
> +  tree tt;
> +
> +  for (tree t2 = t->type; t2; t2 = TYPE_NEXT_VARIANT (t2))
> +TYPE_CANONICAL (t2) = canonical;
> +  if (t->types)
> +FOR_EACH_VEC_ELT (*t->types, i, tt)
> +  

Re: [PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich  wrote:
>
> Without these constraints asm() can't make use of mask registers.

asm should be deprecated. We have intrinsics for this purpose.

Uros.

> gcc/
> 2019-06-27  Jan Beulich  
>
> * config/i386/constraints.md: Remove @internal from "k" and
> "Yk".
>
> --- a/gcc/config/i386/constraints.md
> +++ b/gcc/config/i386/constraints.md
> @@ -79,10 +79,10 @@
>   "Second from top of 80387 floating-point stack (@code{%st(1)}).")
>
>  (define_register_constraint "Yk" "TARGET_AVX512F ? MASK_REGS : NO_REGS"
> -"@internal Any mask register that can be used as predicate, i.e. k1-k7.")
> +"Any mask register that can be used as predicate, i.e. k1-k7.")
>
>  (define_register_constraint "k" "TARGET_AVX512F ? ALL_MASK_REGS : NO_REGS"
> -"@internal Any mask register.")
> +"Any mask register.")
>
>  ;; Vector registers (also used for plain floating point nowadays).
>  (define_register_constraint "y" "TARGET_MMX ? MMX_REGS : NO_REGS"
>
>


Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Jan Hubicka
Hi,
here is update patch I am re-testing. Ok if it suceeds?

Orace quary stats finished in meantime.

Alias oracle query stats:
  refs_may_alias_p: 39232255 disambiguations, 47436580 queries
  ref_maybe_used_by_call_p: 59801 disambiguations, 39811399 queries
  call_may_clobber_ref_p: 5967 disambiguations, 9009 queries
  nonoverlapping_component_refs_p: 10184 disambiguations, 754231 queries
  nonoverlapping_component_refs_of_decl_p: 6488 disambiguations, 214611 queries
  aliasing_component_refs_p: 88820 disambiguations, 291741 queries
  TBAA oracle: 11687911 disambiguations 34188770 queries
   13343414 are in alias set 0
   5417962 queries asked about the same object
   148 queries asked about the same alias set
   0 access volatile
   1672142 are dependent in the DAG
   2067193 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 502249 disambiguations, 7180660 queries
  pt_solutions_intersect: 357826 disambiguations, 7059723 queries

So there is no big change in TBAA effectivity compared to previous patch.
There is however nice improvements in aliasing_component_refs_p 
61523 -> 88820 which I think is due to Jason's change

Honza

Index: cp/class.c
===
--- cp/class.c  (revision 272732)
+++ cp/class.c  (working copy)
@@ -6395,6 +6395,7 @@ layout_class_type (tree t, tree *virtual
   SET_TYPE_ALIGN (base_t, rli->record_align);
   TYPE_USER_ALIGN (base_t) = TYPE_USER_ALIGN (t);
   TYPE_TYPELESS_STORAGE (base_t) = TYPE_TYPELESS_STORAGE (t);
+  TYPE_CXX_ODR_P (base_t) = TYPE_CXX_ODR_P (t);
 
   /* Copy the non-static data members of T. This will include its
 direct non-virtual bases & vtable.  */
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 272732)
+++ ipa-devirt.c(working copy)
@@ -213,6 +213,8 @@ struct GTY(()) odr_type_d
   bool odr_violated;
   /* Set when virtual table without RTTI previaled table with.  */
   bool rtti_broken;
+  /* Set when the canonical type is determined using the type name.  */
+  bool tbaa_enabled;
 };
 
 /* Return TRUE if all derived types of T are known and thus
@@ -1610,6 +1612,9 @@ add_type_duplicate (odr_type val, tree t
 
   val->types_set->add (type);
 
+  if (!odr_hash)
+return NULL;
+
   gcc_checking_assert (can_be_name_hashed_p (type)
   && can_be_name_hashed_p (val->type));
 
@@ -1989,6 +1994,46 @@ prevailing_odr_type (tree type)
   return t->type;
 }
 
+/* Set tbaa_enabled flag for TYPE.  */
+
+void
+enable_odr_based_tbaa (tree type)
+{
+  odr_type t = get_odr_type (type, true);
+  t->tbaa_enabled = true;
+}
+
+/* True if canonical type of TYPE is determined using ODR name.  */
+
+bool
+odr_based_tbaa_p (const_tree type)
+{
+  if (!RECORD_OR_UNION_TYPE_P (type))
+return false;
+  odr_type t = get_odr_type (const_cast  (type), false);
+  if (!t || !t->tbaa_enabled)
+return false;
+  return true;
+}
+
+/* Set TYPE_CANONICAL of type and all its variants and duplicates
+   to CANONICAL.  */
+
+void
+set_type_canonical_for_odr_type (tree type, tree canonical)
+{
+  odr_type t = get_odr_type (type, false);
+  unsigned int i;
+  tree tt;
+
+  for (tree t2 = t->type; t2; t2 = TYPE_NEXT_VARIANT (t2))
+TYPE_CANONICAL (t2) = canonical;
+  if (t->types)
+FOR_EACH_VEC_ELT (*t->types, i, tt)
+  for (tree t2 = tt; t2; t2 = TYPE_NEXT_VARIANT (t2))
+TYPE_CANONICAL (t2) = canonical;
+}
+
 /* Return true if we reported some ODR violation on TYPE.  */
 
 bool
Index: ipa-utils.h
===
--- ipa-utils.h (revision 272732)
+++ ipa-utils.h (working copy)
@@ -93,6 +93,9 @@ bool odr_or_derived_type_p (const_tree t
 bool odr_types_equivalent_p (tree type1, tree type2);
 bool odr_type_violation_reported_p (tree type);
 tree prevailing_odr_type (tree type);
+void enable_odr_based_tbaa (tree type);
+bool odr_based_tbaa_p (const_tree type);
+void set_type_canonical_for_odr_type (tree type, tree canonical);
 
 /* Return vector containing possible targets of polymorphic call E.
If COMPLETEP is non-NULL, store true if the list is complete. 
Index: lto/lto-common.c
===
--- lto/lto-common.c(revision 272732)
+++ lto/lto-common.c(working copy)
@@ -1,5 +1,5 @@
 /* Top-level LTO routines.
-   Copyright (C) 2009-2018 Free Software Foundation, Inc.
+   Copyright (C) 2009-2019 Free Software Foundation, Inc.
Contributed by CodeSourcery, Inc.
 
 This file is part of GCC.
@@ -56,6 +56,11 @@ along with GCC; see the file COPYING3.
 #include "attribs.h"
 #include "builtins.h"
 #include "lto-common.h"
+#include "tree-pretty-print.h"
+
+/* True when no new types are going to be streamd from the global stream.  */
+
+static bool type_streaming_finished = 

Re: [PATCH] x86: fix/improve vgf2p8affine*qb insns

2019-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2019 at 12:49 PM Jan Beulich  wrote:
>
> >>> On 27.06.19 at 12:20,  wrote:
> > On Thu, Jun 27, 2019 at 10:57 AM Jan Beulich  wrote:
> >>
> >> - the affine transformations are not commutative (the two source
> >>   operands have entirely different meaning)
> >> - there's no need for three alternatives
> >> - the nonimmediate_operand/Bm combination can better be vector_operand/m
> >>
> >> gcc/
> >> 2019-06-27  Jan Beulich  
> >>
> >> * config/i386/sse.md (vgf2p8affineinvqb_,
> >> vgf2p8affineqb_): Drop % constraint modifier.
> >> Eliminate redundant alternative.  Use vector_operand plus "m"
> >> constraint.
> >
> > Please just drop % modifier and use vector_operand here. IIRC,
> > register allocator operates on constraints, it doesn't care for
> > predicates. But predicates shouldn't be more constrained than
> > constraints. So, having "m" instead of "Bm" is a bad idea with
> > vector_operand.
>
> Well, putting back the Bm is easy (if that's really needed). But do
> you also mean me to put back to redundant 3rd alternative?

It is not redundant, "x" and "v" are different register constraints.

Uros.


Re: [PATCH] x86: fix/improve vgf2p8affine*qb insns

2019-06-27 Thread Jan Beulich
>>> On 27.06.19 at 12:20,  wrote:
> On Thu, Jun 27, 2019 at 10:57 AM Jan Beulich  wrote:
>>
>> - the affine transformations are not commutative (the two source
>>   operands have entirely different meaning)
>> - there's no need for three alternatives
>> - the nonimmediate_operand/Bm combination can better be vector_operand/m
>>
>> gcc/
>> 2019-06-27  Jan Beulich  
>>
>> * config/i386/sse.md (vgf2p8affineinvqb_,
>> vgf2p8affineqb_): Drop % constraint modifier.
>> Eliminate redundant alternative.  Use vector_operand plus "m"
>> constraint.
> 
> Please just drop % modifier and use vector_operand here. IIRC,
> register allocator operates on constraints, it doesn't care for
> predicates. But predicates shouldn't be more constrained than
> constraints. So, having "m" instead of "Bm" is a bad idea with
> vector_operand.

Well, putting back the Bm is easy (if that's really needed). But do
you also mean me to put back to redundant 3rd alternative?

Jan




Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Jan Hubicka
> > Actually this was intended to be prevail in both cases, but it is
> > harmless.  The difference here is that anonymous types have
> > DECL_ASSEMBLED_NAME , so if we inserted them to the hash
> > table based on names they will all conflict.
> > 
> > Anonymous namespace types never have duplicated main variants, so
> > t==prevail here, but I will update the patch to use prevail since it is
> > much more obvious (I am also not sure what I was thinking of when
> > I typed TYPE_MAIN_VARIANT)
> 
> I think using 't' is more obvious here since that's what we should cache
> the hash value for.  That is, even in the !type_in_anonymous_namespace_p 
> (t) case you want to cache 't's hash, not prevails.  But yes,
> TYPE_UID (prevail) might be more obvious to you.

OK, I will use t on both: for !anonoymous_namespace types the names are
the same.

Honza
> 
> > > 
> > > Otherwise looks good.
> > > 
> > > You can commit the C++ FE change with the adjustment in case it
> > > fixes the reported verification ICEs.
> > 
> > It only fixes ICE WRT the sanity checking that all non-ODR types
> > are inserted first.  W/o that change the as-base type will get inserted
> > during streaming in and recursively we will insert ODR types early.
> > 
> > I am waiting for testcase WRT the other ODR ICE reported today. I think
> > tree.c when creating type variants wants to copy the flag: the wrong
> > type is va_list which is likely created by a backend bypassing the C++
> > hook.
> > 
> > I will re-test with these changes.
> > 
> > Honza
> > 
> 
> -- 
> Richard Biener 
> SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)



Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Richard Biener
On Thu, 27 Jun 2019, Jan Hubicka wrote:

> > > +  if (RECORD_OR_UNION_TYPE_P (t)
> > > +  && odr_type_p (t) && !odr_type_violation_reported_p (t))
> > > +{
> > > +  /* Here we rely on fact that all non-ODR types was inserted into
> > > +  canonical type hash and thus we can safely detect conflicts between
> > > +  ODR types and interoperable non-ODR types.  */
> > > +  gcc_checking_assert (type_streaming_finished
> > > +&& TYPE_MAIN_VARIANT (t) == t);
> > > +  slot = htab_find_slot_with_hash (gimple_canonical_types, t, hash,
> > > +NO_INSERT);
> > > +  if (slot && !TYPE_CXX_ODR_P (*(tree *)slot))
> > > + {
> > > +   tree nonodr = *(tree *)slot;
> > > +   if (symtab->dump_file)
> > > + {
> > > +   char *name = cplus_demangle_v3
> > > +  (IDENTIFIER_POINTER
> > > +(DECL_ASSEMBLER_NAME (TYPE_NAME (t))), 0);
> > 
> > Ugh - do we really want to demangle here?  I think not.  Surely
> > not without -details or with -slim.  Readers can c++filt this
> > easily.
> 
> OK, i can dump mangled name. 
> Just print_generic_expr prints all instantiations of the tempalte same
> way which makes it look odd that there are so many types of same name.
> > > +   /* Set canonical for T and all other ODR equivalent duplicates
> > > +  including incomplete structures.  */
> > > +   set_type_canonical_for_odr_type (t, prevail);
> > > +   enable_odr_based_tbaa (t);
> > 
> > I suppose there never will be a set of ODR types with the same
> > prevailing type but some of them having a conflict with a nonodr
> > type and some not?
> 
> No, we check them by structural equality while verifying ODR violations
> that is more strict than gimple_canonical_types_compatible.
> > 
> > > +   if (!type_in_anonymous_namespace_p (t))
> > > + hash = htab_hash_string (IDENTIFIER_POINTER
> > > +(DECL_ASSEMBLER_NAME
> > > +(TYPE_NAME (prevail;
> > > +   else
> > > + hash = TYPE_UID (TYPE_MAIN_VARIANT (t));
> > > +   num_canonical_type_hash_entries++;
> > > +   bool existed_p = canonical_type_hash_cache->put (prevail, hash);
> > 
> > but those hash differently?  I think you wanted to put t, not prevail
> > here.  And you want to use the TYPE_UID of prevail as well?
> 
> Actually this was intended to be prevail in both cases, but it is
> harmless.  The difference here is that anonymous types have
> DECL_ASSEMBLED_NAME , so if we inserted them to the hash
> table based on names they will all conflict.
> 
> Anonymous namespace types never have duplicated main variants, so
> t==prevail here, but I will update the patch to use prevail since it is
> much more obvious (I am also not sure what I was thinking of when
> I typed TYPE_MAIN_VARIANT)

I think using 't' is more obvious here since that's what we should cache
the hash value for.  That is, even in the !type_in_anonymous_namespace_p 
(t) case you want to cache 't's hash, not prevails.  But yes,
TYPE_UID (prevail) might be more obvious to you.

> > 
> > Otherwise looks good.
> > 
> > You can commit the C++ FE change with the adjustment in case it
> > fixes the reported verification ICEs.
> 
> It only fixes ICE WRT the sanity checking that all non-ODR types
> are inserted first.  W/o that change the as-base type will get inserted
> during streaming in and recursively we will insert ODR types early.
> 
> I am waiting for testcase WRT the other ODR ICE reported today. I think
> tree.c when creating type variants wants to copy the flag: the wrong
> type is va_list which is likely created by a backend bypassing the C++
> hook.
> 
> I will re-test with these changes.
> 
> Honza
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Jan Hubicka
> > +  if (RECORD_OR_UNION_TYPE_P (t)
> > +  && odr_type_p (t) && !odr_type_violation_reported_p (t))
> > +{
> > +  /* Here we rely on fact that all non-ODR types was inserted into
> > +canonical type hash and thus we can safely detect conflicts between
> > +ODR types and interoperable non-ODR types.  */
> > +  gcc_checking_assert (type_streaming_finished
> > +  && TYPE_MAIN_VARIANT (t) == t);
> > +  slot = htab_find_slot_with_hash (gimple_canonical_types, t, hash,
> > +  NO_INSERT);
> > +  if (slot && !TYPE_CXX_ODR_P (*(tree *)slot))
> > +   {
> > + tree nonodr = *(tree *)slot;
> > + if (symtab->dump_file)
> > +   {
> > + char *name = cplus_demangle_v3
> > +(IDENTIFIER_POINTER
> > +  (DECL_ASSEMBLER_NAME (TYPE_NAME (t))), 0);
> 
> Ugh - do we really want to demangle here?  I think not.  Surely
> not without -details or with -slim.  Readers can c++filt this
> easily.

OK, i can dump mangled name. 
Just print_generic_expr prints all instantiations of the tempalte same
way which makes it look odd that there are so many types of same name.
> > + /* Set canonical for T and all other ODR equivalent duplicates
> > +including incomplete structures.  */
> > + set_type_canonical_for_odr_type (t, prevail);
> > + enable_odr_based_tbaa (t);
> 
> I suppose there never will be a set of ODR types with the same
> prevailing type but some of them having a conflict with a nonodr
> type and some not?

No, we check them by structural equality while verifying ODR violations
that is more strict than gimple_canonical_types_compatible.
> 
> > + if (!type_in_anonymous_namespace_p (t))
> > +   hash = htab_hash_string (IDENTIFIER_POINTER
> > +  (DECL_ASSEMBLER_NAME
> > +  (TYPE_NAME (prevail;
> > + else
> > +   hash = TYPE_UID (TYPE_MAIN_VARIANT (t));
> > + num_canonical_type_hash_entries++;
> > + bool existed_p = canonical_type_hash_cache->put (prevail, hash);
> 
> but those hash differently?  I think you wanted to put t, not prevail
> here.  And you want to use the TYPE_UID of prevail as well?

Actually this was intended to be prevail in both cases, but it is
harmless.  The difference here is that anonymous types have
DECL_ASSEMBLED_NAME , so if we inserted them to the hash
table based on names they will all conflict.

Anonymous namespace types never have duplicated main variants, so
t==prevail here, but I will update the patch to use prevail since it is
much more obvious (I am also not sure what I was thinking of when
I typed TYPE_MAIN_VARIANT)
> 
> Otherwise looks good.
> 
> You can commit the C++ FE change with the adjustment in case it
> fixes the reported verification ICEs.

It only fixes ICE WRT the sanity checking that all non-ODR types
are inserted first.  W/o that change the as-base type will get inserted
during streaming in and recursively we will insert ODR types early.

I am waiting for testcase WRT the other ODR ICE reported today. I think
tree.c when creating type variants wants to copy the flag: the wrong
type is va_list which is likely created by a backend bypassing the C++
hook.

I will re-test with these changes.

Honza


Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Richard Biener
On Thu, 27 Jun 2019, Jan Hubicka wrote:

> Hi,
> this is updated patch for ODR based canonical type calculation.  It makes use
> of TYPE_CXX_ODR_P instead of doing the guesswork and while thinking how to get
> rid of the quadratic behaviour of the hash I noticed that all the logic fits
> quite naturally into gimple_register_canonical_type_1.
> 
> We only need to arrange that all non-ODR types gets to hash before we has
> TYPE_CXX_ODR_P and since hash functions recursively populate the type hash
> we know that subtypes with linkage are processed before types.
> 
> From the WPA stats we now get relatively few non-ODR types building cc1plus
> [WPA] Compared 1062055 SCCs, 890518 collisions (0.838486)
> [WPA] Merged 1059104 SCCs
> [WPA] Merged 6025225 tree bodies
> [WPA] Merged 639655 types
> [WPA] 101515 types prevailed (177264 associated trees)
> [WPA] GIMPLE canonical type table: size 16381, 262 elements, 5685 searches, 
> 48 collisions (ratio: 0.008443)
> [WPA] GIMPLE canonical type pointer-map: 3548 elements, 14261 searches
> 
> So 262 distinct types compared to 1590 which is an effect of Jason's patch.
> There are 3286 ODR types having type canonical set by name and 910 have
> conflict to non-ODR type.
> 
> I am still waiting for the statistics of alias oracle, but for tramp3d there
> are no significant changes compared to the first patch.
> 
> lto-bootstrapped/regtested x86_64-linux, OK?
> 
>   * class.c (layout_class_type): Set TYPE_CXX_ODR_P for as-base
>   type copy.
> 
>   * ipa-devirt.c (odr_type_d): Add tbaa_enabled flag.
>   (add_type_duplicate): When odr hash is not allocated, to nothing.
>   (odr_based_tbaa_p): New function.
>   (set_type_canonical_for_odr_type): New function.
>   * ipa-utils.h (enable_odr_based_tbaa, odr_based_tbaa_p,
>   set_type_canonical_for_odr_type): New.
>   * lto-common.c: Include demangle.h and tree-pretty-print.h
>   (type_streaming_finished): New static var.
>   (gimple_register_canonical_type_1): Return updated hash; handle ODR
>   types.
>   (iterative_hash_canonical_type): Update use of
>   gimple_register_canonical_type_1.
>   * tree.c (gimple_canonical_types_compatible_p): ODR types with
>   ODR based TBAA are not equivalent to non-ODR types.
> 
>   * g++.dg/lto/alias-2_0.C: New testcase.
>   * g++.dg/lto/alias-2_1.C: New testcase.
> Index: cp/class.c
> ===
> --- cp/class.c(revision 272656)
> +++ cp/class.c(working copy)
> @@ -6395,6 +6395,7 @@ layout_class_type (tree t, tree *virtual
>SET_TYPE_ALIGN (base_t, rli->record_align);
>TYPE_USER_ALIGN (base_t) = TYPE_USER_ALIGN (t);
>TYPE_TYPELESS_STORAGE (base_t) = TYPE_TYPELESS_STORAGE (t);
> +  TYPE_CXX_ODR_P (base_t) = true;

 = TYPE_CXX_ODR_P (t);

would be much more obvious here.

>/* Copy the non-static data members of T. This will include its
>direct non-virtual bases & vtable.  */
> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c  (revision 272656)
> +++ ipa-devirt.c  (working copy)
> @@ -213,6 +213,8 @@ struct GTY(()) odr_type_d
>bool odr_violated;
>/* Set when virtual table without RTTI previaled table with.  */
>bool rtti_broken;
> +  /* Set when the canonical type is determined using the type name.  */
> +  bool tbaa_enabled;
>  };
>  
>  /* Return TRUE if all derived types of T are known and thus
> @@ -1610,6 +1612,9 @@ add_type_duplicate (odr_type val, tree t
>  
>val->types_set->add (type);
>  
> +  if (!odr_hash)
> +return NULL;
> +
>gcc_checking_assert (can_be_name_hashed_p (type)
>  && can_be_name_hashed_p (val->type));
>  
> @@ -1989,6 +1994,46 @@ prevailing_odr_type (tree type)
>return t->type;
>  }
>  
> +/* Set tbaa_enabled flag for TYPE.  */
> +
> +void
> +enable_odr_based_tbaa (tree type)
> +{
> +  odr_type t = get_odr_type (type, true);
> +  t->tbaa_enabled = true;
> +}
> +
> +/* True if canonical type of TYPE is determined using ODR name.  */
> +
> +bool
> +odr_based_tbaa_p (const_tree type)
> +{
> +  if (!RECORD_OR_UNION_TYPE_P (type))
> +return false;
> +  odr_type t = get_odr_type (const_cast  (type), false);
> +  if (!t || !t->tbaa_enabled)
> +return false;
> +  return true;
> +}
> +
> +/* Set TYPE_CANONICAL of type and all its variants and duplicates
> +   to CANONICAL.  */
> +
> +void
> +set_type_canonical_for_odr_type (tree type, tree canonical)
> +{
> +  odr_type t = get_odr_type (type, false);
> +  unsigned int i;
> +  tree tt;
> +
> +  for (tree t2 = t->type; t2; t2 = TYPE_NEXT_VARIANT (t2))
> +TYPE_CANONICAL (t2) = canonical;
> +  if (t->types)
> +FOR_EACH_VEC_ELT (*t->types, i, tt)
> +  for (tree t2 = tt; t2; t2 = TYPE_NEXT_VARIANT (t2))
> +TYPE_CANONICAL (t2) = canonical;
> +}
> +
>  /* Return true if we reported some ODR violation on TYPE.  */

Re: [PATCH] x86: fix/improve vgf2p8affine*qb insns

2019-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2019 at 10:57 AM Jan Beulich  wrote:
>
> - the affine transformations are not commutative (the two source
>   operands have entirely different meaning)
> - there's no need for three alternatives
> - the nonimmediate_operand/Bm combination can better be vector_operand/m
>
> gcc/
> 2019-06-27  Jan Beulich  
>
> * config/i386/sse.md (vgf2p8affineinvqb_,
> vgf2p8affineqb_): Drop % constraint modifier.
> Eliminate redundant alternative.  Use vector_operand plus "m"
> constraint.

Please just drop % modifier and use vector_operand here. IIRC,
register allocator operates on constraints, it doesn't care for
predicates. But predicates shouldn't be more constrained than
constraints. So, having "m" instead of "Bm" is a bad idea with
vector_operand.

Uros.
>
> gcc/testsuite/
> 2019-06-27  Jan Beulich  
>
> * gcc.target/i386/gfni-5.c: New.
>
> ---
> On top of this (in further patches) it looks like the Bm constraint could be
> dropped altogether. At the very least its combination with vector_operand is
> pointless, because both resolve to the same vector_memory_operand. Same goes
> for round{,_saeonly}_nimm_predicate, which too resolves to vector_operand.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -22072,56 +22072,53 @@
>"vpopcnt\t{%1, %0|%0, %1}")
>
>  (define_insn "vgf2p8affineinvqb_"
> -  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
> +  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
> (unspec:VI1_AVX512F
> - [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
> -  (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")
> -  (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
> + [(match_operand:VI1_AVX512F 1 "register_operand" "0,v")
> +  (match_operand:VI1_AVX512F 2 "vector_operand" "xm,vm")
> +  (match_operand:QI 3 "const_0_to_255_operand" "n,n")]
>   UNSPEC_GF2P8AFFINEINV))]
>"TARGET_GFNI"
>"@
> gf2p8affineinvqb\t{%3, %2, %0| %0, %2, %3}
> -   vgf2p8affineinvqb\t{%3, %2, %1, %0| %0, %1, 
> %2, %3}
> vgf2p8affineinvqb\t{%3, %2, %1, %0| %0, %1, 
> %2, %3}"
> -  [(set_attr "isa" "noavx,avx,avx512f")
> -   (set_attr "prefix_data16" "1,*,*")
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "prefix_data16" "1,*")
> (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,maybe_evex,evex")
> +   (set_attr "prefix" "orig,maybe_evex")
> (set_attr "mode" "")])
>
>  (define_insn "vgf2p8affineqb_"
> -  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
> +  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
> (unspec:VI1_AVX512F
> - [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
> -  (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")
> -  (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
> + [(match_operand:VI1_AVX512F 1 "register_operand" "0,v")
> +  (match_operand:VI1_AVX512F 2 "vector_operand" "xm,vm")
> +  (match_operand:QI 3 "const_0_to_255_operand" "n,n")]
>   UNSPEC_GF2P8AFFINE))]
>"TARGET_GFNI"
>"@
> gf2p8affineqb\t{%3, %2, %0| %0, %2, %3}
> -   vgf2p8affineqb\t{%3, %2, %1, %0| %0, %1, 
> %2, %3}
> vgf2p8affineqb\t{%3, %2, %1, %0| %0, %1, 
> %2, %3}"
> -  [(set_attr "isa" "noavx,avx,avx512f")
> -   (set_attr "prefix_data16" "1,*,*")
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "prefix_data16" "1,*")
> (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,maybe_evex,evex")
> +   (set_attr "prefix" "orig,maybe_evex")
> (set_attr "mode" "")])
>
>  (define_insn "vgf2p8mulb_"
> -  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
> +  [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,v")
> (unspec:VI1_AVX512F
> - [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
> -  (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")]
> + [(match_operand:VI1_AVX512F 1 "register_operand" "%0,v")
> +  (match_operand:VI1_AVX512F 2 "vector_operand" "xm,vm")]
>   UNSPEC_GF2P8MUL))]
>"TARGET_GFNI"
>"@
> gf2p8mulb\t{%2, %0| %0, %2}
> -   vgf2p8mulb\t{%2, %1, %0| %0, %1, %2}
> vgf2p8mulb\t{%2, %1, %0| %0, %1, %2}"
> -  [(set_attr "isa" "noavx,avx,avx512f")
> -   (set_attr "prefix_data16" "1,*,*")
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "prefix_data16" "1,*")
> (set_attr "prefix_extra" "1")
> -   (set_attr "prefix" "orig,maybe_evex,evex")
> +   (set_attr "prefix" "orig,maybe_evex")
> (set_attr "mode" "")])
>
>  (define_insn "vpshrd_"
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/gfni-5.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2 -mgfni" } */
> +
> +typedef char __attribute__((vector_size(16))) v16qi_t;
> +
> +v16qi_t test16a (v16qi_t x, v16qi_t a)
> +{
> +  asm volatile ("" : "+m" (a));
> +  

Re: [RFA][tree-optimization/90883] Improve DSE to handle redundant calls

2019-06-27 Thread Christophe Lyon
On Thu, 27 Jun 2019 at 04:23, Jeff Law  wrote:
>
> On 6/26/19 7:14 PM, Bill Schmidt wrote:
> > Looks like this patch breaks bootstrap.
> >
> > /home3/wschmidt/gcc/gcc-mainline-base/gcc/tree-ssa-dse.c: In function
> > 'void dse\
> > _optimize_redundant_stores(gimple*)':
> > /home3/wschmidt/gcc/gcc-mainline-base/gcc/tree-ssa-dse.c:649:46: error:
> > ISO C++\
> >  forbids converting a string constant to 'char*' [-Werror=write-strings]
> >   649 |   delete_dead_or_redundant_assignment (, "redundant");
> >   |  ^~~
> > /home3/wschmidt/gcc/gcc-mainline-base/gcc/tree-ssa-dse.c:651:40: error:
> > ISO C++\
> >  forbids converting a string constant to 'char*' [-Werror=write-strings]
> >   651 |   delete_dead_or_redundant_call (, "redundant");
> >   |^~~
> > /home3/wschmidt/gcc/gcc-mainline-base/gcc/tree-ssa-dse.c: In member
> > function 'v\
> > oid dse_dom_walker::dse_optimize_stmt(gimple_stmt_iterator*)':
> > /home3/wschmidt/gcc/gcc-mainline-base/gcc/tree-ssa-dse.c:979:41: error:
> > ISO C++\
> >  forbids converting a string constant to 'char*' [-Werror=write-strings]
> >   979 | delete_dead_or_redundant_call (gsi, "dead");
> >   | ^~
> > /home3/wschmidt/gcc/gcc-mainline-base/gcc/tree-ssa-dse.c:1007:39: error:
> > ISO C+\
> > + forbids converting a string constant to 'char*' [-Werror=write-strings]
> >  1007 |   delete_dead_or_redundant_call (gsi, "dead");
> >   |   ^~
> > /home3/wschmidt/gcc/gcc-mainline-base/gcc/tree-ssa-dse.c:1066:49: error:
> > ISO C+\
> > + forbids converting a string constant to 'char*' [-Werror=write-strings]
> >  1066 |   delete_dead_or_redundant_assignment (gsi, "dead");
> Strange as I know I've bootstrapped and tested.  I'll take care of it
>
> Thanks,
> jeff


Hi Jeff,

I've also noticed that
FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted
redundant store: .*.a = {}"
on aarch64

Christophe


Re: [PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread Jan Beulich
>>> On 27.06.19 at 12:11,  wrote:
> Without these constraints asm() can't make use of mask registers.

Similarly it is entirely unclear to me how to use e.g. v4fmaddps or
vp2intersectd in asm(): For the former the respective "Yh"
constraint was dropped (oddly enough leaving its comment line in
place), and there's nothing I can spot for a mask register pair.

Using (in an attempt to find alternative options)

typedef float __attribute__((vector_size(256))) v64sf_t;

results in "impossible constraint" when using an operand of this
type in asm(), while

typedef unsigned __attribute__((mode(P2HI))) kpair_t;

gives "no data type for mode" even without any actual use.

Jan




[PATCH] x86: mark "k" and "Yk" constraints as non-internal

2019-06-27 Thread Jan Beulich
Without these constraints asm() can't make use of mask registers.

gcc/
2019-06-27  Jan Beulich  

* config/i386/constraints.md: Remove @internal from "k" and
"Yk".

--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -79,10 +79,10 @@
  "Second from top of 80387 floating-point stack (@code{%st(1)}).")
 
 (define_register_constraint "Yk" "TARGET_AVX512F ? MASK_REGS : NO_REGS"
-"@internal Any mask register that can be used as predicate, i.e. k1-k7.")
+"Any mask register that can be used as predicate, i.e. k1-k7.")
 
 (define_register_constraint "k" "TARGET_AVX512F ? ALL_MASK_REGS : NO_REGS"
-"@internal Any mask register.")
+"Any mask register.")
 
 ;; Vector registers (also used for plain floating point nowadays).
 (define_register_constraint "y" "TARGET_MMX ? MMX_REGS : NO_REGS"




Re: [PATCH] Improve avx_vec_concat (PR target/90991)

2019-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2019 at 8:17 AM Jakub Jelinek  wrote:
>
> Hi!
>
> In the last two alternatives of avx_vec_concat, we can allow memory
> source, which optimizes the following testcases from weird
> vmovaps (%rdi), %xmm0
> vmovaps %xmm0, %xmm0
> and similar to just the first instruction.  I went through all the
> gen_avx_vec_concat* users and all of them ensure the middle operand is a
> register by force_reg or constraint, rather than by testing the predicate
> of the instruction, so I believe the additional condition on the instruction
> is fine and we don't need some expander to fix up the case when middle
> operand is a MEM and last operand is not zero.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2019-06-27  Jakub Jelinek  
>
> PR target/90991
> * config/i386/sse.md (avx_vec_concat): Use nonimmediate_operand
> instead of register_operand for operands[1], add m to its constraints
> if operands[2] uses "C" constraint.  Ensure in condition that if
> operands[2] is not 0, then operands[1] is not a MEM.  For last two
> alternatives, use unaligned loads instead of aligned if operands[1] is
> misaligned_operand.
>
> * gcc.target/i386/avx2-pr90991-1.c: New test.
> * gcc.target/i386/avx512dq-pr90991-2.c: New test.

OK with a nit below.

Thanks,
Uros.

>
> --- gcc/config/i386/sse.md.jj   2019-06-26 09:22:40.506567515 +0200
> +++ gcc/config/i386/sse.md  2019-06-26 09:59:15.271571330 +0200
> @@ -20743,9 +20743,10 @@ (define_insn "_
>  (define_insn "avx_vec_concat"
>[(set (match_operand:V_256_512 0 "register_operand" "=x,v,x,Yv")
> (vec_concat:V_256_512
> - (match_operand: 1 "register_operand" "x,v,x,v")
> + (match_operand: 1 "nonimmediate_operand" 
> "x,v,xm,vm")
>   (match_operand: 2 "nonimm_or_0_operand" 
> "xm,vm,C,C")))]
> -  "TARGET_AVX"
> +  "TARGET_AVX && (operands[2] == CONST0_RTX (mode)
> + || !MEM_P (operands[1]))"

Please put "&& (operands[2] ..." in a separate line.

>  {
>switch (which_alternative)
>  {
> @@ -20771,27 +20772,63 @@ (define_insn "avx_vec_concat"
>switch (get_attr_mode (insn))
> {
> case MODE_V16SF:
> - return "vmovaps\t{%1, %t0|%t0, %1}";
> + if (misaligned_operand (operands[1], mode))
> +   return "vmovups\t{%1, %t0|%t0, %1}";
> + else
> +   return "vmovaps\t{%1, %t0|%t0, %1}";
> case MODE_V8DF:
> - return "vmovapd\t{%1, %t0|%t0, %1}";
> + if (misaligned_operand (operands[1], mode))
> +   return "vmovupd\t{%1, %t0|%t0, %1}";
> + else
> +   return "vmovapd\t{%1, %t0|%t0, %1}";
> case MODE_V8SF:
> - return "vmovaps\t{%1, %x0|%x0, %1}";
> + if (misaligned_operand (operands[1], mode))
> +   return "vmovups\t{%1, %x0|%x0, %1}";
> + else
> +   return "vmovaps\t{%1, %x0|%x0, %1}";
> case MODE_V4DF:
> - return "vmovapd\t{%1, %x0|%x0, %1}";
> + if (misaligned_operand (operands[1], mode))
> +   return "vmovupd\t{%1, %x0|%x0, %1}";
> + else
> +   return "vmovapd\t{%1, %x0|%x0, %1}";
> case MODE_XI:
> - if (which_alternative == 2)
> -   return "vmovdqa\t{%1, %t0|%t0, %1}";
> - else if (GET_MODE_SIZE (mode) == 8)
> -   return "vmovdqa64\t{%1, %t0|%t0, %1}";
> + if (misaligned_operand (operands[1], mode))
> +   {
> + if (which_alternative == 2)
> +   return "vmovdqu\t{%1, %t0|%t0, %1}";
> + else if (GET_MODE_SIZE (mode) == 8)
> +   return "vmovdqu64\t{%1, %t0|%t0, %1}";
> + else
> +   return "vmovdqu32\t{%1, %t0|%t0, %1}";
> +   }
>   else
> -   return "vmovdqa32\t{%1, %t0|%t0, %1}";
> +   {
> + if (which_alternative == 2)
> +   return "vmovdqa\t{%1, %t0|%t0, %1}";
> + else if (GET_MODE_SIZE (mode) == 8)
> +   return "vmovdqa64\t{%1, %t0|%t0, %1}";
> + else
> +   return "vmovdqa32\t{%1, %t0|%t0, %1}";
> +   }
> case MODE_OI:
> - if (which_alternative == 2)
> -   return "vmovdqa\t{%1, %x0|%x0, %1}";
> - else if (GET_MODE_SIZE (mode) == 8)
> -   return "vmovdqa64\t{%1, %x0|%x0, %1}";
> + if (misaligned_operand (operands[1], mode))
> +   {
> + if (which_alternative == 2)
> +   return "vmovdqu\t{%1, %x0|%x0, %1}";
> + else if (GET_MODE_SIZE (mode) == 8)
> +   return "vmovdqu64\t{%1, %x0|%x0, %1}";
> + else
> +   return "vmovdqu32\t{%1, %x0|%x0, %1}";
> +   }
>   else
> -   return "vmovdqa32\t{%1, %x0|%x0, %1}";
> +   {
> + if (which_alternative == 2)
> +   return "vmovdqa\t{%1, %x0|%x0, %1}";
> + 

Re: [PATCH] Remove quite obvious dead assignments.

2019-06-27 Thread Richard Sandiford
Martin Liška  writes:
> On 6/26/19 1:39 PM, Jakub Jelinek wrote:
>> On Wed, Jun 26, 2019 at 12:57:15PM +0200, Martin Liška wrote:
>>> --- a/gcc/tree-vect-stmts.c
>>> +++ b/gcc/tree-vect-stmts.c
>>> @@ -3483,8 +3483,7 @@ vectorizable_call (stmt_vec_info stmt_info, 
>>> gimple_stmt_iterator *gsi,
>>> = gimple_build_call_internal_vec (ifn, vargs);
>>>   gimple_call_set_lhs (call, half_res);
>>>   gimple_call_set_nothrow (call, true);
>>> - new_stmt_info
>>> -   = vect_finish_stmt_generation (stmt_info, call, gsi);
>>> + vect_finish_stmt_generation (stmt_info, call, gsi);
>>>   if ((i & 1) == 0)
>>> {
>>>   prev_res = half_res;
>>> @@ -3583,8 +3582,7 @@ vectorizable_call (stmt_vec_info stmt_info, 
>>> gimple_stmt_iterator *gsi,
>>>   gcall *call = gimple_build_call_internal_vec (ifn, vargs);
>>>   gimple_call_set_lhs (call, half_res);
>>>   gimple_call_set_nothrow (call, true);
>>> - new_stmt_info
>>> -   = vect_finish_stmt_generation (stmt_info, call, gsi);
>>> + vect_finish_stmt_generation (stmt_info, call, gsi);
>>>   if ((j & 1) == 0)
>>> {
>>>   prev_res = half_res;
>> 
>> This looks wrong.
>> There should have been
>> SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt_info);
>> in between for slp_node, or the usual code like:
>>   if (cond_for_first)
>> STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt_info;
>>   else
>> STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt_info;
>>prev_stmt_info = new_stmt_info;
>> otherwise.  In any case, I think this should be dealt with separately.
>
> Likewise here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91017

I think this part of the patch is OK.  SLP_TREE_VEC_STMTS and
STMT_VINFO_VEC_STMT should only receive the final stmt for each vector
result, whereas these stmts are producing intermediate results.

The assignments were introduced as part of a mechanical patch to avoid
using vinfo_for_stmt to get stmt_infos that we'd just created.  I'd
missed that in these cases we didn't actually use vinfo_for_stmt.

Off-hand, I'm not sure whether we actually need to create the
stmt_info here as things stand, or whether we're really just using
vect_finish_stmt_generation to insert the stmt in the right way.
I did have some WIP patches that would need the stmt_info to be
created though.

Thanks,
Richard


Re: Use ODR for canonical types construction in LTO

2019-06-27 Thread Jan Hubicka
Hi,
this is updated patch for ODR based canonical type calculation.  It makes use
of TYPE_CXX_ODR_P instead of doing the guesswork and while thinking how to get
rid of the quadratic behaviour of the hash I noticed that all the logic fits
quite naturally into gimple_register_canonical_type_1.

We only need to arrange that all non-ODR types gets to hash before we has
TYPE_CXX_ODR_P and since hash functions recursively populate the type hash
we know that subtypes with linkage are processed before types.

>From the WPA stats we now get relatively few non-ODR types building cc1plus
[WPA] Compared 1062055 SCCs, 890518 collisions (0.838486)
[WPA] Merged 1059104 SCCs
[WPA] Merged 6025225 tree bodies
[WPA] Merged 639655 types
[WPA] 101515 types prevailed (177264 associated trees)
[WPA] GIMPLE canonical type table: size 16381, 262 elements, 5685 searches, 48 
collisions (ratio: 0.008443)
[WPA] GIMPLE canonical type pointer-map: 3548 elements, 14261 searches

So 262 distinct types compared to 1590 which is an effect of Jason's patch.
There are 3286 ODR types having type canonical set by name and 910 have
conflict to non-ODR type.

I am still waiting for the statistics of alias oracle, but for tramp3d there
are no significant changes compared to the first patch.

lto-bootstrapped/regtested x86_64-linux, OK?

* class.c (layout_class_type): Set TYPE_CXX_ODR_P for as-base
type copy.

* ipa-devirt.c (odr_type_d): Add tbaa_enabled flag.
(add_type_duplicate): When odr hash is not allocated, to nothing.
(odr_based_tbaa_p): New function.
(set_type_canonical_for_odr_type): New function.
* ipa-utils.h (enable_odr_based_tbaa, odr_based_tbaa_p,
set_type_canonical_for_odr_type): New.
* lto-common.c: Include demangle.h and tree-pretty-print.h
(type_streaming_finished): New static var.
(gimple_register_canonical_type_1): Return updated hash; handle ODR
types.
(iterative_hash_canonical_type): Update use of
gimple_register_canonical_type_1.
* tree.c (gimple_canonical_types_compatible_p): ODR types with
ODR based TBAA are not equivalent to non-ODR types.

* g++.dg/lto/alias-2_0.C: New testcase.
* g++.dg/lto/alias-2_1.C: New testcase.
Index: cp/class.c
===
--- cp/class.c  (revision 272656)
+++ cp/class.c  (working copy)
@@ -6395,6 +6395,7 @@ layout_class_type (tree t, tree *virtual
   SET_TYPE_ALIGN (base_t, rli->record_align);
   TYPE_USER_ALIGN (base_t) = TYPE_USER_ALIGN (t);
   TYPE_TYPELESS_STORAGE (base_t) = TYPE_TYPELESS_STORAGE (t);
+  TYPE_CXX_ODR_P (base_t) = true;
 
   /* Copy the non-static data members of T. This will include its
 direct non-virtual bases & vtable.  */
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 272656)
+++ ipa-devirt.c(working copy)
@@ -213,6 +213,8 @@ struct GTY(()) odr_type_d
   bool odr_violated;
   /* Set when virtual table without RTTI previaled table with.  */
   bool rtti_broken;
+  /* Set when the canonical type is determined using the type name.  */
+  bool tbaa_enabled;
 };
 
 /* Return TRUE if all derived types of T are known and thus
@@ -1610,6 +1612,9 @@ add_type_duplicate (odr_type val, tree t
 
   val->types_set->add (type);
 
+  if (!odr_hash)
+return NULL;
+
   gcc_checking_assert (can_be_name_hashed_p (type)
   && can_be_name_hashed_p (val->type));
 
@@ -1989,6 +1994,46 @@ prevailing_odr_type (tree type)
   return t->type;
 }
 
+/* Set tbaa_enabled flag for TYPE.  */
+
+void
+enable_odr_based_tbaa (tree type)
+{
+  odr_type t = get_odr_type (type, true);
+  t->tbaa_enabled = true;
+}
+
+/* True if canonical type of TYPE is determined using ODR name.  */
+
+bool
+odr_based_tbaa_p (const_tree type)
+{
+  if (!RECORD_OR_UNION_TYPE_P (type))
+return false;
+  odr_type t = get_odr_type (const_cast  (type), false);
+  if (!t || !t->tbaa_enabled)
+return false;
+  return true;
+}
+
+/* Set TYPE_CANONICAL of type and all its variants and duplicates
+   to CANONICAL.  */
+
+void
+set_type_canonical_for_odr_type (tree type, tree canonical)
+{
+  odr_type t = get_odr_type (type, false);
+  unsigned int i;
+  tree tt;
+
+  for (tree t2 = t->type; t2; t2 = TYPE_NEXT_VARIANT (t2))
+TYPE_CANONICAL (t2) = canonical;
+  if (t->types)
+FOR_EACH_VEC_ELT (*t->types, i, tt)
+  for (tree t2 = tt; t2; t2 = TYPE_NEXT_VARIANT (t2))
+TYPE_CANONICAL (t2) = canonical;
+}
+
 /* Return true if we reported some ODR violation on TYPE.  */
 
 bool
Index: ipa-utils.h
===
--- ipa-utils.h (revision 272656)
+++ ipa-utils.h (working copy)
@@ -93,6 +93,9 @@ bool odr_or_derived_type_p (const_tree t
 bool odr_types_equivalent_p (tree type1, tree type2);
 bool odr_type_violation_reported_p (tree 

[PATCH] PR libstdc++/91012 fixfilesystem_error::what() string

2019-06-27 Thread Jonathan Wakely

When I refactored the filesystem_error code I changed it to only use the
constructor parameter in the what() string, instead of the string
returned by system_error::what(). That meant it no longer included the
description of the error_code that system_error adds. This restores the
previous behaivour, as encouraged by the standard ("Implementations
should include the system_error::what() string and the pathnames of
path1 and path2 in the native format in the returned string").

PR libstdc++/91012
* src/c++17/fs_path.cc (filesystem_error::_Impl): Use a string_view
for the what_arg parameters.
(filesystem_error::filesystem_error): Pass system_error::what() to
the _Impl constructor.
* testsuite/27_io/filesystem/filesystem_error/cons.cc: Ensure that
filesystem_error::what() contains system_error::what().

Tested x86_64-linux, committed to trunk. I'll backport to gcc-9-branch
too.

commit 026d1259cc4b761c4ab638ab2bcd251f880cbd32
Author: redi 
Date:   Thu Jun 27 09:42:39 2019 +

PR libstdc++/91012 fixfilesystem_error::what() string

When I refactored the filesystem_error code I changed it to only use the
constructor parameter in the what() string, instead of the string
returned by system_error::what(). That meant it no longer included the
description of the error_code that system_error adds. This restores the
previous behaivour, as encouraged by the standard ("Implementations
should include the system_error::what() string and the pathnames of
path1 and path2 in the native format in the returned string").

PR libstdc++/91012
* src/c++17/fs_path.cc (filesystem_error::_Impl): Use a string_view
for the what_arg parameters.
(filesystem_error::filesystem_error): Pass system_error::what() to
the _Impl constructor.
* testsuite/27_io/filesystem/filesystem_error/cons.cc: Ensure that
filesystem_error::what() contains system_error::what().

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272739 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/src/c++17/fs_path.cc 
b/libstdc++-v3/src/c++17/fs_path.cc
index 82ac736f82a..14842452354 100644
--- a/libstdc++-v3/src/c++17/fs_path.cc
+++ b/libstdc++-v3/src/c++17/fs_path.cc
@@ -1928,20 +1928,20 @@ fs::hash_value(const path& p) noexcept
 
 struct fs::filesystem_error::_Impl
 {
-  _Impl(const string& what_arg, const path& p1, const path& p2)
+  _Impl(string_view what_arg, const path& p1, const path& p2)
   : path1(p1), path2(p2), what(make_what(what_arg, , ))
   { }
 
-  _Impl(const string& what_arg, const path& p1)
+  _Impl(string_view what_arg, const path& p1)
   : path1(p1), path2(), what(make_what(what_arg, , nullptr))
   { }
 
-  _Impl(const string& what_arg)
+  _Impl(string_view what_arg)
   : what(make_what(what_arg, nullptr, nullptr))
   { }
 
   static std::string
-  make_what(const std::string& s, const path* p1, const path* p2)
+  make_what(string_view s, const path* p1, const path* p2)
   {
 const std::string pstr1 = p1 ? p1->u8string() : std::string{};
 const std::string pstr2 = p2 ? p2->u8string() : std::string{};
@@ -1977,20 +1977,20 @@ template class std::__shared_ptr;
 fs::filesystem_error::
 filesystem_error(const string& what_arg, error_code ec)
 : system_error(ec, what_arg),
-  _M_impl(std::__make_shared<_Impl>(what_arg))
+  _M_impl(std::__make_shared<_Impl>(system_error::what()))
 { }
 
 fs::filesystem_error::
 filesystem_error(const string& what_arg, const path& p1, error_code ec)
 : system_error(ec, what_arg),
-  _M_impl(std::__make_shared<_Impl>(what_arg, p1))
+  _M_impl(std::__make_shared<_Impl>(system_error::what(), p1))
 { }
 
 fs::filesystem_error::
 filesystem_error(const string& what_arg, const path& p1, const path& p2,
 error_code ec)
 : system_error(ec, what_arg),
-  _M_impl(std::__make_shared<_Impl>(what_arg, p1, p2))
+  _M_impl(std::__make_shared<_Impl>(system_error::what(), p1, p2))
 { }
 
 fs::filesystem_error::~filesystem_error() = default;
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/filesystem_error/cons.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/filesystem_error/cons.cc
index 8b24541cbc6..1644d355339 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/filesystem_error/cons.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/filesystem_error/cons.cc
@@ -36,9 +36,10 @@ test01()
   const std::error_code ec = make_error_code(std::errc::is_a_directory);
   const std::filesystem::path p1 = "test/path/one";
   const std::filesystem::path p2 = "/test/path/two";
+  std::system_error syserr(ec, str);
 
   const filesystem_error e1(str, ec);
-  VERIFY( contains(e1.what(), str) );
+  VERIFY( contains(e1.what(), syserr.what()) );
   VERIFY( !contains(e1.what(), "[]") ); // no "empty path" in the string
   VERIFY( e1.path1().empty() );
   VERIFY( e1.path2().empty() );
@@ -47,7 +48,7 @@ test01()
   const filesystem_error 

Re: [PATCH] Fix ICE when __builtin_calloc has no LHS (PR tree-optimization/91014).

2019-06-27 Thread Richard Biener
On Thu, Jun 27, 2019 at 11:21 AM Martin Liška  wrote:
>
> Hi.
>
> This is quite an obvious changes I've noticed during fuzzing
> of s390x target compiler.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK.

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2019-06-27  Martin Liska  
>
> PR tree-optimization/91014
> * tree-ssa-dse.c (initialize_ao_ref_for_dse): Bail out
> when LHS is NULL_TREE.
>
> gcc/testsuite/ChangeLog:
>
> 2019-06-27  Martin Liska  
>
> PR tree-optimization/91014
> * gcc.target/s390/pr91014.c: New test.
> ---
>  gcc/testsuite/gcc.target/s390/pr91014.c | 8 
>  gcc/tree-ssa-dse.c  | 5 +++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/s390/pr91014.c
>
>


Re: introduce EH_ELSE tree and gimplifier

2019-06-27 Thread Richard Biener
On Thu, Jun 27, 2019 at 10:18 AM Alexandre Oliva  wrote:
>
> I found GIMPLE_EH_ELSE offered exactly the semantics I needed for some
> Ada changes yet to be contributed, but GIMPLE_EH_ELSE was only built
> by GIMPLE passes, and I needed to build earlier something that
> eventually became GIMPLE_EH_ELSE.
>
> This patch does that, introducing an EH_ELSE tree, and logic to dump
> it and to gimplify it.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?
>
>
> for  gcc/ChangeLog
>
> * doc/generic.texi (Cleanups): Document EH_ELSE.
> * except.c: Likewise.
> * expr.c (expand_expr_real_1): Reject it.
> * gimplify.c (gimplify_expr): Gimplify it, within
> TRY_FINALLY_EXPR.
> * tree-dump.c (dequeue_and_dump): Dump it.
> * tree-pretty-print.c (dump_generic_node): Likewise.
> * tree.c (block_may_fallthru): Handle it.
> * tree.def (EH_ELSE): Introduce.
> ---
>  gcc/doc/generic.texi|5 +
>  gcc/except.c|   12 ++--
>  gcc/expr.c  |1 +
>  gcc/gimplify.c  |   18 +-
>  gcc/tree-dump.c |1 +
>  gcc/tree-pretty-print.c |   11 +++
>  gcc/tree.c  |3 +++
>  gcc/tree.def|7 +++
>  8 files changed, 51 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
> index 67f7ad53af6b..1d0e3cfec1d6 100644
> --- a/gcc/doc/generic.texi
> +++ b/gcc/doc/generic.texi
> @@ -2180,6 +2180,11 @@ After the second sequence is executed, if it completes 
> normally by
>  falling off the end, execution continues wherever the first sequence
>  would have continued, by falling off the end, or doing a goto, etc.
>
> +If the second sequence is an @code{EH_ELSE} selector, then the sequence
> +in its first operand is used when the first sequence completes normally,
> +and that in its second operand is used for exceptional cleanups, i.e.,
> +when an exception propagates out of the first sequence.
> +
>  @code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup
>  needs to appear on every edge out of the controlled block; this
>  reduces the freedom to move code across these edges.  Therefore, the
> diff --git a/gcc/except.c b/gcc/except.c
> index edaeeb4cfd1b..76cd099749f3 100644
> --- a/gcc/except.c
> +++ b/gcc/except.c
> @@ -27,14 +27,14 @@ along with GCC; see the file COPYING3.  If not see
> the compilation process:
>
> In the beginning, in the front end, we have the GENERIC trees
> -   TRY_CATCH_EXPR, TRY_FINALLY_EXPR, WITH_CLEANUP_EXPR,
> +   TRY_CATCH_EXPR, TRY_FINALLY_EXPR, EH_ELSE, WITH_CLEANUP_EXPR,
> CLEANUP_POINT_EXPR, CATCH_EXPR, and EH_FILTER_EXPR.
>
> -   During initial gimplification (gimplify.c) these are lowered
> -   to the GIMPLE_TRY, GIMPLE_CATCH, and GIMPLE_EH_FILTER nodes.
> -   The WITH_CLEANUP_EXPR and CLEANUP_POINT_EXPR nodes are converted
> -   into GIMPLE_TRY_FINALLY nodes; the others are a more direct 1-1
> -   conversion.
> +   During initial gimplification (gimplify.c) these are lowered to the
> +   GIMPLE_TRY, GIMPLE_CATCH, GIMPLE_EH_ELSE, and GIMPLE_EH_FILTER
> +   nodes.  The WITH_CLEANUP_EXPR and CLEANUP_POINT_EXPR nodes are
> +   converted into GIMPLE_TRY_FINALLY nodes; the others are a more
> +   direct 1-1 conversion.
>
> During pass_lower_eh (tree-eh.c) we record the nested structure
> of the TRY nodes in EH_REGION nodes in CFUN->EH->REGION_TREE.
> diff --git a/gcc/expr.c b/gcc/expr.c
> index c78bc74c0d9f..70fb721a9621 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -11292,6 +11292,7 @@ expand_expr_real_1 (tree exp, rtx target, 
> machine_mode tmode,
>  case CATCH_EXPR:
>  case EH_FILTER_EXPR:
>  case TRY_FINALLY_EXPR:
> +case EH_ELSE:
>/* Lowered by tree-eh.c.  */
>gcc_unreachable ();
>
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 0b25e7100cde..de4cb66e2b65 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -13074,7 +13074,22 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
> gimple_seq *post_p,
> input_location = UNKNOWN_LOCATION;
> eval = cleanup = NULL;
> gimplify_and_add (TREE_OPERAND (*expr_p, 0), );
> -   gimplify_and_add (TREE_OPERAND (*expr_p, 1), );
> +   if (TREE_CODE (*expr_p) == TRY_FINALLY_EXPR
> +   && TREE_CODE (TREE_OPERAND (*expr_p, 1)) == EH_ELSE)
> + {
> +   gimple_seq n = NULL, e = NULL;
> +   gimplify_and_add (TREE_OPERAND (TREE_OPERAND (*expr_p, 1),
> +   0), );
> +   gimplify_and_add (TREE_OPERAND (TREE_OPERAND (*expr_p, 1),
> +   1), );
> +   if (!gimple_seq_empty_p (n) && !gimple_seq_empty_p (e))
> + {
> +   geh_else *stmt = gimple_build_eh_else (n, e);
> +   gimple_seq_add_stmt (, stmt);
> + }
> + }

Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-27 Thread Rainer Orth
Hi Hongtao,

> On Thu, Jun 27, 2019 at 5:02 PM Rainer Orth  
> wrote:
>>
>> Hi Hongtao,
>>
>> >> as usual, the new effective-target keyword needs documenting in
>> >> sourcebuild.texi.
>> > Like this?
>>
>> a couple of nits: first of all, your mailer seems to replace tabs by a
>> single space.  Please fix this or attach the patch instead.
>>
>> > Index: ChangeLog
>> > ===
>> > --- ChangeLog (revision 272668)
>> > +++ ChangeLog (working copy)
>> > @@ -1,3 +1,8 @@
>> > +2019-06-27  Hongtao Liu  
>> > +
>> > + * doc/sourcebuild.texi: Document new effective target keyword
>> > + avx512vp2intersect.
>>
>> Please include the sections you're modifying, something like
>>
>> * doc/sourcebuild.texi (Effective-Target Keywords, Other
>> hardware attributes): Document avx512vp2intersect.
>>
>> And please don't include the ChangeLog in the patch, but include it in
>> the mail proper: it won't apply due to date and context changes anyway.
>> Best review https://gcc.gnu.org/contribute.html where this is documented
>> besides other points of patch submission.
>>
>> Besides, it's most likely useful to also review the GNU Coding
>> Standards, too, not only for ChangeLog formatting.
>>
>> > Index: testsuite/ChangeLog
>> > ===
>> > --- testsuite/ChangeLog (revision 272668)
>> > +++ testsuite/ChangeLog (working copy)
>> > @@ -1,3 +1,11 @@
>> > +2019-06-27  Hongtao Liu  
>> > +
>> > + * lib/target-supports.exp: Add
>> > + check_effective_target_avx512vp2intersect.
>>
>> Use
>>
>> * lib/target-supports.exp
>> (check_effective_target_avx512vp2intersect): New proc.
>>
>> > + * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Add
>> > + dg-require-effective-target avx512vp2intersect.
>>
>> Better:
>>
>> * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Require
>> avx512vp2intersect.
>>
>> but that's a matter of preference.
>>
>> > Index: testsuite/gcc.target/i386/avx512vp2intersect-2intersect-1b.c
>> > ===
>> > --- testsuite/gcc.target/i386/avx512vp2intersect-2intersect-1b.c
>> > (revision 272668)
>> > +++ testsuite/gcc.target/i386/avx512vp2intersect-2intersect-1b.c
>> > (working copy)
>> > @@ -1,5 +1,6 @@
>> >  /* { dg-do run } */
>> >  /* { dg-options "-O2 -mavx512vp2intersect" } */
>> > +/* { dg-require-effective-target "avx512vp2intersect" } */
>>
>> No need to quote avx512vp2intersect here and in the next test.
>>
>> Ok with those nits fixed.
>>
> Yes, thanks a lot.
>> Thanks.
>> Rainer
>>
>> --
>> -
>> Rainer Orth, Center for Biotechnology, Bielefeld University
>
> Ok for trunk?

ENOPATCH

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: allow EH to escape from GIMPLE_EH_ELSE ELSE block

2019-06-27 Thread Richard Biener
On Thu, Jun 27, 2019 at 10:18 AM Alexandre Oliva  wrote:
>
> The only preexisting use of GIMPLE_EH_ELSE, for transactional memory
> commits, did not allow exceptions to escape from the ELSE path.  The
> trick it uses to allow the ELSE path to see the propagating exception
> does not work very well if the exception cleanup raises further
> exceptions: the ELSE block is configured to handle exceptions in
> itself.  This confuses the heck out of CFG and EH cleanups.
>
> Basing the lowering context for the ELSE block on outer_state, rather
> than this_state, gets us the expected enclosing handler.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

Testcase?

>
> for  gcc/ChangeLog
>
> * tree-eh.c (honor_protect_cleanup_actions): Use outer_
> rather than this_state as the lowering context for the ELSE
> seq in a GIMPLE_EH_ELSE.
> ---
>  gcc/tree-eh.c |   13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
> index 23c56b5661a1..4de09d1bf7b5 100644
> --- a/gcc/tree-eh.c
> +++ b/gcc/tree-eh.c
> @@ -1001,11 +1001,14 @@ honor_protect_cleanup_actions (struct leh_state 
> *outer_state,
>gimple_try_set_cleanup (tf->top_p, gimple_eh_else_n_body (eh_else));
>finally = gimple_eh_else_e_body (eh_else);
>
> -  /* Let the ELSE see the exception that's being processed.  */
> -  eh_region save_ehp = this_state->ehp_region;
> -  this_state->ehp_region = this_state->cur_region;
> -  lower_eh_constructs_1 (this_state, );
> -  this_state->ehp_region = save_ehp;
> +  /* Let the ELSE see the exception that's being processed, but
> +since the cleanup is outside the try block, process it with
> +outer_state, otherwise it may be used as a cleanup for
> +itself, and Bad Things (TM) ensue.  */
> +  eh_region save_ehp = outer_state->ehp_region;
> +  outer_state->ehp_region = this_state->cur_region;
> +  lower_eh_constructs_1 (outer_state, );
> +  outer_state->ehp_region = save_ehp;
>  }
>else
>  {
>
> --
> Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
> Be the change, be Free! FSF Latin America board member
> GNU Toolchain EngineerFree Software Evangelist
> Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara


Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default

2019-06-27 Thread Richard Biener
On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt  wrote:
>
> Hi,
>
> We've done some experimenting and realized that the subject option almost
> always provide improved performance for Power when the loop unroller is
> enabled.  So this patch turns that flag on by default for us.
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
> Is this OK for trunk?

I guess it creates more freedom for combine (more single-uses) and register
allocation.  I wonder in which cases this might pessimize things?  I guess
the pre-RA scheduler might make RAs life harder with creating overlapping
life-ranges.

I guess you didn't actually investigate the nature of the improvements you saw?

Do we want to adjust the flags documentation, saying whether this is enabled
by default depends on the target (or even list them)?

Thanks,
Richard.

> Thanks!
> Bill
>
>
> 2019-06-27  Bill Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_option_override_internal): Enable
> -fvariable-expansion-in-unroller by default.
>
>
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c  (revision 272719)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -3616,6 +3616,11 @@ rs6000_option_override_internal (bool global_init_
>&& !global_options_set.x_flag_asynchronous_unwind_tables)
>  flag_asynchronous_unwind_tables = 1;
>
> +  /* -fvariable-expansion-in-unroller is a win for POWER whenever the
> + loop unroller is active.  It is only checked during unrolling, so
> + we can just set it on by default.  */
> +  flag_variable_expansion_in_unroller = 1;
> +
>/* Set the pointer size.  */
>if (TARGET_64BIT)
>  {
>


Re: [PATCH] Remove quite obvious dead assignments.

2019-06-27 Thread Martin Liška
On 6/26/19 1:39 PM, Jakub Jelinek wrote:
> On Wed, Jun 26, 2019 at 12:57:15PM +0200, Martin Liška wrote:
>> --- a/gcc/asan.c
>> +++ b/gcc/asan.c
>> @@ -1713,8 +1713,8 @@ asan_emit_allocas_unpoison (rtx top, rtx bot, rtx_insn 
>> *before)
>>rtx ret = init_one_libfunc ("__asan_allocas_unpoison");
>>top = convert_memory_address (ptr_mode, top);
>>bot = convert_memory_address (ptr_mode, bot);
>> -  ret = emit_library_call_value (ret, NULL_RTX, LCT_NORMAL, ptr_mode,
>> - top, ptr_mode, bot, ptr_mode);
>> +  emit_library_call_value (ret, NULL_RTX, LCT_NORMAL, ptr_mode,
>> +   top, ptr_mode, bot, ptr_mode);
> 
> If you don't need the return value, you should use emit_library_call,
> not emit_library_call_value.

Done.

> 
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -3044,7 +3044,7 @@ expand_asm_stmt (gasm *stmt)
>>}
>>  }
>>  }
>> -  unsigned nclobbers = clobber_rvec.length();
>> +  unsigned nclobbers;
> 
> Can you instead move the nclobbers declaration to the spot where it
> is initialized (right now the second time), and add space before (
> there too?

Sure.

> 
>> --- a/gcc/config/i386/i386-expand.c
>> +++ b/gcc/config/i386/i386-expand.c
>> @@ -16051,8 +16051,6 @@ ix86_expand_rounddf_32 (rtx operand0, rtx operand1)
>>mhalf = expand_simple_binop (mode, MINUS, half, one, NULL_RTX,
>> 0, OPTAB_DIRECT);
>>  
>> -  /* Compensate.  */
>> -  tmp = gen_reg_rtx (mode);
>>/* xa2 = xa2 - (dxa > 0.5 ? 1 : 0) */
>>tmp = ix86_expand_sse_compare_mask (UNGT, dxa, half, false);
>>emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, one, tmp)));
> 
> Is this really desirable?
> Just quick look suggests perhaps it wants to use one temporary
> for the gen_reg_rtx (mode) and use that on the lhs of SET, and another one
> as the last operand to AND holding ix86_expand_sse_compare_mask?
> Though, such problem is in most of the ix86_expand_sse_compare_mask callers.
> While ix86_expand_sse_compare_mask returns a result of gen_reg_rtx too,
> I think it is better not to reuse pseudos for different values.
> I'd say take out this change and deal with it with i386 maintainers
> separately.

Good point, I've just created:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91016

> 
>> --- a/gcc/tree-vect-stmts.c
>> +++ b/gcc/tree-vect-stmts.c
>> @@ -3483,8 +3483,7 @@ vectorizable_call (stmt_vec_info stmt_info, 
>> gimple_stmt_iterator *gsi,
>>  = gimple_build_call_internal_vec (ifn, vargs);
>>gimple_call_set_lhs (call, half_res);
>>gimple_call_set_nothrow (call, true);
>> -  new_stmt_info
>> -= vect_finish_stmt_generation (stmt_info, call, gsi);
>> +  vect_finish_stmt_generation (stmt_info, call, gsi);
>>if ((i & 1) == 0)
>>  {
>>prev_res = half_res;
>> @@ -3583,8 +3582,7 @@ vectorizable_call (stmt_vec_info stmt_info, 
>> gimple_stmt_iterator *gsi,
>>gcall *call = gimple_build_call_internal_vec (ifn, vargs);
>>gimple_call_set_lhs (call, half_res);
>>gimple_call_set_nothrow (call, true);
>> -  new_stmt_info
>> -= vect_finish_stmt_generation (stmt_info, call, gsi);
>> +  vect_finish_stmt_generation (stmt_info, call, gsi);
>>if ((j & 1) == 0)
>>  {
>>prev_res = half_res;
> 
> This looks wrong.
> There should have been
> SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt_info);
> in between for slp_node, or the usual code like:
>   if (cond_for_first)
> STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt_info;
>   else
> STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt_info;
> prev_stmt_info = new_stmt_info;
> otherwise.  In any case, I think this should be dealt with separately.

Likewise here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91017

I'm going to install updated patch once regression tests finish for it.
I excluded changes for the PR mentioned.

Martin

> 
>   Jakub
> 

>From 045440c628968c53cb10de317a3420912da81a93 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 26 Jun 2019 10:29:38 +0200
Subject: [PATCH] Remove quite obvious dead assignments.

gcc/ChangeLog:

2019-06-26  Martin Liska  

	* asan.c (asan_emit_allocas_unpoison): Remove obviously
	dead assignments.
	* bt-load.c (move_btr_def): Likewise.
	* builtins.c (expand_builtin_apply_args_1): Likewise.
	(expand_builtin_apply): Likewise.
	* cfgexpand.c (expand_asm_stmt): Likewise.
	(construct_init_block): Likewise.
	* cfghooks.c (verify_flow_info): Likewise.
	* cfgloopmanip.c (remove_path): Likewise.
	* cfgrtl.c (rtl_verify_bb_layout): Likewise.
	* cgraph.c (cgraph_node::set_pure_flag): Likewise.
	* combine.c (simplify_if_then_else): Likewise.
	* config/i386/i386.c (ix86_setup_incoming_vararg_bounds): Likewise.
	

Re: [PATCH] ix86: pass correct options to compiler for gfni-4 testcase

2019-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2019 at 10:58 AM Jan Beulich  wrote:
>
> SSE2 is the required prereq of the builtins; as x86-64 has SSE2 enabled
> anyway, the test failure was noticable on 32-bit builds only.
>
> gcc/testsuite/
> 2019-06-27  Jan Beulich  
>
> * gcc.target/i386/gfni-4.c: Pass -msse2.

OK.

Thanks,
Uros.

> --- a/gcc/testsuite/gcc.target/i386/gfni-4.c
> +++ b/gcc/testsuite/gcc.target/i386/gfni-4.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-mgfni -O2 -msse" } */
> +/* { dg-options "-mgfni -O2 -msse2" } */
>  /* { dg-final { scan-assembler-times "gf2p8affineinvqb\[ 
> \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ 
> \\t\]+#)" 1 } } */
>  /* { dg-final { scan-assembler-times "gf2p8affineqb\[ 
> \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ 
> \\t\]+#)" 1 } } */
>  /* { dg-final { scan-assembler-times "gf2p8mulb\[ 
> \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
>
>


  1   2   >