[RFA] [PATCH 4/4] Ignore reads of "dead" memory locations in DSE

2016-12-21 Thread Jeff Law

This is the final patch in the kit to improve our DSE implementation.

It's based on an observation by Richi, namely that a read from bytes of 
memory that are dead can be ignored.  By ignoring such reads we can 
sometimes find additional stores that allow us to either eliminate or 
trim an earlier store more aggressively.
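
To make the idea concrete, here is a minimal sketch of the check, not the
code from the patch.  It assumes a toy model where the earlier store covers
at most 64 bytes and a 64-bit mask records which of its bytes are still
live; the names live_mask, byte_range and read_hits_live_bytes are
hypothetical and do not exist in tree-ssa-dse.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per byte of an earlier store (at most 64
   bytes).  A set bit means no later store has overwritten that byte.  */
typedef uint64_t live_mask;

/* Build a mask with COUNT bits set starting at bit START.  */
static live_mask
byte_range (unsigned start, unsigned count)
{
  assert (start < 64 && count <= 64 - start);
  live_mask bits = count == 64 ? ~(live_mask) 0
                               : (((live_mask) 1 << count) - 1);
  return bits << start;
}

/* Return true if a read of COUNT bytes at byte START touches a byte of
   the earlier store that is still live.  A read that only touches dead
   bytes cannot observe the earlier store, so it can be ignored.  */
static bool
read_hits_live_bytes (live_mask live, unsigned start, unsigned count)
{
  return (live & byte_range (start, count)) != 0;
}
```

In this model, once a later store has overwritten bytes 0-7 of a 12 byte
store, a read of bytes 0-7 no longer keeps the original store alive, while
a read that reaches into bytes 8-11 still does.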


This only hit (by hit I mean the ability to ignore resulted in finding a 
full or partially dead store that we didn't otherwise find) once during 
a bootstrap, but does hit often in the libstdc++ testsuite.  I've added 
a test derived from the conversation between myself and Richi last year.


There's nothing in the BZ database on this issue and I can't reasonably 
call it a bugfix.  I wouldn't lose sleep if this were deferred to gcc-8.


Bootstrapped and regression tested on x86-64-linux-gnu.  OK for the 
trunk or defer to gcc-8?



* tree-ssa-dse.c (live_bytes_read): New function.
(dse_classify_store): Ignore reads of dead bytes.

* testsuite/gcc.dg/tree-ssa/ssa-dse-26.c: New test.
* testsuite/gcc.dg/tree-ssa/ssa-dse-27.c: Likewise.



diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
new file mode 100644
index 000..6605dfe
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse1-details" } */
+
+enum constraint_expr_type
+{
+  SCALAR, DEREF, ADDRESSOF
+};
+typedef struct constraint_expr
+{
+  enum constraint_expr_type type;
+  unsigned int var;
+  long offset;
+} constraint_expr ;
+typedef struct constraint
+{
+  struct constraint_expr lhs;
+  struct constraint_expr rhs;
+} constraint;
+static _Bool
+constraint_expr_equal (struct constraint_expr x, struct constraint_expr y)
+{
+  return x.type == y.type && x.var == y.var && x.offset == y.offset;
+}
+
+_Bool
+constraint_equal (struct constraint a, struct constraint b)
+{
+  return constraint_expr_equal (a.lhs, b.lhs)
+&& constraint_expr_equal (a.rhs, b.rhs);
+}
+
+/* { dg-final { scan-tree-dump-times "Deleted dead store" 2 "dse1" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-27.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-27.c
new file mode 100644
index 000..48dc92e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-27.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse1-details -fno-tree-fre -fno-tree-sra" } */
+
+struct S { struct R { int x; int y; } r; int z; } s;
+
+extern void blah (struct S);
+
+void
+foo ()
+{
+  struct S s = { {1, 2}, 3 };
+  s.r.x = 1;
+   s.r.y = 2;
+   struct R r = s.r;
+  s.z = 3;
+  blah (s);
+}
+
+
+/* { dg-final { scan-tree-dump-times "Deleted dead store" 4 "dse1" } } */
+
diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index a807d6d..f5b53fc 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -475,6 +475,41 @@ maybe_trim_partially_dead_store (ao_ref *ref, sbitmap 
live, gimple *stmt)
 }
 }
 
+/* Return TRUE if USE_REF reads bytes from LIVE where live is
+   derived from REF, a write reference.
+
+   While this routine may modify USE_REF, it's passed by value, not
+   location.  So callers do not see those modifications.  */
+
+static bool
+live_bytes_read (ao_ref use_ref, ao_ref *ref, sbitmap live)
+{
+  /* We have already verified that USE_REF and REF hit the same object.
+ Now verify that there's actually an overlap between USE_REF and REF.  */
+  if ((use_ref.offset < ref->offset
+   && use_ref.offset + use_ref.size > ref->offset)
+  || (use_ref.offset >= ref->offset
+ && use_ref.offset < ref->offset + ref->size))
+{
+  normalize_ref (&use_ref, ref);
+
+  /* If USE_REF covers all of REF, then it will hit one or more
+live bytes.   This avoids useless iteration over the bitmap
+below.  */
+  if (use_ref.offset == ref->offset && use_ref.size == ref->size)
+   return true;
+
+  /* Now iterate over what's left in USE_REF and see if any of
+those bits are in LIVE.  */
+  for (int i = (use_ref.offset - ref->offset) / BITS_PER_UNIT;
+  i < (use_ref.offset + use_ref.size) / BITS_PER_UNIT; i++)
+   if (bitmap_bit_p (live, i))
+ return true;
+  return false;
+}
+  return true;
+}
+
 /* A helper of dse_optimize_stmt.
Given a GIMPLE_ASSIGN in STMT that writes to REF, find a candidate
statement *USE_STMT that may prove STMT to be dead.
@@ -554,6 +589,41 @@ dse_classify_store (ao_ref *ref, gimple *stmt, gimple 
**use_stmt,
  /* If the statement is a use the store is not dead.  */
  else if (ref_maybe_used_by_stmt_p (use_stmt, ref))
{
+ /* Handle common cases where we can easily build a ao_ref
+structure for USE_STMT and in doing so we find that the
+references hit non-live bytes and thus can be ignored.  */
+ if (live_bytes)
+   {
+ if (is_gimple_assign 

[PATCH 0/4] Improve DSE implementation

2016-12-21 Thread Jeff Law

This is V3 of the 4-part patchkit to address various DSE issues.

The various comments from the V2 patchkit have been addressed and I 
believe the net result is cleaner and more compile-time efficient.


The major changes were a move to using sbitmaps, only allocating the live 
sbitmap once per invocation of the DSE optimizer, and more efficient 
trimming computations.


Again, patches #1 and #2 seem appropriate to me at this stage in our 
development cycle.  #3 and #4 are harder to justify.



There are dependencies as we walk forward in the patch kit.  Each patch 
has been bootstrapped & tested with its previous patch(es).


Jeff


[RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE - v3

2016-12-21 Thread Jeff Law
This is the first of the 4 part patchkit to address deficiencies in our 
DSE implementation.


This patch addresses the P2 regression 33562 which has been a low 
priority regression since gcc-4.3.  To summarize, DSE no longer has the 
ability to detect an aggregate store as dead if subsequent stores are 
done in a piecemeal fashion.


I originally tackled this by changing how we lower complex objects. That 
was sufficient to address 33562, but was reasonably rejected.


This version attacks the problem by improving DSE to track stores to 
memory at a byte level.  That allows us to determine if a series of 
stores completely covers an earlier store (thus making the earlier store 
dead).
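
As an illustration only, byte-level tracking can be sketched like this.
The sketch assumes there are no intervening reads, that the tracked store
fits in 64 bytes, and that offsets and sizes are given in bytes; the type
and function names (store_status_sketch, byte_store, classify_store_sketch)
are made up for the example and are not the ones used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tri-state result, mirroring the dead / partially dead /
   live distinction described above.  */
enum store_status_sketch { SKETCH_LIVE, SKETCH_PARTIALLY_DEAD, SKETCH_DEAD };

/* A store of SIZE bytes at byte OFFSET within a tracked 64 byte window.  */
struct byte_store { unsigned offset; unsigned size; };

/* One bit per byte written by S.  */
static uint64_t
mask_for (struct byte_store s)
{
  assert (s.offset < 64 && s.size <= 64 - s.offset);
  uint64_t bits = s.size == 64 ? ~(uint64_t) 0
                               : (((uint64_t) 1 << s.size) - 1);
  return bits << s.offset;
}

/* Classify FIRST given the N stores in LATER that follow it.  Each later
   store clears the bytes it overwrites; whatever remains is still live.  */
static enum store_status_sketch
classify_store_sketch (struct byte_store first,
                       const struct byte_store *later, size_t n)
{
  uint64_t original = mask_for (first);
  uint64_t live = original;
  for (size_t i = 0; i < n; i++)
    live &= ~mask_for (later[i]);

  if (live == 0)
    return SKETCH_DEAD;
  if (live != original)
    return SKETCH_PARTIALLY_DEAD;
  return SKETCH_LIVE;
}
```

With a 16 byte aggregate store followed by two 8 byte stores to its two
halves, the mask drops to zero and the aggregate store is classified as
dead, which is the piecemeal case from 33562.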


A useful side effect of this is we can detect when parts of a store are 
dead and potentially rewrite the store.  This patch implements that for 
complex object initializations.  While not strictly part of 33562, it's 
so closely related that I felt it belongs as part of this patch.


This originally limited the size of the tracked memory space to 64 
bytes.  I bumped the limit after working through the CONSTRUCTOR and 
mem* trimming patches.  The 256 byte limit is still fairly arbitrary and 
I wouldn't lose sleep if we throttled back to 64 or 128 bytes.


Later patches in the kit will build upon this patch.  So if pieces look 
like skeleton code, that's because it is.


The changes since the V2 patch are:

1. Using sbitmaps rather than bitmaps.
2. Returning a tri-state from dse_classify_store (renamed from 
dse_possible_dead_store_p)

3. More efficient trim computation
4. Moving trimming code out of dse_classify_store
5. Refactoring code to delete dead calls/assignments
6. dse_optimize_stmt moves into the dse_dom_walker class

Not surprisingly, this patch has most of the changes based on prior 
feedback as it includes the raw infrastructure.


Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?

PR tree-optimization/33562
* params.def (PARAM_DSE_MAX_OBJECT_SIZE): New PARAM.
* sbitmap.h (bitmap_clear_range, bitmap_set_range): Prototype new
functions.
(bitmap_count_bits): Likewise.
* sbitmap.c (bitmap_clear_range, bitmap_set_range): New functions.
(bitmap_count_bits): Likewise.
* tree-ssa-dse.c: Include params.h.
(dse_store_status): New enum.
(initialize_ao_ref_for_dse): New, partially extracted from
dse_optimize_stmt.
(valid_ao_ref_for_dse, normalize_ref): New.
(setup_live_bytes_from_ref, compute_trims): Likewise.
(clear_bytes_written_by, trim_complex_store): Likewise.
(maybe_trim_partially_dead_store): Likewise.
(maybe_trim_complex_store): Likewise.
(dse_classify_store): Renamed from dse_possible_dead_store_p.
Track what bytes live from the original store.  Return tri-state
for dead, partially dead or live.
(dse_dom_walker): Add constructor, destructor and new private members.
(delete_dead_call, delete_dead_assignment): New extracted from
dse_optimize_stmt.
(dse_optimize_stmt): Make a member of dse_dom_walker.
Use initialize_ao_ref_for_dse.


* gcc.dg/tree-ssa/complex-4.c: No longer xfailed.
* gcc.dg/tree-ssa/complex-5.c: Likewise.
* gcc.dg/tree-ssa/ssa-dse-9.c: Likewise.
* gcc.dg/tree-ssa/ssa-dse-18.c: New test.
* gcc.dg/tree-ssa/ssa-dse-19.c: Likewise.
* gcc.dg/tree-ssa/ssa-dse-20.c: Likewise.
* gcc.dg/tree-ssa/ssa-dse-21.c: Likewise.

diff --git a/gcc/params.def b/gcc/params.def
index 50f75a7..f367c1d 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -532,6 +532,11 @@ DEFPARAM(PARAM_AVG_LOOP_NITER,
 "Average number of iterations of a loop.",
 10, 1, 0)
 
+DEFPARAM(PARAM_DSE_MAX_OBJECT_SIZE,
+"dse-max-object-size",
+"Maximum size (in bytes) of objects tracked by dead store elimination.",
+256, 0, 0)
+
 DEFPARAM(PARAM_SCEV_MAX_EXPR_SIZE,
 "scev-max-expr-size",
 "Bound on size of expressions used in the scalar evolutions analyzer.",
diff --git a/gcc/sbitmap.c b/gcc/sbitmap.c
index 10b4347..2b66a6c 100644
--- a/gcc/sbitmap.c
+++ b/gcc/sbitmap.c
@@ -202,6 +202,39 @@ bitmap_empty_p (const_sbitmap bmap)
   return true;
 }
 
+void
+bitmap_clear_range (sbitmap bmap, unsigned int start, unsigned int count)
+{
+  for (unsigned int i = start; i < start + count; i++)
+bitmap_clear_bit (bmap, i);
+}
+
+void
+bitmap_set_range (sbitmap bmap, unsigned int start, unsigned int count)
+{
+  for (unsigned int i = start; i < start + count; i++)
+bitmap_set_bit (bmap, i);
+}
+
+
+unsigned int
+bitmap_count_bits (const_sbitmap bmap)
+{
+  unsigned int count = 0;
+  for (unsigned int i = 0; i < bmap->size; i++)
+if (bmap->elms[i])
+  {
+# if HOST_BITS_PER_WIDEST_FAST_INT == HOST_BITS_PER_LONG
+   count += __builtin_popcountl (bmap->elms[i]);
+# elif HOST_BITS_PER_WIDEST_FAST_INT == 

[RFA][PR tree-optimization/61912] [PATCH 2/4] Trimming CONSTRUCTOR stores in DSE - V3

2016-12-21 Thread Jeff Law

This is the second patch in the kit to improve our DSE implementation.

This patch recognizes when a CONSTRUCTOR assignment could be trimmed at 
the head or tail because those bytes are dead.


The first implementation of this turned the CONSTRUCTOR into a memset. 
This version actually rewrites the RHS and LHS of the CONSTRUCTOR 
assignment.


You'll note that the implementation computes head and tail trim counts, 
then masks them to an even byte count.  We might even consider masking 
off the two low bits in the counts.  This masking keeps higher 
alignments on the CONSTRUCTOR remnant which helps keep things efficient 
when the CONSTRUCTOR results in a memset call.
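
The head/tail computation described above can be sketched as follows.
This is a simplified model and not the patch's compute_trims: it assumes
liveness is given as a plain bool array with one entry per byte and that
at least one byte is live (a fully dead store would be deleted rather
than trimmed).  The name compute_trims_sketch is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Given LIVE[0..SIZE-1], one flag per byte of the store, compute how many
   bytes can be dropped from the head and the tail.  Both counts are
   masked to an even number of bytes so the remnant keeps at least
   2 byte alignment relative to the original store.  */
static void
compute_trims_sketch (const bool *live, int size,
                      int *trim_head, int *trim_tail)
{
  assert (size > 0);

  int first_live = 0;
  while (first_live < size && !live[first_live])
    first_live++;

  int last_live = size - 1;
  while (last_live >= 0 && !live[last_live])
    last_live--;

  /* Precondition: some byte is live.  */
  assert (first_live <= last_live);

  *trim_head = first_live & ~1;
  *trim_tail = (size - 1 - last_live) & ~1;
}
```

For a 16 byte store where only bytes 5-12 are live, the raw counts would
be 5 and 3; after masking they become 4 and 2.  Masking off the two low
bits instead (& ~3) would be the stricter variant mentioned above.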


This patch hits a lot statically in GCC and the testsuite.  There were 
hundreds of hits in each.


There may be some room for tuning.  Trimming shouldn't ever result in 
poorer performance, but it may also not result in any measurable gain 
(it depends on how much gets trimmed relative to the size of the 
CONSTRUCTOR node and how the CONSTRUCTOR node gets expanded, the 
processor's capabilities for merging stores internally, etc etc).  I 
suspect the main benefit comes when the CONSTRUCTOR collapses down to 
something small that gets expanded inline, thus exposing the internals 
to the rest of the optimization pipeline.


We could, in theory, split the CONSTRUCTOR to pick up dead bytes in the 
middle of the CONSTRUCTOR.  I haven't looked to see how applicable that 
is in real code and what the cost/benefit analysis might look like.



Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?
PR tree-optimization/61912
PR tree-optimization/77485
* tree-sra.h: New file.
* ipa-cp.c: Include tree-sra.h
* ipa-prop.h (build_ref_for_offset): Remove prototype.
* tree-ssa-dse.c: Include expr.h and tree-sra.h.
(compute_trims, trim_constructor_store): New functions.
(maybe_trim_partially_dead_store): Call trim_constructor_store.


* g++.dg/tree-ssa/ssa-dse-1.C: New test.
* gcc.dg/tree-ssa/pr30375: Adjust expected output.
* gcc.dg/tree-ssa/ssa-dse-24.c: New test.

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index d3b5052..bc5ea87 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -122,6 +122,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-inline.h"
 #include "ipa-utils.h"
 #include "tree-ssa-ccp.h"
+#include "tree-sra.h"
 
 template  class ipcp_value;
 
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 0e75cf4..6d7b480 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -820,10 +820,6 @@ ipa_parm_adjustment *ipa_get_adjustment_candidate (tree 
**, bool *,
 void ipa_release_body_info (struct ipa_func_body_info *);
 tree ipa_get_callee_param_type (struct cgraph_edge *e, int i);
 
-/* From tree-sra.c:  */
-tree build_ref_for_offset (location_t, tree, HOST_WIDE_INT, bool, tree,
-  gimple_stmt_iterator *, bool);
-
 /* In ipa-cp.c  */
 void ipa_cp_c_finalize (void);
 
diff --git a/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-1.C 
b/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-1.C
new file mode 100644
index 000..3f85f3a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-1.C
@@ -0,0 +1,101 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c++14 -O -fdump-tree-dse1-details" } */
+
+using uint = unsigned int;
+
+template
+struct FixBuf
+{
+   C buf[S] = {};
+};
+
+template
+struct OutBuf
+{
+   C*  cur;
+   C*  end;
+   C*  beg;
+
+   template
+   constexpr
+   OutBuf(FixBuf& b) : cur{b.buf}, end{b.buf + S}, beg{b.buf} { }
+
+   OutBuf(C* b, C* e) : cur{b}, end{e} { }
+   OutBuf(C* b, uint s) : cur{b}, end{b + s} { }
+
+   constexpr
+   OutBuf& operator<<(C v)
+   {
+   if (cur < end) {
+   *cur = v;
+   }
+   ++cur;
+   return *this;
+   }
+
+   constexpr
+   OutBuf& operator<<(uint v)
+   {
+   uint q = v / 10U;
+   uint r = v % 10U;
+   if (q) {
+   *this << q;
+   }
+   *this << static_cast(r + '0');
+   return *this;
+   }
+};
+
+template
+struct BufOrSize
+{
+   template
+   static constexpr auto Select(FixBuf& fb, OutBuf&)
+   {
+   return fb;
+   }
+};
+
+template<>
+struct BufOrSize
+{
+   template
+   static constexpr auto Select(FixBuf&, OutBuf& ob)
+   {
+   return ob.cur - ob.beg;
+   }
+};
+
+// if BOS=1, it will return the size of the generated data, else the data 
itself
+template
+constexpr
+auto fixbuf()
+{
+   FixBuf fb;
+   OutBuf ob{fb};
+   for (uint i = 0; i <= N; ++i) {
+   ob << i << static_cast(i == N ? 0 : ' ');
+   }
+   return BufOrSize::Select(fb, ob);
+}
+
+auto foo()
+{
+   constexpr auto x = fixbuf<13, 200>();
+   return 

[PATCH] PR78879

2016-12-21 Thread Yuan, Pengfei
Hi,

The following patch fixes
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78879

There are some other invocations of unordered_remove in
tree-ssa-threadupdate.c, which may also need to be replaced
with ordered_remove.
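
For readers unfamiliar with the two removal styles, the general
difference can be sketched like this.  The sketch models the semantics
on a plain int array rather than GCC's vec<> template, the function
names are made up for illustration, and it does not attempt to
reproduce the failure from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove element IX in O(1) by overwriting it with the last element.
   The relative order of the remaining elements is NOT preserved.
   Returns the new length.  */
static size_t
unordered_remove_sketch (int *v, size_t len, size_t ix)
{
  assert (ix < len);
  v[ix] = v[len - 1];
  return len - 1;
}

/* Remove element IX by shifting the tail down by one slot.  This is
   O(n) but keeps the remaining elements in their original order.
   Returns the new length.  */
static size_t
ordered_remove_sketch (int *v, size_t len, size_t ix)
{
  assert (ix < len);
  memmove (&v[ix], &v[ix + 1], (len - ix - 1) * sizeof v[0]);
  return len - 1;
}
```

Removing index 1 from {10, 20, 30, 40} leaves {10, 40, 30} with the
unordered variant and {10, 30, 40} with the ordered one, so code that
depends on the order of the remaining entries needs the ordered form.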

Regards,
Yuan, Pengfei


2016-12-22  Yuan Pengfei  

PR middle-end/78879
* tree-ssa-threadupdate.c (mark_threaded_blocks): Replace
unordered_remove with ordered_remove.


diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 5a5f8df..b2e6d7a 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2144,7 +2144,7 @@ mark_threaded_blocks (bitmap threaded_blocks)
}
  else
{
- paths.unordered_remove (i);
+ paths.ordered_remove (i);
  if (dump_file && (dump_flags & TDF_DETAILS))
dump_jump_thread_path (dump_file, *path, false);
  delete_jump_thread_path (path);
@@ -2180,7 +2180,7 @@ mark_threaded_blocks (bitmap threaded_blocks)
  else
{
  e->aux = NULL;
- paths.unordered_remove (i);
+ paths.ordered_remove (i);
  if (dump_file && (dump_flags & TDF_DETAILS))
dump_jump_thread_path (dump_file, *path, false);
  delete_jump_thread_path (path);



Re: [PATCH, gcc/MIPS] Add options to disable/enable madd.fmt/msub.fmt instructions

2016-12-21 Thread Paul Hua
Hi,

> +On MIPS targets, set the @option{-mno-unfused-madd4} option by default.
> +On some platform, like Loongson 3A/3B 1000/2000/3000, madd.fmt/msub.fmt is
> +broken, which may which may generate wrong calculator result.

The Loongson 3A/3B 1000/2000/3000 madd.fmt/msub.fmt are fused madd instructions.
Can you try this patch:
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00255.html .

Was this patch tested on trunk? My trunk builds have been failing for a long time.


[PATCH, gcc/MIPS] Add options to disable/enable madd.fmt/msub.fmt instructions

2016-12-21 Thread Yunqiang Su
[PATCH] Add options to disable/enable madd.fmt/msub.fmt instructions

The build-time options are:
  --with-unfused-madd4=yes/no
  --without-unfused-madd4

The runtime options are:
  -munfused-madd4
  -mno-unfused-madd4

These options are needed because madd.fmt/msub.fmt is broken on some
platforms, which may produce wrong calculation results.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7afbc54bc78..a3fdb0273c0 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3940,7 +3940,7 @@ case "${target}" in
;;
 
mips*-*-*)
-   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci"
+   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 unfused_madd4 tune tune_32 tune_64 divide llsc mips-plt 
synci"
 
case ${with_float} in
"" | soft | hard)
@@ -3997,6 +3997,19 @@ case "${target}" in
exit 1
;;
esac
+   
+   case ${with_unfused_madd4} in
+   "" | yes)
+   with_unfused_madd4="unfused-madd4"
+   ;;
+   no)
+   with_unfused_madd4="no-unfused-madd4"
+   ;;
+   *)
+   echo "Unknown unfused_madd4 type used in 
--with-unfused-madd4=$with_unfused_madd4" 1>&2
+   exit 1
+   ;;
+   esac
 
case ${with_abi} in
"" | 32 | o64 | n32 | 64 | eabi)
@@ -4496,7 +4509,7 @@ case ${target} in
 esac
 
 t=
-all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 
schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls"
+all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 
schedule float mode fpu nan fp_32 odd_spreg_32 unfused_madd4 divide llsc 
mips-plt synci tls"
 for option in $all_defaults
 do
eval "val=\$with_"`echo $option | sed s/-/_/g`
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index c7eb2a8e7bd..c2c231a483c 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -19829,6 +19829,16 @@ mips_option_override (void)
  will be produced.  */
   target_flags |= MASK_ODD_SPREG;
 }
+  
+  /* If neither -munfused-madd4 nor -mno-unfused-madd4 was given on the command
+ line, set MASK_UNFUSED_MADD4 based on the ISA.  */
+  if ((target_flags_explicit & MASK_UNFUSED_MADD4) == 0)
+{
+  if (!ISA_HAS_UNFUSED_MADD4)
+   target_flags &= ~MASK_UNFUSED_MADD4;
+  else
+   target_flags |= MASK_UNFUSED_MADD4;
+}
 
   if (!ISA_HAS_COMPACT_BRANCHES && mips_cb == MIPS_CB_ALWAYS)
 {
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index bb66c428dd1..6cf571906a6 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -863,6 +863,7 @@ struct mips_cpu_info {

":%{!msoft-float:%{!msingle-float:%{!mfp*:%{!mmsa:-mfp%(VALUE)}" }, \
   {"odd_spreg_32", "%{" OPT_ARCH32 ":%{!msoft-float:%{!msingle-float:" \
   "%{!modd-spreg:%{!mno-odd-spreg:-m%(VALUE)}" }, \
+  {"unfused_madd4", "%{!munfused-madd4:%{!mno-unfused-madd4:-m%(VALUE)}}" }, \
   {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}" }, \
   {"llsc", "%{!mllsc:%{!mno-llsc:-m%(VALUE)}}" }, \
   {"mips-plt", "%{!mplt:%{!mno-plt:-m%(VALUE)}}" }, \
@@ -1060,7 +1061,7 @@ struct mips_cpu_info {
 
 /* ISA has 4 operand unfused madd instructions of the form
'd = [+-] (a * b [+-] c)'.  */
-#define ISA_HAS_UNFUSED_MADD4  (ISA_HAS_FP4 && !TARGET_MIPS8000)
+#define ISA_HAS_UNFUSED_MADD4  (ISA_HAS_FP4 && !TARGET_MIPS8000 && 
TARGET_UNFUSED_MADD4)
 
 /* ISA has 3 operand r6 fused madd instructions of the form
'c = c [+-] (a * b)'.  */
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index 08dd83e14ce..b154752a0df 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -416,6 +416,10 @@ modd-spreg
 Target Report Mask(ODD_SPREG)
 Enable use of odd-numbered single-precision registers.
 
+munfused-madd4
+Target Report Mask(UNFUSED_MADD4)
+Enable unfused multiply-add/multiply-sub instruction, aka madd.fmt/msub.fmt.
+
 mframe-header-opt
 Target Report Var(flag_frame_header_optimization) Optimization
 Optimize frame header.
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index b911d76dd66..00501c88420 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1320,6 +1320,16 @@ On MIPS targets, set the @option{-mno-odd-spreg} option 
by default when using
 the o32 ABI.  This is normally used in conjunction with
 @option{--with-fp-32=64} in order to target the o32 FP64A ABI extension.
 
+@item --with-unfused-madd4
+On MIPS targets, set the @option{-munfused-madd4} option by default.
+On some platform, like Loongson 3A/3B 1000/2000/3000, madd.fmt/msub.fmt is
+broken, which 

Re: Pointer Bounds Checker and trailing arrays (PR68270)

2016-12-21 Thread Ilya Enkovich
2016-12-21 22:18 GMT+03:00 Alexander Ivchenko :
> Right.. here is this updated chunk (otherwise no difference in the patch)
>
> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
> index 2769682..6c7862c 100644
> --- a/gcc/tree-chkp.c
> +++ b/gcc/tree-chkp.c
> @@ -3272,6 +3272,9 @@ chkp_may_narrow_to_field (tree field)
>  {
>return DECL_SIZE (field) && TREE_CODE (DECL_SIZE (field)) == INTEGER_CST
>  && tree_to_uhwi (DECL_SIZE (field)) != 0
> +&& !(flag_chkp_flexible_struct_trailing_arrays
> + && TREE_CODE(TREE_TYPE(field)) == ARRAY_TYPE
> + && !DECL_CHAIN (field))
>  && (!DECL_FIELD_OFFSET (field)
>   || TREE_CODE (DECL_FIELD_OFFSET (field)) == INTEGER_CST)
>  && (!DECL_FIELD_BIT_OFFSET (field)

OK.

>
> 2016-12-21 21:00 GMT+03:00 Ilya Enkovich :
>> 2016-12-20 17:44 GMT+03:00 Alexander Ivchenko :
>>> 2016-11-26 0:28 GMT+03:00 Ilya Enkovich :
 2016-11-25 15:47 GMT+03:00 Alexander Ivchenko :
> Hi,
>
> The patch below addresses PR68270. could you please take a look?
>
> 2016-11-25  Alexander Ivchenko  
>
>* c-family/c.opt (flag_chkp_flexible_struct_trailing_arrays):
>Add new option.
>* tree-chkp.c (chkp_parse_array_and_component_ref): Forbid
>narrowing when chkp_parse_array_and_component_ref is used and
>the ARRAY_REF points to an array in the end of the struct.
>
>
>
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 7d8a726..e45d6a2 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1166,6 +1166,11 @@ C ObjC C++ ObjC++ LTO RejectNegative Report
> Var(flag_chkp_narrow_to_innermost_ar
>  Forces Pointer Bounds Checker to use bounds of the innermost arrays in 
> case of
>  nested static arryas access.  By default outermost array is used.
>
> +fchkp-flexible-struct-trailing-arrays
> +C ObjC C++ ObjC++ LTO RejectNegative Report
> Var(flag_chkp_flexible_struct_trailing_arrays)
> +Allow Pointer Bounds Checker to treat all trailing arrays in structures 
> as
> +possibly flexible.

 Words 'allow' and 'possibly' are confusing here. This option is about to 
 force
 checker to do something, not to give him a choice.
>>>
>>> Fixed
>>>
 New option has to be documented in invoke.texi. It would also be nice to 
 reflect
 changes on GCC MPX wiki page.
>>>
>>> Done
 We have an attribute to change compiler behavior when this option is not 
 set.
 But we have no way to make exceptions when this option is used. Should we
 add one?
>>> Something like "bnd_fixed_size" ? Could work. Although the customer
>>> request did not mention the need for that.
>>> Can I add it in a separate patch?
>>>
>>
>> Yes.
>>
>>>
> +
>  fchkp-optimize
>  C ObjC C++ ObjC++ LTO Report Var(flag_chkp_optimize) Init(-1)
>  Allow Pointer Bounds Checker optimizations.  By default allowed
> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
> index 2769682..40f99c3 100644
> --- a/gcc/tree-chkp.c
> +++ b/gcc/tree-chkp.c
> @@ -3425,7 +3425,9 @@ chkp_parse_array_and_component_ref (tree node, tree 
> *ptr,
>if (flag_chkp_narrow_bounds
>&& !flag_chkp_narrow_to_innermost_arrray
>&& (!last_comp
> -  || chkp_may_narrow_to_field (TREE_OPERAND (last_comp, 1
> +  || (chkp_may_narrow_to_field (TREE_OPERAND (last_comp, 1))
> +  && !(flag_chkp_flexible_struct_trailing_arrays
> +   && array_at_struct_end_p (var)

 This is incorrect place for fix. Consider code

 struct S {
   int a;
   int b[10];
 };

 struct S s;
 int *p = s.b;

 Here you need to compute bounds for p and you want your option to take 
 effect
 but in this case you won't event reach your new check because there is no
 ARRAY_REF. And even if we change it to

 int *p = &s.b[5];

 then it still would be narrowed because s.b would still be written
 into 'comp_to_narrow'
 variable. Correct place for fix is in chkp_may_narrow_to_field.
>>>
>>> Done
>>>
 Also you should consider fchkp-narrow-to-innermost-arrray option. Should it
 be more powerfull or not? I think fchkp-narrow-to-innermost-arrray 
 shouldn't
 narrow to variable sized fields. BTW looks like right now bnd_variable_size
 attribute is ignored by fchkp-narrow-to-innermost-arrray. This is another
 problem and may be fixed in another patch though.
>>> The way code works in chkp_parse_array_and_component_ref seems to be
>>> exactly like you say:  fchkp-narrow-to-innermost-arrray won't narrow
>>> to variable sized fields. I will create a separate bug for
>>> bnd_variable_size+ fchkp-narrow-to-innermost-arrray.
>>>
 Also patch lacks tests for various situations (with option and 

Re: [PATCH] Replace DW_FORM_ref_sup with DW_FORM_ref_sup{4,8}

2016-12-21 Thread Jason Merrill
OK.

On Tue, Dec 20, 2016 at 1:57 PM, Jakub Jelinek  wrote:
> Hi!
>
> Recently DW_FORM_ref_sup (which is meant e.g. for dwz, gcc doesn't emit it)
> has been renamed to DW_FORM_ref_sup4 (and changed so that it is always 4
> byte) and DW_FORM_ref_sup8 (always 8 byte) has been added.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-12-20  Jakub Jelinek  
>
> * dwarf2.def (DW_FORM_ref_sup): Renamed to ...
> (DW_FORM_ref_sup4): ... this.  New form.
> (DW_FORM_ref_sup8): New form.
>
> --- include/dwarf2.def.jj   2016-10-31 13:28:05.0 +0100
> +++ include/dwarf2.def  2016-12-19 11:40:07.303525953 +0100
> @@ -212,13 +212,14 @@ DW_FORM (DW_FORM_ref_sig8, 0x20)
>  /* DWARF 5.  */
>  DW_FORM (DW_FORM_strx, 0x1a)
>  DW_FORM (DW_FORM_addrx, 0x1b)
> -DW_FORM (DW_FORM_ref_sup, 0x1c)
> +DW_FORM (DW_FORM_ref_sup4, 0x1c)
>  DW_FORM (DW_FORM_strp_sup, 0x1d)
>  DW_FORM (DW_FORM_data16, 0x1e)
>  DW_FORM (DW_FORM_line_strp, 0x1f)
>  DW_FORM (DW_FORM_implicit_const, 0x21)
>  DW_FORM (DW_FORM_loclistx, 0x22)
>  DW_FORM (DW_FORM_rnglistx, 0x23)
> +DW_FORM (DW_FORM_ref_sup8, 0x24)
>  /* Extensions for Fission.  See http://gcc.gnu.org/wiki/DebugFission.  */
>  DW_FORM (DW_FORM_GNU_addr_index, 0x1f01)
>  DW_FORM (DW_FORM_GNU_str_index, 0x1f02)
>
> Jakub


Re: [C++ PATCH] Error on shadowing a parameter by anon union member in the same scope (PR c++/72707)

2016-12-21 Thread Jason Merrill
OK.

On Tue, Dec 20, 2016 at 2:01 PM, Jakub Jelinek  wrote:
> Hi!
>
> DECL_ANON_UNION_VAR_P vars are DECL_ARTIFICIAL, but we still want to diagnose
> them if they shadow something.  The DECL_ARTIFICIAL (x) check has been
> missing in older gcc releases, so we diagnosed that properly.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-12-20  Jakub Jelinek  
>
> PR c++/72707
> * name-lookup.c (pushdecl_maybe_friend_1): Do check shadowing of
> artificial x if it is an anonymous union variable.
>
> * g++.dg/warn/Wshadow-12.C: New test.
>
> --- gcc/cp/name-lookup.c.jj 2016-11-30 08:57:21.0 +0100
> +++ gcc/cp/name-lookup.c2016-12-20 14:42:35.482232062 +0100
> @@ -,8 +,10 @@ pushdecl_maybe_friend_1 (tree x, bool is
> || TREE_CODE (x) == TYPE_DECL)))
> /* Don't check for internally generated vars unless
>it's an implicit typedef (see create_implicit_typedef
> -  in decl.c).  */
> -  && (!DECL_ARTIFICIAL (x) || DECL_IMPLICIT_TYPEDEF_P (x)))
> +  in decl.c) or anonymous union variable.  */
> +  && (!DECL_ARTIFICIAL (x)
> +  || DECL_IMPLICIT_TYPEDEF_P (x)
> +  || (VAR_P (x) && DECL_ANON_UNION_VAR_P (x
> {
>   bool nowarn = false;
>
> --- gcc/testsuite/g++.dg/warn/Wshadow-12.C.jj   2016-12-20 14:41:28.083114644 
> +0100
> +++ gcc/testsuite/g++.dg/warn/Wshadow-12.C  2016-12-20 14:40:19.0 
> +0100
> @@ -0,0 +1,9 @@
> +// PR c++/72707
> +// { dg-do compile }
> +
> +void
> +foo (double x)
> +{
> +  union { int x; };// { dg-error "shadows a parameter" }
> +  x = 0;
> +}
>
> Jakub


patch to fix PR78580

2016-12-21 Thread Vladimir N Makarov

  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78580

The patch was successfully tested and bootstrapped on x86-64.

Committed to the trunk as rev. 243875.


Index: ChangeLog
===
--- ChangeLog	(revision 243873)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2016-12-21  Vladimir Makarov  
+
+	PR rtl-optimization/78580
+	* ira-costs.c (find_costs_and_classes): Make regno_aclass
+	translated into an allocno class.
+
 2016-12-21  Pat Haugen  
 
 	PR rtl-optimization/11488
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 243873)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2016-12-21  Vladimir Makarov  
+
+	PR rtl-optimization/78580
+	* gcc.target/i386/pr78580.c: New.
+
 2016-12-21  Jakub Jelinek  
 
 	PR c++/77830
Index: ira-costs.c
===
--- ira-costs.c	(revision 243533)
+++ ira-costs.c	(working copy)
@@ -1846,14 +1846,19 @@ find_costs_and_classes (FILE *dump_file)
 	   short in -O0 code and so register pressure tends to be low.
 
 	   Avoid that by ignoring the alternative class if the best
-	   class has plenty of registers.  */
-	regno_aclass[i] = best;
+	   class has plenty of registers.
+
+	   The union class arrays give important classes and only
+	   part of it are allocno classes.  So translate them into
+	   allocno classes.  */
+	regno_aclass[i] = ira_allocno_class_translate[best];
 	  else
 	{
 	  /* Make the common class the biggest class of best and
-		 alt_class.  */
-	  regno_aclass[i]
-		= ira_reg_class_superunion[best][alt_class];
+		 alt_class.  Translate the common class into an
+		 allocno class too.  */
+	  regno_aclass[i] = (ira_allocno_class_translate
+ [ira_reg_class_superunion[best][alt_class]]);
 	  ira_assert (regno_aclass[i] != NO_REGS
 			  && ira_reg_allocno_class_p[regno_aclass[i]]);
 	}
Index: testsuite/gcc.target/i386/pr78580.c
===
--- testsuite/gcc.target/i386/pr78580.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr78580.c	(working copy)
@@ -0,0 +1,18 @@
+/* PR rtl-optimization/78580 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -ffixed-ebx" } */
+
+extern const signed char a;
+
+int
+foo (signed char x)
+{
+  return x;
+}
+
+int
+main ()
+{
+  foo (a);
+  return 0;
+}


Re: [PATCH] fix powerpc64le bootstrap failure caused by r243661 (PR 78817)

2016-12-21 Thread Jakub Jelinek
On Wed, Dec 21, 2016 at 03:13:29PM -0700, Jeff Law wrote:
> > > Also in pass_post_ipa_warn::execute, the BITMAP_FREE call is technically 
> > > in
> > > a correct position, but it might be more maintainable long term if the
> > > allocation/deallocation occur at the same nesting level.
> > 
> > The only case where it makes a difference is where the bitmap is NULL and
> > there is nothing to free (and that is the common case).
> Yes, I know.  It's more a style issue -- having the allocation/deallocation
> at the same scoping level is simply easier for humans to process and thus
> makes it much less likely for someone to muck it up later (particularly if
> that routine grows).

But in this case the allocation is just in a different routine, and is only
conditional, and the deallocation is conditional as well.

Jakub


Re: [PATCH] fix powerpc64le bootstrap failure caused by r243661 (PR 78817)

2016-12-21 Thread Jeff Law

On 12/21/2016 03:11 PM, Jakub Jelinek wrote:
> On Wed, Dec 21, 2016 at 02:47:49PM -0700, Jeff Law wrote:
> > It looks like you could avoid a lot of work in pass_post_ipa_warn::execute
> > by checking if warnings were asked for outside the main loop.  Presumably
> > you wrote this with the check inside the loop with the expectation that
> > other warnings might move into this routine, right?
>
> I had it in mind for the -Walloc-zero warning, yes.
> And the gate checks whether the warnings are requested:
> +  virtual bool gate (function *) { return warn_nonnull != 0; }
> If we add further warnings to this pass, they would be added to the main
> loop and gate.

Ah, missed that in the gate.

> > Also in pass_post_ipa_warn::execute, the BITMAP_FREE call is technically in
> > a correct position, but it might be more maintainable long term if the
> > allocation/deallocation occur at the same nesting level.
>
> The only case where it makes a difference is where the bitmap is NULL and
> there is nothing to free (and that is the common case).

Yes, I know.  It's more a style issue -- having the
allocation/deallocation at the same scoping level is simply easier for
humans to process and thus makes it much less likely for someone to muck
it up later (particularly if that routine grows).


Jeff
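
A minimal sketch of the style point discussed above, not code from the patch.  The names maybe_alloc_args, release_args and process_call are made up for illustration; they stand in for a helper that allocates only sometimes and a release routine that tolerates NULL.  The release sits at the same nesting level as the acquisition, even though the allocation itself is conditional.

```cpp
#include <cassert>
#include <cstdlib>

// Number of buffers currently allocated; only here so the pairing can be checked.
static int live_allocations = 0;

// Hypothetical helper: allocates only when needed and returns nullptr
// otherwise (the common case in the discussion above).
static int *maybe_alloc_args (bool needed)
{
  if (!needed)
    return nullptr;
  int *args = static_cast<int *> (std::calloc (4, sizeof (int)));
  assert (args != nullptr);
  ++live_allocations;
  return args;
}

// Hypothetical release routine that tolerates nullptr.
static void release_args (int *args)
{
  if (!args)
    return;
  std::free (args);
  --live_allocations;
}

// Acquire and release sit at the same nesting level, so the pairing stays
// obvious even if the body in between grows.
int process_call (bool has_attr, bool warn)
{
  int *args = maybe_alloc_args (has_attr);
  int diagnosed = 0;
  if (args && warn)
    {
      args[0] = 1;
      diagnosed = args[0];
    }
  release_args (args);
  return diagnosed;
}
```

Whether this is preferable when the allocation lives in a different routine is exactly the point under debate; the sketch only shows what "same nesting level" looks like.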


Re: [PATCH] fix powerpc64le bootstrap failure caused by r243661 (PR 78817)

2016-12-21 Thread Jakub Jelinek
On Wed, Dec 21, 2016 at 02:47:49PM -0700, Jeff Law wrote:
> It looks like you could avoid a lot of work in pass_post_ipa_warn::execute
> by checking if warnings were asked for outside the main loop.  Presumably
> you wrote this with the check inside the loop with the expectation that
> other warnings might move into this routine, right?

I had it in mind for the -Walloc-zero warning, yes.
And the gate checks whether the warnings are requested:
+  virtual bool gate (function *) { return warn_nonnull != 0; }
If we add further warnings to this pass, they would be added to the main
loop and gate.

> Also in pass_post_ipa_warn::execute, the BITMAP_FREE call is technically in
> a correct position, but it might be more maintainable long term if the
> allocation/deallocation occur at the same nesting level.

The only case where it makes a difference is where the bitmap is NULL and
there is nothing to free (and that is the common case).

> OK as-is or with the BITMAP_FREE call moved to the same scoping level as
> get_nonnull_args.

Thanks.

Jakub


Re: [PATCH] fix powerpc64le bootstrap failure caused by r243661 (PR 78817)

2016-12-21 Thread Jeff Law

On 12/16/2016 09:41 AM, Jakub Jelinek wrote:
> On Fri, Dec 16, 2016 at 11:08:00AM +0100, Jakub Jelinek wrote:
> > Here is an untested proof of concept for:
> > 1) keeping the warning in the FEs no matter what optimization level is on,
> >    just making sure TREE_NO_WARNING is set on the CALL_EXPR if we've warned
> > 2) moving the rest of the warning shortly post IPA, when we have performed
> >    inlining already and some constant propagation afterwards, but where
> >    hopefully the IL still isn't too much different from the original source
> > 3) as the nonnull attribute is a type property, it warns about the function
> >    type of the call, doesn't require a fndecl
> > The tree-ssa-ccp.c location is just randomly chosen, the pass could go
> > into its own file, or some other file.  And I think e.g. the -Walloc-zero
> > warning should move there as well.
> >
> > If you think warning later can be still useful to some users at the expense
> > of higher false positive rate, we could have -Wmaybe-nonnull warning that
> > would guard that and set the gimple no warning flag when we warn in the
> > pass.
> >
> > If needed, there is always the option on the table to turn
> > TREE_NO_WARNING/gimple_no_warning_p into a bit that says on the side hash
> > table contains bitmap of disabled warnings for the expression or statement.
> > IMHO we want to do that in any case, just not sure if it is urgent to do for
> > GCC 7.
>
> Now successfully bootstrapped/regtested on x86_64-linux and i686-linux, so I
> wrote ChangeLog for it as well:
>
> 2016-12-16  Jakub Jelinek
>
> PR bootstrap/78817
> * tree-pass.h (make_pass_post_ipa_warn): Declare.
> * builtins.c (validate_arglist): Adjust get_nonnull_args call.
> Check for NULL pointer argument to nonnull arg here.
> (validate_arg): Revert 2016-12-14 changes.
> * calls.h (get_nonnull_args): Remove declaration.
> * tree-ssa-ccp.c: Include diagnostic-core.h.
> (pass_data_post_ipa_warn): New variable.
> (pass_post_ipa_warn): New class.
> (pass_post_ipa_warn::execute): New method.
> (make_pass_post_ipa_warn): New function.
> * tree.h (get_nonnull_args): Declare.
> * tree.c (get_nonnull_args): New function.
> * calls.c (maybe_warn_null_arg): Removed.
> (maybe_warn_null_arg): Removed.
> (initialize_argument_information): Revert 2016-12-14 changes.
> * passes.def: Add pass_post_ipa_warn after first ccp after IPA.
> c-family/
> * c-common.c (struct nonnull_arg_ctx): New type.
> (check_function_nonnull): Return bool instead of void.  Use
> nonnull_arg_ctx as context rather than just location_t.
> (check_nonnull_arg): Adjust for the new context type, set
> warned_p to true if a warning has been diagnosed.
> (check_function_arguments): Return bool instead of void.
> * c-common.h (check_function_arguments): Adjust prototype.
> c/
> * c-typeck.c (build_function_call_vec): If check_function_arguments
> returns true, set TREE_NO_WARNING on CALL_EXPR.
> cp/
> * typeck.c (cp_build_function_call_vec): If check_function_arguments
> returns true, set TREE_NO_WARNING on CALL_EXPR.
> * call.c (build_over_call): Likewise.

So I spoke with Martin yesterday and have been convinced that we ought 
to go forward now rather than waiting for gcc-8.  Essentially the 
argument is that Jakub's patch is a significant improvement over where 
the warnings were in prior GCC releases, even if they don't go as far as 
Martin's work.


We can (and should) evaluate whether or not to push things further in gcc-8.


So with that in mind...

It looks like you could avoid a lot of work in 
pass_post_ipa_warn::execute by checking if warnings were asked for 
outside the main loop.  Presumably you wrote this with the check inside 
the loop with the expectation that other warnings might move into this 
routine, right?


Also in pass_post_ipa_warn::execute, the BITMAP_FREE call is technically 
in a correct position, but it might be more maintainable long term if 
the allocation/deallocation occur at the same nesting level.




OK as-is or with the BITMAP_FREE call moved to the same scoping level as 
get_nonnull_args.


Jeff



RE: [Patch ,gcc/MIPS] add an build-time/runtime option to disable madd.fmt

2016-12-21 Thread Matthew Fortune
Sandra Loosemore  writes:
> On 12/21/2016 11:54 AM, Yunqiang Su wrote:
> > By this patch, I add a build-time option ` --with-unfused-madd4=yes/no',
> > and runtime option -m(no-)unfused-madd4,
> > to disable generate madd.fmt instructions.
> 
> Your patch also needs a documentation change so that the new
> command-line option is listed in the GCC manual with other MIPS target
> options.

Any opinions on option names to control this? Is it best to target the specific
feature that is non-compliant on loongson or apply a general -mfix-loongson
type option?

I'm not sure I have a strong opinion either way but there do seem to be
multiple possible variants.

Thanks,
Matthew




Re: [Patch ,gcc/MIPS] add an build-time/runtime option to disable madd.fmt

2016-12-21 Thread Sandra Loosemore

On 12/21/2016 11:54 AM, Yunqiang Su wrote:
> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no’,
> and runtime option -m(no-)unfused-madd4,
> to disable generate madd.fmt instructions.

Your patch also needs a documentation change so that the new
command-line option is listed in the GCC manual with other MIPS target
options.


-Sandra




C++ PATCH for c++/42329, P0522 and other template template parm issues

2016-12-21 Thread Jason Merrill
The first patch fixes some issues that I noticed in implementing P0522
with my earlier auto non-type parameters work.

The second patch uses deduction to confirm that a partial
specialization is more specialized than the primary template, which
catches some bugs in the testsuite that the existing checks didn't.

The third patch fixes an issue that came up in the context of P0522
with explicit arguments and variadic templates: we were incorrectly
rejecting a template template-argument because of the trailing
expansion of pack elements yet to be deduced.  This is incorrect
because they can be deduced to an empty set, making the binding
well-formed.  This fix required a change to the libstdc++ testsuite,
as the test_property overloads become ambiguous.

The fourth patch fixes 42329, where we were failing to bind a template
template-parameter to a base class of the argument type.
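
A small sketch of the kind of binding the fourth patch description refers to.  This is an illustration written for this summary, not the testcase from the PR, and the names Base, Derived, get and read_derived are made up.  The argument type is not itself a template specialization, so the template template-parameter has to be deduced from one of its base classes.

```cpp
#include <cassert>

template <typename T> struct Base { T value; };

// Derived is not a specialization of any template, but it has
// Base<int> as a base class.
struct Derived : Base<int> {};

// TT is a template template-parameter.  For the call below it has to be
// deduced as Base (and T as int) by looking through Derived to its base.
template <template <typename> class TT, typename T>
T get (const TT<T> &x)
{
  return x.value;
}

int read_derived ()
{
  Derived d;
  d.value = 42;
  return get (d);   // deduces TT = Base, T = int
}
```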

The last patch implements paper P0522, which resolves DR150 to clarify
that default arguments do make a template suitable as an argument to a
template template-parameter based on a new rule that the
template-parameter must be more specialized (as newly defined) than
the argument template.  This is a defect report that will apply to all
standard levels, but since we're in stage3 I limited it by default to
C++17 for GCC 7; it can also be explicitly enabled with
-fnew-ttp-matching.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 8c2a8bda53668122de62e49086b38876cde747b6
Author: Jason Merrill 
Date:   Mon Dec 5 11:46:13 2016 -0500

Fixes for P0127R2 implementation.

* pt.c (convert_template_argument): Pass args to do_auto_deduction.
(mark_template_parm): Handle deducibility from type of non-type
argument here.
(for_each_template_parm_r): Not here.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 91178ea..9d9c35e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4467,6 +4467,15 @@ mark_template_parm (tree t, void* data)
   tpd->arg_uses_template_parms[tpd->current_arg] = 1;
 }
 
+  /* In C++17 the type of a non-type argument is a deduced context.  */
+  if (cxx_dialect >= cxx1z
+  && TREE_CODE (t) == TEMPLATE_PARM_INDEX)
+for_each_template_parm (TREE_TYPE (t),
+   &mark_template_parm,
+   data,
+   NULL,
+   /*include_nondeduced_p=*/false);
+
   /* Return zero so that for_each_template_parm will continue the
  traversal of the tree; we want to mark *every* template parm.  */
   return 0;
@@ -7328,14 +7337,16 @@ convert_template_argument (tree parm,
 }
   else
 {
-  tree t = tsubst (TREE_TYPE (parm), args, complain, in_decl);
+  tree t = TREE_TYPE (parm);
 
   if (tree a = type_uses_auto (t))
{
- t = do_auto_deduction (t, arg, a, complain, adc_unspecified);
+ t = do_auto_deduction (t, arg, a, complain, adc_unify, args);
  if (t == error_mark_node)
return error_mark_node;
}
+  else
+   t = tsubst (t, args, complain, in_decl);
 
   if (invalid_nontype_parm_type_p (t, complain))
return error_mark_node;
@@ -8956,12 +8967,6 @@ for_each_template_parm_r (tree *tp, int *walk_subtrees, 
void *d)
return t;
   else if (!fn)
return t;
-
-  /* In C++17 we can deduce a type argument from the type of a non-type
-argument.  */
-  if (cxx_dialect >= cxx1z
- && TREE_CODE (t) == TEMPLATE_PARM_INDEX)
-   WALK_SUBTREE (TREE_TYPE (t));
   break;
 
 case TEMPLATE_DECL:
commit e41c94be9f4303fd6714ffd594d1864a3b2255a3
Author: Jason Merrill 
Date:   Wed Dec 21 14:17:56 2016 -0500

Check that a partial specialization is more specialized.

* pt.c (process_partial_specialization): Use
get_partial_spec_bindings to check that the partial specialization
is more specialized than the primary template.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 9d9c35e..8abbcfb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4606,9 +4606,20 @@ process_partial_specialization (tree decl)
 "primary template because it replaces multiple parameters "
 "with a pack expansion");
   inform (DECL_SOURCE_LOCATION (maintmpl), "primary template here");
+  /* Avoid crash in process_partial_specialization.  */
   return decl;
 }
 
+  /* If we aren't in a dependent class, we can actually try deduction.  */
+  else if (tpd.level == 1
+  && !get_partial_spec_bindings (maintmpl, maintmpl, specargs))
+{
+  if (permerror (input_location, "partial specialization %qD is not "
+"more specialized than", decl))
+   inform (DECL_SOURCE_LOCATION (maintmpl), "primary template %qD",
+   maintmpl);
+}
+
   /* [temp.class.spec]
 
  A partially specialized non-type argument expression shall not
diff --git 

[C++ PATCH] c++/61636 generic lambdas and this capture

2016-12-21 Thread Nathan Sidwell
This patch addresses bug 61636, which is an ICE during generic lambda 
instantiation due to unexpected this capture.


The problem is that a generic lambda's closure type is not a template, 
only the function operator is.  Thus this capture has to be determined 
at parsing time (and anyway, it'd be a weird instantiation if this 
capture was determined so late).


However, in a call with dependent arguments, overload resolution is 
deferred until instantiation time.  The patch simply adds a call to 
maybe_resolve_dummy when building the call expression, where lookup 
found a set of member functions.  I was wrong about Koenig lookup 
happening later, because:

/* Do not do argument dependent lookup if regular
   lookup finds a member function or a block-scope
   function declaration.  [basic.lookup.argdep]/3  */

However, the std doesn't discuss this case, nor say how hard the 
compiler has to go to determine whether this capture might be needed. 
For instance, it could go as far as checking that there exists at least 
one non-static member function with an acceptable number of arguments. 
I don't do that here -- perhaps a DR is needed? From Adam's data this 
is at least as good as Clang.


In template instantiation, we usually check in a dependent base to give 
good diagnostics and try and recover.  But that's another path to the 
same ICE and it seemed to me better to just give an error for this case, 
as there can't be legacy code using it.
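
A sketch of the situation described above, written for illustration rather than taken from the new tests; Counter, add and run are made-up names.  The call inside the generic lambda has a dependent argument, so the overload is only chosen when the call operator is instantiated, yet the closure type needs to know about the this capture when the lambda is parsed.

```cpp
#include <cassert>

struct Counter
{
  int base = 10;

  // Non-static overload: calling it needs `this`.
  int add (int x) { return base + x; }

  // Static overload: calling it does not.
  static int add (const char *) { return -1; }

  int run ()
  {
    // Generic lambda.  The argument of add (x) is dependent, so overload
    // resolution waits for instantiation of operator().  Lookup at parse
    // time already finds member functions, which is the point where
    // `this` can be captured speculatively.
    auto l = [&] (auto x) { return add (x); };
    return l (5);   // picks add (int) and uses the captured `this`
  }
};
```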


ok?

nathan
--
Nathan Sidwell
2016-12-21  Nathan Sidwell  

	cp/
	PR c++/61636
	* parser.c (cp_parser_postfix_expression): Speculatively capture
	this in generic lambda in unresolved member function call.
	* pt.c (tsubst_copy_and_build): Force hard error from failed
	member function lookup in generic lambda.


2016-12-21  Nathan Sidwell  
	Adam Butcher  
	testsuite/

	PR c++/61636
	* g++.dg/cpp1y/pr61636-1.C: New.
	* g++.dg/cpp1y/pr61636-2.C: New.
	* g++.dg/cpp1y/pr61636-3.C: New.

Index: cp/parser.c
===
--- cp/parser.c	(revision 243804)
+++ cp/parser.c	(working copy)
@@ -6971,6 +6971,14 @@ cp_parser_postfix_expression (cp_parser
 			|| type_dependent_expression_p (fn)
 			|| any_type_dependent_arguments_p (args)))
 		  {
+		/* In a generic lambda we now speculatively
+		   capture this, because the eventual call might
+		   need it, and it'll be too late then.  The std
+		   doesn't specify exactly how hard we have to
+		   work to figure out whether we need to capture
+		   in this case.  */
+		maybe_resolve_dummy (instance, true);
+
 		postfix_expression
 		  = build_nt_call_vec (postfix_expression, args);
 		release_tree_vector (args);
Index: cp/pt.c
===
--- cp/pt.c	(revision 243804)
+++ cp/pt.c	(working copy)
@@ -16896,19 +16896,34 @@ tsubst_copy_and_build (tree t,
 
 		if (unq != function)
 		  {
-		tree fn = unq;
-		if (INDIRECT_REF_P (fn))
-		  fn = TREE_OPERAND (fn, 0);
-		if (TREE_CODE (fn) == COMPONENT_REF)
-		  fn = TREE_OPERAND (fn, 1);
-		if (is_overloaded_fn (fn))
-		  fn = get_first_fn (fn);
-		if (permerror (EXPR_LOC_OR_LOC (t, input_location),
-   "%qD was not declared in this scope, "
-   "and no declarations were found by "
-   "argument-dependent lookup at the point "
-   "of instantiation", function))
+		/* In a lambda fn, we have to be careful to not
+		   introduce new this captures.  Legacy code can't
+		   be using lambdas anyway, so it's ok to be
+		   stricter.  */
+		bool in_lambda = (current_class_type
+  && LAMBDA_TYPE_P (current_class_type));
+		char const *msg = "%qD was not declared in this scope, "
+		  "and no declarations were found by "
+		  "argument-dependent lookup at the point "
+		  "of instantiation";
+
+		bool diag = true;
+		if (in_lambda)
+		  error_at (EXPR_LOC_OR_LOC (t, input_location),
+msg, function);
+		else
+		  diag = permerror (EXPR_LOC_OR_LOC (t, input_location),
+	msg, function);
+		if (diag)
 		  {
+			tree fn = unq;
+			if (INDIRECT_REF_P (fn))
+			  fn = TREE_OPERAND (fn, 0);
+			if (TREE_CODE (fn) == COMPONENT_REF)
+			  fn = TREE_OPERAND (fn, 1);
+			if (is_overloaded_fn (fn))
+			  fn = get_first_fn (fn);
+
 			if (!DECL_P (fn))
 			  /* Can't say anything more.  */;
 			else if (DECL_CLASS_SCOPE_P (fn))
@@ -16931,7 +16946,13 @@ tsubst_copy_and_build (tree t,
 			  inform (DECL_SOURCE_LOCATION (fn),
   "%qD declared here, later in the "
   "translation unit", fn);
+			if (in_lambda)
+			  {
+			release_tree_vector (call_args);
+			RETURN (error_mark_node);
+			  }
 		  }
+
 		function = unq;
 		  }
 	  }
Index: testsuite/g++.dg/cpp1y/pr61636-1.C

Re: Pointer Bounds Checker and trailing arrays (PR68270)

2016-12-21 Thread Alexander Ivchenko
Right. Here is the updated chunk (the rest of the patch is unchanged):

diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index 2769682..6c7862c 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -3272,6 +3272,9 @@ chkp_may_narrow_to_field (tree field)
 {
   return DECL_SIZE (field) && TREE_CODE (DECL_SIZE (field)) == INTEGER_CST
 && tree_to_uhwi (DECL_SIZE (field)) != 0
+&& !(flag_chkp_flexible_struct_trailing_arrays
+ && TREE_CODE(TREE_TYPE(field)) == ARRAY_TYPE
+ && !DECL_CHAIN (field))
 && (!DECL_FIELD_OFFSET (field)
  || TREE_CODE (DECL_FIELD_OFFSET (field)) == INTEGER_CST)
 && (!DECL_FIELD_BIT_OFFSET (field)
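
A toy model of the decision this chunk adds, as a hedged sketch only.  FieldInfo and may_narrow_to_field are hypothetical names, not GCC's data structures, and the offset checks from the real function are left out.  The idea is that with the new option enabled, an array that is the last field of its struct is never narrowed to, because it may be used as a flexible trailing array.

```cpp
#include <cassert>

// Hypothetical, simplified description of a struct field.
struct FieldInfo
{
  bool has_constant_size;   // size is a compile-time constant
  unsigned long size_bits;  // size in bits when it is constant
  bool is_array;            // field has array type
  bool is_last_field;       // no field follows it in the struct
};

// Returns true if bounds may be narrowed to this field.
bool may_narrow_to_field (const FieldInfo &f, bool flexible_trailing_arrays)
{
  // Fields without a known, nonzero size are never narrowed to.
  if (!f.has_constant_size || f.size_bits == 0)
    return false;
  // With the option on, a trailing array is treated as flexible.
  if (flexible_trailing_arrays && f.is_array && f.is_last_field)
    return false;
  return true;
}
```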

2016-12-21 21:00 GMT+03:00 Ilya Enkovich :
> 2016-12-20 17:44 GMT+03:00 Alexander Ivchenko :
>> 2016-11-26 0:28 GMT+03:00 Ilya Enkovich :
>>> 2016-11-25 15:47 GMT+03:00 Alexander Ivchenko :
>>>> Hi,
>>>>
>>>> The patch below addresses PR68270. could you please take a look?
>>>>
>>>> 2016-11-25  Alexander Ivchenko
>>>>
>>>> * c-family/c.opt (flag_chkp_flexible_struct_trailing_arrays):
>>>> Add new option.
>>>> * tree-chkp.c (chkp_parse_array_and_component_ref): Forbid
>>>> narrowing when chkp_parse_array_and_component_ref is used and
>>>> the ARRAY_REF points to an array in the end of the struct.
>>>>
>>>>
>>>>
>>>> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
>>>> index 7d8a726..e45d6a2 100644
>>>> --- a/gcc/c-family/c.opt
>>>> +++ b/gcc/c-family/c.opt
>>>> @@ -1166,6 +1166,11 @@ C ObjC C++ ObjC++ LTO RejectNegative Report
>>>> Var(flag_chkp_narrow_to_innermost_ar
>>>>  Forces Pointer Bounds Checker to use bounds of the innermost arrays in 
>>>> case of
>>>>  nested static arryas access.  By default outermost array is used.
>>>>
>>>> +fchkp-flexible-struct-trailing-arrays
>>>> +C ObjC C++ ObjC++ LTO RejectNegative Report
>>>> Var(flag_chkp_flexible_struct_trailing_arrays)
>>>> +Allow Pointer Bounds Checker to treat all trailing arrays in structures as
>>>> +possibly flexible.
>>>
>>> Words 'allow' and 'possibly' are confusing here. This option is about to 
>>> force
>>> checker to do something, not to give him a choice.
>>
>> Fixed
>>
>>> New option has to be documented in invoke.texi. It would also be nice to 
>>> reflect
>>> changes on GCC MPX wiki page.
>>
>> Done
>>> We have an attribute to change compiler behavior when this option is not 
>>> set.
>>> But we have no way to make exceptions when this option is used. Should we
>>> add one?
>> Something like "bnd_fixed_size" ? Could work. Although the customer
>> request did not mention the need for that.
>> Can I add it in a separate patch?
>>
>
> Yes.
>
>>
>>>> +
>>>>  fchkp-optimize
>>>>  C ObjC C++ ObjC++ LTO Report Var(flag_chkp_optimize) Init(-1)
>>>>  Allow Pointer Bounds Checker optimizations.  By default allowed
>>>> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
>>>> index 2769682..40f99c3 100644
>>>> --- a/gcc/tree-chkp.c
>>>> +++ b/gcc/tree-chkp.c
>>>> @@ -3425,7 +3425,9 @@ chkp_parse_array_and_component_ref (tree node, tree 
>>>> *ptr,
>>>> if (flag_chkp_narrow_bounds
>>>> && !flag_chkp_narrow_to_innermost_arrray
>>>> && (!last_comp
>>>> -  || chkp_may_narrow_to_field (TREE_OPERAND (last_comp, 1
>>>> +  || (chkp_may_narrow_to_field (TREE_OPERAND (last_comp, 1))
>>>> +  && !(flag_chkp_flexible_struct_trailing_arrays
>>>> +   && array_at_struct_end_p (var)
>>>
>>> This is incorrect place for fix. Consider code
>>>
>>> struct S {
>>>   int a;
>>>   int b[10];
>>> };
>>>
>>> struct S s;
>>> int *p = s.b;
>>>
>>> Here you need to compute bounds for p and you want your option to take 
>>> effect
>>> but in this case you won't event reach your new check because there is no
>>> ARRAY_REF. And even if we change it to
>>>
>>> int *p = s.b[5];
>>>
>>> then it still would be narrowed because s.b would still be written
>>> into 'comp_to_narrow'
>>> variable. Correct place for fix is in chkp_may_narrow_to_field.
>>
>> Done
>>
>>> Also you should consider fchkp-narrow-to-innermost-arrray option. Should it
>>> be more powerfull or not? I think fchkp-narrow-to-innermost-arrray shouldn't
>>> narrow to variable sized fields. BTW looks like right now bnd_variable_size
>>> attribute is ignored by fchkp-narrow-to-innermost-arrray. This is another
>>> problem and may be fixed in another patch though.
>> The way code works in chkp_parse_array_and_component_ref seems to be
>> exactly like you say:  fchkp-narrow-to-innermost-arrray won't narrow
>> to variable sized fields. I will create a separate bug for
>> bnd_variable_size+ fchkp-narrow-to-innermost-arrray.
>>
>>> Also patch lacks tests for various situations (with option and without, with
>>> ARRAY_REF and without). In case of new attribute and fix for
>>> fchkp-narrow-to-innermost-arrray behavior additional tests are required.
>> I added three tests for flag_chkp_flexible_struct_trailing_arrays
>>
>>
>>

C++ PATCH for c++/78767, ICE with inherited ctor default argument

2016-12-21 Thread Jason Merrill
Calling strip_inheriting_ctors on a FUNCTION_DECL was returning a
TEMPLATE_DECL; we should make sure we return the same tree code that
we started with.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit f7ef6e8372fcb279349576f769fac34db70c87bf
Author: Jason Merrill 
Date:   Tue Dec 20 07:45:02 2016 -0500

PR c++/78767 - ICE with inherited constructor default argument

* method.c (strip_inheriting_ctors): Strip template as appropriate.

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 73d42b1..a5271a4 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -496,14 +496,18 @@ forward_parm (tree parm)
constructor from a (possibly indirect) base class.  */
 
 tree
-strip_inheriting_ctors (tree fn)
+strip_inheriting_ctors (tree dfn)
 {
   gcc_assert (flag_new_inheriting_ctors);
+  tree fn = dfn;
   while (tree inh = DECL_INHERITED_CTOR (fn))
 {
   inh = OVL_CURRENT (inh);
   fn = inh;
 }
+  if (TREE_CODE (fn) == TEMPLATE_DECL
+  && TREE_CODE (dfn) == FUNCTION_DECL)
+fn = DECL_TEMPLATE_RESULT (fn);
   return fn;
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/inh-ctor24.C 
b/gcc/testsuite/g++.dg/cpp0x/inh-ctor24.C
new file mode 100644
index 000..7c1fae0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/inh-ctor24.C
@@ -0,0 +1,15 @@
+// PR c++/78767
+// { dg-do compile { target c++11 } }
+
+template <class T> struct A
+{
+  template <class U>
+  A(U, U = 42);
+};
+
+struct B: A<int>
+{
+  using A::A;
+};
+
+B b(24);


C++ PATCH to ptree.c

2016-12-21 Thread Jason Merrill
A couple of things I was missing from debug_tree output.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 1742299fdad0ac883636af88ea364eec5fec2687
Author: Jason Merrill 
Date:   Thu Dec 15 10:01:05 2016 -0500

Improve C++ debug_tree.

* ptree.c (cxx_print_type): Print args of
BOUND_TEMPLATE_TEMPLATE_PARM.
(cxx_print_decl): Print DECL_TEMPLATE_PARMS.

diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index e3e5e33..67a4ee8 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -50,6 +50,7 @@ cxx_print_decl (FILE *file, tree node, int indent)
 }
   else if (TREE_CODE (node) == TEMPLATE_DECL)
 {
+  print_node (file, "parms", DECL_TEMPLATE_PARMS (node), indent + 4);
   indent_to (file, indent + 3);
   fprintf (file, " full-name \"%s\"",
   decl_as_string (node, TFF_TEMPLATE_HEADER));
@@ -73,9 +74,12 @@ cxx_print_type (FILE *file, tree node, int indent)
 {
   switch (TREE_CODE (node))
 {
+case BOUND_TEMPLATE_TEMPLATE_PARM:
+  print_node (file, "args", TYPE_TI_ARGS (node), indent + 4);
+  gcc_fallthrough ();
+
 case TEMPLATE_TYPE_PARM:
 case TEMPLATE_TEMPLATE_PARM:
-case BOUND_TEMPLATE_TEMPLATE_PARM:
   indent_to (file, indent + 3);
   fprintf (file, "index %d level %d orig_level %d",
   TEMPLATE_TYPE_IDX (node), TEMPLATE_TYPE_LEVEL (node),


[Patch ,gcc/MIPS] add an build-time/runtime option to disable madd.fmt

2016-12-21 Thread Yunqiang Su
This patch adds a build-time option, --with-unfused-madd4=yes/no, and a
runtime option, -m(no-)unfused-madd4, to disable generation of madd.fmt
instructions.

These two options are needed because madd.fmt/msub.fmt are broken on Loongson
and may produce wrong calculation results.
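
The message does not spell out how the results go wrong.  Assuming the concern is a multiply-add that rounds only once (fused) where the unfused form with two roundings is expected, the following sketch shows that the two can give different answers for the same inputs.  The function names are made up for illustration, and std::fma stands in for a fused instruction; nothing here is MIPS-specific.

```cpp
#include <cassert>
#include <cmath>

// Unfused: the product is rounded to double before the addition.
// The volatile temporary keeps the compiler from contracting the two
// operations into a single fused instruction.
double unfused_madd (double a, double b, double c)
{
  volatile double product = a * b;
  return product + c;
}

// Fused: a * b + c is computed exactly and rounded once at the end.
double fused_madd (double a, double b, double c)
{
  return std::fma (a, b, c);
}

// With a = 1 + 2^-27 and b = 1 - 2^-27 the exact product is 1 - 2^-54,
// which rounds to 1.0 as a double.  Adding c = -1 then gives 0 in the
// unfused form, while the fused form keeps the -2^-54.
double madd_difference ()
{
  const double eps = std::ldexp (1.0, -27);
  const double a = 1.0 + eps;
  const double b = 1.0 - eps;
  const double c = -1.0;
  return unfused_madd (a, b, c) - fused_madd (a, b, c);
}
```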

diff --git a/src/gcc/config.gcc b/src/gcc/config.gcc
index 1b7da0e..3a30b44 100644
--- a/src/gcc/config.gcc
+++ b/src/gcc/config.gcc
@@ -3991,7 +3991,7 @@ case "${target}" in
;;
 
mips*-*-*)
-   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci"
+   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 unfused_madd4 tune tune_32 tune_64 divide llsc mips-plt 
synci"
 
case ${with_float} in
"" | soft | hard)
@@ -4048,6 +4048,19 @@ case "${target}" in
exit 1
;;
esac
+   
+   case ${with_unfused_madd4} in
+   "" | yes)
+   with_unfused_madd4="unfused-madd4"
+   ;;
+   no)
+   with_unfused_madd4="no-unfused-madd4"
+   ;;
+   *)
+   echo "Unknown unfused_madd4 type used in 
--with-unfused-madd4=$with_unfused_madd4" 1>&2
+   exit 1
+   ;;
+   esac
 
case ${with_abi} in
"" | 32 | o64 | n32 | 64 | eabi)
@@ -4547,7 +4560,7 @@ case ${target} in
 esac
 
 t=
-all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 
schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls"
+all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 
schedule float mode fpu nan fp_32 odd_spreg_32 unfused_madd4 divide llsc 
mips-plt synci tls"
 for option in $all_defaults
 do
eval "val=\$with_"`echo $option | sed s/-/_/g`
diff --git a/src/gcc/config/mips/mips.c b/src/gcc/config/mips/mips.c
index 5af3d1e..236ab94 100644
--- a/src/gcc/config/mips/mips.c
+++ b/src/gcc/config/mips/mips.c
@@ -17992,6 +17992,16 @@ mips_option_override (void)
  will be produced.  */
   target_flags |= MASK_ODD_SPREG;
 }
+  
+  /* If neither -munfused-madd4 nor -mno-unfused-madd4 was given on the command
+ line, set MASK_UNFUSED_MADD4 based on the ISA.  */
+  if ((target_flags_explicit & MASK_UNFUSED_MADD4) == 0)
+{
+  if (!ISA_HAS_UNFUSED_MADD4)
+   target_flags &= ~MASK_UNFUSED_MADD4;
+  else
+   target_flags |= MASK_UNFUSED_MADD4;
+}
 
   if (!ISA_HAS_COMPACT_BRANCHES && mips_cb == MIPS_CB_ALWAYS)
 {
diff --git a/src/gcc/config/mips/mips.h b/src/gcc/config/mips/mips.h
index 763ca58..8c7d24e 100644
--- a/src/gcc/config/mips/mips.h
+++ b/src/gcc/config/mips/mips.h
@@ -892,6 +892,7 @@ struct mips_cpu_info {
":%{!msoft-float:%{!msingle-float:%{!mfp*:-mfp%(VALUE)" }, \
   {"odd_spreg_32", "%{" OPT_ARCH32 ":%{!msoft-float:%{!msingle-float:" \
   "%{!modd-spreg:%{!mno-odd-spreg:-m%(VALUE)}" }, \
+  {"unfused_madd4", "%{!munfused-madd4:%{!mno-unfused-madd4:-m%(VALUE)}}" }, \
   {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}" }, \
   {"llsc", "%{!mllsc:%{!mno-llsc:-m%(VALUE)}}" }, \
   {"mips-plt", "%{!mplt:%{!mno-plt:-m%(VALUE)}}" }, \
@@ -1089,7 +1090,7 @@ struct mips_cpu_info {
 
 /* ISA has 4 operand unfused madd instructions of the form
'd = [+-] (a * b [+-] c)'.  */
-#define ISA_HAS_UNFUSED_MADD4  (ISA_HAS_FP4 && !TARGET_MIPS8000)
+#define ISA_HAS_UNFUSED_MADD4  (ISA_HAS_FP4 && !TARGET_MIPS8000 && 
TARGET_UNFUSED_MADD4)
 
 /* ISA has 3 operand r6 fused madd instructions of the form
'c = c [+-] (a * b)'.  */
diff --git a/src/gcc/config/mips/mips.opt b/src/gcc/config/mips/mips.opt
index ebd67e4..a8c23f6 100644
--- a/src/gcc/config/mips/mips.opt
+++ b/src/gcc/config/mips/mips.opt
@@ -412,6 +412,10 @@ modd-spreg
 Target Report Mask(ODD_SPREG)
 Enable use of odd-numbered single-precision registers.
 
+munfused-madd4
+Target Report Mask(UNFUSED_MADD4)
+Enable unfused multiply-add/multiply-sub instruction, aka madd.fmt/msub.fmt.
+
 mframe-header-opt
 Target Report Var(flag_frame_header_optimization) Optimization
 Optimize frame header.



Re: [C++ PATCH] Reject out of bounds constexpr stores (PR c++/77830)

2016-12-21 Thread Jason Merrill

OK.


Re: [PATCH v4] add -fprolog-pad=N,M option

2016-12-21 Thread Sandra Loosemore

On 12/21/2016 10:23 AM, Torsten Duwe wrote:
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1f303bc..a09851a 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -3076,6 +3076,17 @@ that affect more than one function.
>   This attribute should be used for debugging purposes only.  It is not
>   suitable in production code.
>
> +@item prolog_pad
> +@cindex @code{prolog_pad} function attribute
> +@cindex function entry padded with NOPs
> +In case the target's text segment can be made writable at run time
> +by any means, padding the function entry with a number of NOPs can
> +be used to provide a universal tool for instrumentation.  Usually,
> +prolog padding is enabled globally using the -fprolog-pad= command
> +line switch, and disabled by the attribute keyword for functions

@option{-fprolog-pad=} command-line switch

> +that are part of the actual instrumentation framework, to easily avoid
> +an endless recursion.

I'm confused.  Does the prolog_pad attribute *disable* the padding,
then?  You should say that explicitly.

>   @item pure
>   @cindex @code{pure} function attribute
>   @cindex functions that have no side effects
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b729964..21e5067 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11309,6 +11309,21 @@ of the function name, it is considered to be a match.  
> For C99 and C++
>   extended identifiers, the function name must be given in UTF-8, not
>   using universal character names.
>
> +@item -fprolog-pad=N

@var{N}

> +@opindex fprolog-pad
> +Generate a pad of N nops right at the beginning

@var{N} again.

Can we be consistent about whether it's "NOP" or "nop"?  You used "NOP"
in the doc snippet above, and that seems to be the preferred usage.

> +of each function, which can be used to patch in any desired

What if the target supports NOPs of different sizes?  (E.g., nios2 with
-march=r2 -mcdx).

Where, exactly, is the padding inserted?  "At the beginning of each
function" might correspond to the address of the first instruction of
the function, or it might correspond to the address of some instruction
after the prolog where GDB would set a breakpoint on function entry.
This feature isn't usable without more explicit documentation about what
it does.

> +instrumentation at run time, provided that the code segment
> +is writeable.  For run time identification, the starting addresses

s/writeable/writable/
s/run time identification/run-time identification/

> +of these pads, which correspond to their respective functions,
> +are additionally collected in the @code{__prolog_pads_loc} section
> +of the resulting binary.
> +
> +Note that value of @code{__attribute__ ((prolog_pad (N)))} takes

You didn't explain this syntax in the documentation of the attribute
above...

> +precedence over command-line option -fprolog_pad=N.  This can be used

@option and @var markup on the option, please.  And if N here is
different than N above, don't re-use the same metasyntactic variable for it.

> +to increase the pad size or to remove the pad completely on a single
> +function.
> +
>   @end table
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index cdf5f48..65c265c 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -4564,6 +4564,10 @@ will select the smallest suitable mode.
>   This section describes the macros that output function entry
>   (@dfn{prologue}) and exit (@dfn{epilogue}) code.
>
> +@deftypefn {Target Hook} void TARGET_ASM_PRINT_PROLOG_PAD (FILE *@var{file}, 
> unsigned HOST_WIDE_INT @var{pad_size}, bool @var{record_p})
> +Generate prologue pad

Needs more extensive documentation, and sentences should end in a period.

-Sandra



Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)

2016-12-21 Thread Aldy Hernandez

On 12/20/2016 09:16 AM, Richard Biener wrote:


You do not handle memory or calls conservatively which means the existing
testcase only needs some obfuscation to become a problem again.  To fix
that before /* Check that any SSA names used to define NAME is also fully
defined.  */ bail out conservatively, like

  if (! is_gimple_assign (def)
      || gimple_assign_single_p (def))
    return true;


I understand the !is_gimple_assign, which will skip over GIMPLE_CALLs 
setting a value, but the "gimple_assign_single_p(def) == true"??


Won't this last one bail on things like e.3_7 = e, or x_77 = y_88? 
Don't we want to follow up the def chain precisely on these?


Thanks.
Aldy


C++ PATCH for c++/78749, friend in anonymous namespace

2016-12-21 Thread Jason Merrill
We shouldn't complain about friend temploids that we just haven't
instantiated yet.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit a5742e86f3f8e1a2877b1dcf49403aaa8e7d0b86
Author: Jason Merrill 
Date:   Mon Dec 19 16:22:25 2016 -0500

PR c++/78749 - friend in anonymous namespace

* decl.c (wrapup_globals_for_namespace): Don't complain about friend
pseudo-template instantiations.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index e83b542..2954160 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -926,6 +926,7 @@ wrapup_globals_for_namespace (tree name_space, void* data 
ATTRIBUTE_UNUSED)
&& DECL_EXTERNAL (decl)
&& !TREE_PUBLIC (decl)
&& !DECL_ARTIFICIAL (decl)
+   && !DECL_FRIEND_PSEUDO_TEMPLATE_INSTANTIATION (decl)
&& !TREE_NO_WARNING (decl))
  {
warning_at (DECL_SOURCE_LOCATION (decl),
diff --git a/gcc/testsuite/g++.dg/warn/Wunused-function3.C 
b/gcc/testsuite/g++.dg/warn/Wunused-function3.C
new file mode 100644
index 000..94c9025
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wunused-function3.C
@@ -0,0 +1,11 @@
+// { dg-options -Wunused-function }
+
+namespace
+{
+  template  struct A
+  {
+friend void f(A) { }
+  };
+
+  A a;
+}


Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE

2016-12-21 Thread Jeff Law

On 12/16/2016 06:57 AM, Richard Biener wrote:


Apart from what Trevor says about using sbitmaps (try to avoid the initial
zeroing please) and the missed freeing (you can use auto_[s]bitmap?)
some comments below.

New version uses sbitmaps and avoids zero-ing when we can.


+static void
+trim_complex_store (bitmap orig, bitmap live, gimple *stmt)
+{
+  bitmap dead = BITMAP_ALLOC (NULL);
+  bitmap_and_compl (dead, orig, live);
+
+  /* So we have to verify that either the real or imag part as a whole
+ is dead.  They are always the same size.  Thus for one to be dead
+ the number of live bytes would have to be the same as the number of
+ dead bytes.  */
+  if (bitmap_count_bits (live) == bitmap_count_bits (dead))


popcount is expensive, so is this really a short-cut?

Note this can all be done more efficiently.  Using sbitmaps forces us to
normalize the offset/size and I've dropped the original bitmap as the
ao_ref carries the data we need.  Those two changes make other
simplifications fall out, and what we get in compute_trims is a
last_set_bit and first_set_bit on the live bitmap -- no other bitmap
traversals/searches/counts.


That makes compute_trims more efficient than the code in 
trim_complex_store.  So I've made trim_complex_store re-use the more 
general trimming code, and it's still agnostic of the underlying sizes, 
it just cares that the underlying components are the same size.
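
To make the first_set_bit/last_set_bit idea concrete, here is a minimal sketch in plain C.  It is not the GCC code: it assumes the store covers at most 64 bytes and models the live bytes as a uint64_t mask, and the names live_mask_trims and half_is_dead are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC's compute_trims: given a store of SIZE
   bytes (1..64) and a mask whose bit I is set when byte I is still live,
   compute how many dead bytes can be dropped from the head and tail.  */
static void
live_mask_trims (uint64_t live, unsigned size,
                 unsigned *head_trim, unsigned *tail_trim)
{
  assert (size >= 1 && size <= 64);

  /* Ignore any bits beyond the end of the store.  */
  if (size < 64)
    live &= (UINT64_C (1) << size) - 1;

  if (live == 0)
    {
      /* Nothing is live: the whole store is dead.  */
      *head_trim = size;
      *tail_trim = 0;
      return;
    }

  /* First set bit: index of the lowest live byte.  */
  unsigned first = 0;
  while (!(live & (UINT64_C (1) << first)))
    first++;

  /* Last set bit: index of the highest live byte.  */
  unsigned last = size - 1;
  while (!(live & (UINT64_C (1) << last)))
    last--;

  *head_trim = first;
  *tail_trim = size - 1 - last;
}

/* For a complex value the real and imaginary halves have the same size,
   so one half is entirely dead when a trim covers at least half of the
   store.  The helper only needs the total size, not the component type.  */
static int
half_is_dead (unsigned size, unsigned head_trim, unsigned tail_trim)
{
  return head_trim >= size / 2 || tail_trim >= size / 2;
}
```

With a 12-byte store where only bytes 4..7 are live, the sketch reports a head trim of 4 and a tail trim of 4, using two scans of the mask and no population count.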




The actual transform in dse_possible_dead_store_p looks a bit misplaced.
I see it's somehow convenient but then maybe returning an enum from this
function might be cleaner.  Well, I'm not too torn about this, so maybe
just rename the function a bit (no good suggestion either).

The rest of the patch (the infrastructure) looks reasonable.

I think I mentioned it earlier, but moving towards a single allocated
bitmap for the entire pass makes it much easier to have
dse_possible_dead_store_p return a tri-state and live information.
You'll see that change in the upcoming revision to the patchkit.


Jeff




Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE

2016-12-21 Thread Jeff Law

On 12/16/2016 12:29 AM, Trevor Saunders wrote:

On Thu, Dec 15, 2016 at 06:54:43PM -0700, Jeff Law wrote:

   unsigned cnt = 0;
+  bitmap live_bytes = NULL;
+  bitmap orig_live_bytes = NULL;

   *use_stmt = NULL;

+  /* REF is a memory write.  Go ahead and get its base, size, extent
+ information and encode the bytes written into LIVE_BYTES.  We can
+ handle any case where we have a known base and maximum size.
+
+ However, experimentation has shown that bit level tracking is not
+ useful in practice, so we only track at the byte level.
+
+ Furthermore, experimentation has shown that 99% of the cases
+ that require byte tracking are 64 bytes or less.  */
+  if (valid_ao_ref_for_dse (ref)
+  && (ref->max_size / BITS_PER_UNIT
+ <= PARAM_VALUE (PARAM_DSE_MAX_OBJECT_SIZE)))
+{
+  live_bytes = BITMAP_ALLOC (NULL);
+  orig_live_bytes = BITMAP_ALLOC (NULL);
+  bitmap_set_range (live_bytes,
+   ref->offset / BITS_PER_UNIT,
+   ref->max_size / BITS_PER_UNIT);
+  bitmap_copy (orig_live_bytes, live_bytes);


would it maybe make sense to use sbitmaps since the length is known to
be short, and doesn't change after allocation?

New version will use auto_sbitmap and sbitmaps.

Jeff


Re: [RFC] [P2] [PR tree-optimization/33562] Lowering more complex assignments.

2016-12-21 Thread Jeff Law

On 02/18/2016 02:56 AM, Richard Biener wrote:

On Wed, Feb 17, 2016 at 5:10 PM, Jeff Law  wrote:

On 02/17/2016 07:13 AM, Richard Biener wrote:


-  /* Continue walking until we reach a kill.  */
-  while (!stmt_kills_ref_p (temp, ref));
+  /* Continue walking until we reach a full kill as a single statement
+ or there are no more live bytes.  */
+  while (!stmt_kills_ref_p (temp, ref)
+&& !(live_bytes && bitmap_empty_p (live_bytes)));



Just a short quick comment - the above means you only handle partial
stores with no intervening uses.  You don't handle, say

struct S { struct R { int x; int y; } r; int z; } s;

  s = { {1, 2}, 3 };
  s.r.x = 1;
  s.r.y = 2;
  struct R r = s.r;
  s.z = 3;

where s = { {1, 2}, 3} is still dead.


Right.  But handling that has never been part of DSE's design goals. Once
there's a use, DSE has always given up.


Yeah, which is why I in the end said we need a "better" DSE ...

And coming back to this -- these kinds of opportunities appear to be
rare.  I found a couple in a GCC build and some in the libstdc++ testsuite.


From looking at how your test is currently handled, the combination of 
SRA and early FRE tend to clean things up before DSE gets a chance. 
That may account for the lack of "hits" for this improvement.


Regardless, the code is written (#4 in the recently posted series).  I'm 
going to add your test to the updated patch (with SRA and FRE disabled 
obviously) as an additional test given there's very little coverage of 
this feature outside the libstdc++ testsuite.
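
For illustration, the idea of ignoring reads of dead bytes can be modelled outside the compiler.  The following is a sketch under stated assumptions, not the tree-ssa-dse.c code: it handles straight-line code only, limits objects to 64 bytes, assumes a 4-byte int when mapping the struct S example to byte offsets, and uses invented names (struct access, store_is_dead).

```c
#include <assert.h>
#include <stdint.h>

enum access_kind { ACCESS_STORE, ACCESS_READ };

/* One later memory access to the same object (hypothetical model).  */
struct access
{
  enum access_kind kind;
  unsigned offset;   /* First byte touched.  */
  unsigned size;     /* Number of bytes touched.  */
};

/* Mask with bits [offset, offset + size) set; the range must fit in 64.  */
static uint64_t
byte_range_mask (unsigned offset, unsigned size)
{
  assert (size >= 1 && offset + size <= 64);
  uint64_t bits = size == 64 ? ~UINT64_C (0) : (UINT64_C (1) << size) - 1;
  return bits << offset;
}

/* Return 1 if a store to bytes [offset, offset + size) is dead given the
   LATER accesses in program order, 0 otherwise.  */
static int
store_is_dead (unsigned offset, unsigned size,
               const struct access *later, unsigned n)
{
  uint64_t live = byte_range_mask (offset, size);

  for (unsigned i = 0; i < n && live != 0; i++)
    {
      uint64_t touched = byte_range_mask (later[i].offset, later[i].size);
      if (later[i].kind == ACCESS_STORE)
        live &= ~touched;   /* Overwritten bytes become dead.  */
      else if (touched & live)
        return 0;           /* Read of a live byte: the store is needed.  */
      /* Otherwise the read only sees dead bytes and is ignored.  */
    }
  return live == 0;
}
```

For the struct S example, the later accesses are a store to bytes 0..3, a store to bytes 4..7, a read of bytes 0..7 and a store to bytes 8..11.  The read touches only bytes that are already dead, so the sketch classifies the aggregate store as dead.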





Yeah, I think the case we're after and that happens most is sth like

 a = { aggregate init };
 a.a = ...;
 a.b = ...;
 ...

and what you add is the ability to remove the aggregate init completely.

What would be nice to have is to remove it partly as well, as for

struct { int i; int j; int k; } a = {};
a.i = 1;
a.k = 3;

we'd like to remove the whole-a zeroing but we need to keep zeroing
of a.j.

I believe that SRA already has most of the analysis part, what it is
lacking is that SRA is not flow-sensitive (it just gathers
function-wide data) and that it doesn't consider base objects that
have their address taken or are pointer-based.

So that's handled by patch #2 and it's (by far) the most effective part
of this work in terms of hits and reducing the number of stored bytes.
Patch #2 has a few tests for this case and it is well exercised by a
bootstrap as well.  I don't think your testcase provides any additional
coverage.


Jeff



Re: [PATCH, v2, rs6000] pr65479 Add -fasynchronous-unwind-tables when the -fsanitize=address option is seen

2016-12-21 Thread Segher Boessenkool
On Wed, Dec 21, 2016 at 09:29:26AM -0600, Bill Seurer wrote:
> [PATCH, v2, rs6000] pr65479 Add -fasynchronous-unwind-tables when the 
> -fsanitize=address option is seen.
> 
> All feedback from the earlier version has been taken into account now.
> 
> This patch adds the -fasynchronous-unwind-tables option to compilations when
> the -fsanitize=address option is seen but not if any
> -fasynchronous-unwind-tables options were already specified.
> -fasynchronous-unwind-tables causes a full stack trace to be produced when
> the sanitizer detects an error.  Without the full trace several of the asan
> test cases fail on powerpc.
> 
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65479 for more information.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu,
> powerpc64be-unknown-linux-gnu, and x86_64-pc-linux-gnu with no regressions.
> Is this ok for trunk?

That looks fine.  One super-minor comment.  Thanks!

Segher


> 2016-12-21  Bill Seurer  
> 
>   PR sanitizer/65479
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Add 

Trailing space here.

>   -fasynchronous-unwind-tables option when -fsanitize=address is
>   specified.
> 
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c(revision 243830)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -3858,6 +3858,13 @@ rs6000_option_override_internal (bool global_init_
>&& !global_options_set.x_flag_ira_loop_pressure)
>  flag_ira_loop_pressure = 1;
>  
> +  /* -fsanitize=address needs to turn on -fasynchronous-unwind-tables in 
> order
> + for tracebacks to be complete but not if any 
> -fasynchronous-unwind-tables
> + options were already specified.  */
> +  if (flag_sanitize & SANITIZE_USER_ADDRESS
> +  && !global_options_set.x_flag_asynchronous_unwind_tables)
> +flag_asynchronous_unwind_tables = 1;
> +
>/* Set the pointer size.  */
>if (TARGET_64BIT)
>  {
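
The defaulting rule in the hunk above (turn the flag on for -fsanitize=address unless the user already chose a value) can be sketched in isolation.  This is a hypothetical model for illustration: struct options and apply_sanitizer_default are invented stand-ins, not GCC's option machinery.

```c
#include <assert.h>

/* Hypothetical stand-ins for the compiler's option state; these are not
   GCC's real data structures.  */
struct options
{
  int sanitize_address;          /* -fsanitize=address was seen.  */
  int async_unwind_tables;       /* Current value of the flag.  */
  int async_unwind_tables_set;   /* User gave an explicit
                                    -f[no-]asynchronous-unwind-tables.  */
};

/* Enable unwind tables for the address sanitizer, but never override an
   explicit choice made on the command line.  */
static void
apply_sanitizer_default (struct options *opts)
{
  assert (opts != 0);
  if (opts->sanitize_address && !opts->async_unwind_tables_set)
    opts->async_unwind_tables = 1;
}
```

The point of the separate "set" field is that an explicit -fno-asynchronous-unwind-tables and an unspecified option both leave the flag at 0, yet only the latter may be changed by the default.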


Re: [PATCH,rs6000] Fix PR11488 for rs6000 target

2016-12-21 Thread Segher Boessenkool
On Tue, Dec 20, 2016 at 11:27:18AM -0600, Pat Haugen wrote:
> This patch attempts to fix problems with the first scheduling pass creating 
> too much register pressure. It does this by enabling the target hook to 
> compute the pressure classes for rs6000 target since the first thing I 
> observed while investigating the testcase in the subject PR is that IRA was 
> picking NON_SPECIAL_REGS as a pressure class which led to the sched-pressure 
> code computing too high of a value for number of regs available for pseudos 
> preferring GENERAL_REGS. It also enables -fsched-pressure by default, using 
> the 'model' algorithm.
> 
> I ran various runs of cpu2006 to determine the set of pressure classes and 
> which sched-pressure algorithm to use. Net result is that with these patches 
> I see 6 benchmarks improve in the 2.4-6% range but there are also a couple 2% 
> degradations which will need follow up in GCC 8. There was also one benchmark 
> that showed a much bigger improvement with the 'weighted' sched-pressure 
> algorithm that also needs follow up ('weighted' was not chosen as default 
> since it showed more degradations).
> 
> Bootstrap/regtest on powerpc64/powerpc64le. There were 2 testcases that 
> failed (sms-3.c/sms-6.c) but I have submitted a separate patch to fix those. 
> Ok for trunk?

Okay.  Thanks!


Segher


> 2016-12-20  Pat Haugen  
> 
>   PR rtl-optimization/11488
>   * common/config/rs6000/rs6000-common.c
>   (rs6000_option_optimization_table): Enable -fsched-pressure.
>   * config/rs6000/rs6000.c (TARGET_COMPUTE_PRESSURE_CLASSES): Define
>   target hook.
>   (rs6000_option_override_internal): Set default -fsched-pressure 
> algorithm.
>   (rs6000_compute_pressure_classes): Implement target hook.


Re: Pointer Bounds Checker and trailing arrays (PR68270)

2016-12-21 Thread Ilya Enkovich
2016-12-20 17:44 GMT+03:00 Alexander Ivchenko :
> 2016-11-26 0:28 GMT+03:00 Ilya Enkovich :
>> 2016-11-25 15:47 GMT+03:00 Alexander Ivchenko :
>>> Hi,
>>>
>>> The patch below addresses PR68270. could you please take a look?
>>>
>>> 2016-11-25  Alexander Ivchenko  
>>>
>>>* c-family/c.opt (flag_chkp_flexible_struct_trailing_arrays):
>>>Add new option.
>>>* tree-chkp.c (chkp_parse_array_and_component_ref): Forbid
>>>narrowing when chkp_parse_array_and_component_ref is used and
>>>the ARRAY_REF points to an array in the end of the struct.
>>>
>>>
>>>
>>> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
>>> index 7d8a726..e45d6a2 100644
>>> --- a/gcc/c-family/c.opt
>>> +++ b/gcc/c-family/c.opt
>>> @@ -1166,6 +1166,11 @@ C ObjC C++ ObjC++ LTO RejectNegative Report
>>> Var(flag_chkp_narrow_to_innermost_ar
>>>  Forces Pointer Bounds Checker to use bounds of the innermost arrays in 
>>> case of
>>>  nested static arryas access.  By default outermost array is used.
>>>
>>> +fchkp-flexible-struct-trailing-arrays
>>> +C ObjC C++ ObjC++ LTO RejectNegative Report
>>> Var(flag_chkp_flexible_struct_trailing_arrays)
>>> +Allow Pointer Bounds Checker to treat all trailing arrays in structures as
>>> +possibly flexible.
>>
>> Words 'allow' and 'possibly' are confusing here. This option is about to 
>> force
>> checker to do something, not to give him a choice.
>
> Fixed
>
>> New option has to be documented in invoke.texi. It would also be nice to 
>> reflect
>> changes on GCC MPX wiki page.
>
> Done
>> We have an attribute to change compiler behavior when this option is not set.
>> But we have no way to make exceptions when this option is used. Should we
>> add one?
> Something like "bnd_fixed_size" ? Could work. Although the customer
> request did not mention the need for that.
> Can I add it in a separate patch?
>

Yes.

>
>>> +
>>>  fchkp-optimize
>>>  C ObjC C++ ObjC++ LTO Report Var(flag_chkp_optimize) Init(-1)
>>>  Allow Pointer Bounds Checker optimizations.  By default allowed
>>> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
>>> index 2769682..40f99c3 100644
>>> --- a/gcc/tree-chkp.c
>>> +++ b/gcc/tree-chkp.c
>>> @@ -3425,7 +3425,9 @@ chkp_parse_array_and_component_ref (tree node, tree 
>>> *ptr,
>>>if (flag_chkp_narrow_bounds
>>>&& !flag_chkp_narrow_to_innermost_arrray
>>>&& (!last_comp
>>> -  || chkp_may_narrow_to_field (TREE_OPERAND (last_comp, 1
>>> +  || (chkp_may_narrow_to_field (TREE_OPERAND (last_comp, 1))
>>> +  && !(flag_chkp_flexible_struct_trailing_arrays
>>> +   && array_at_struct_end_p (var)
>>
>> This is incorrect place for fix. Consider code
>>
>> struct S {
>>   int a;
>>   int b[10];
>> };
>>
>> struct S s;
>> int *p = s.b;
>>
>> Here you need to compute bounds for p and you want your option to take effect
>> but in this case you won't even reach your new check because there is no
>> ARRAY_REF. And even if we change it to
>>
>> int *p = &s.b[5];
>>
>> then it still would be narrowed because s.b would still be written
>> into 'comp_to_narrow'
>> variable. Correct place for fix is in chkp_may_narrow_to_field.
>
> Done
>
>> Also you should consider fchkp-narrow-to-innermost-arrray option. Should it
>> be more powerful or not? I think fchkp-narrow-to-innermost-arrray shouldn't
>> narrow to variable sized fields. BTW looks like right now bnd_variable_size
>> attribute is ignored by fchkp-narrow-to-innermost-arrray. This is another
>> problem and may be fixed in another patch though.
> The way code works in chkp_parse_array_and_component_ref seems to be
> exactly like you say:  fchkp-narrow-to-innermost-arrray won't narrow
> to variable sized fields. I will create a separate bug for
> bnd_variable_size+ fchkp-narrow-to-innermost-arrray.
>
>> Also patch lacks tests for various situations (with option and without, with
>> ARRAY_REF and without). In case of new attribute and fix for
>> fchkp-narrow-to-innermost-arrray behavior additional tests are required.
> I added three tests for flag_chkp_flexible_struct_trailing_arrays
>
>
>
> The patch is below:
>
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 2d47d54..21ad6aa 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1188,7 +1188,13 @@ narrowing is on, field bounds are used.
> Otherwise full object bounds are used.
>  fchkp-narrow-to-innermost-array
>  C ObjC C++ ObjC++ LTO RejectNegative Report
> Var(flag_chkp_narrow_to_innermost_arrray)
>  Forces Pointer Bounds Checker to use bounds of the innermost arrays in case 
> of
> -nested static arryas access.  By default outermost array is used.
> +nested static arrays access.  By default outermost array is used.
> +
> +fchkp-flexible-struct-trailing-arrays
> +C ObjC C++ ObjC++ LTO RejectNegative Report
> Var(flag_chkp_flexible_struct_trailing_arrays)
> +Forces Pointer Bounds Checker to treat all trailing arrays in 

Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)

2016-12-21 Thread Jeff Law

On 12/20/2016 10:33 AM, Richard Biener wrote:


but for loops we can just continue and ignore this use.  And

bitmap_set_bit

returns whether it set a bit, thus

if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION

(name)))

  worklist.safe_push (name);

should work?
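
As an illustration of the suggestion above, here is a minimal sketch of the "set the bit, and push only if it was newly set" idiom.  It is written in plain C with a 64-bit mask instead of GCC's bitmap and vec types, and the names visited_set_bit, struct worklist and push_if_unvisited are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NAMES 64

/* Set bit VERSION in *VISITED.  Return 1 if the bit was newly set and 0
   if it was already set, the same convention the text relies on.  */
static int
visited_set_bit (uint64_t *visited, unsigned version)
{
  assert (version < MAX_NAMES);
  uint64_t bit = UINT64_C (1) << version;
  if (*visited & bit)
    return 0;
  *visited |= bit;
  return 1;
}

struct worklist
{
  unsigned items[MAX_NAMES];
  unsigned length;
};

/* Queue VERSION only the first time it is seen.  Each name is pushed at
   most once, so a walk that follows definitions around a loop still
   terminates and the array cannot overflow.  */
static void
push_if_unvisited (struct worklist *wl, uint64_t *visited, unsigned version)
{
  if (visited_set_bit (visited, version))
    wl->items[wl->length++] = version;
}
```

Combining the test and the update in one call avoids a separate "is it visited?" query before the push.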

What if we're in an irreducible region?


Handling all back edges by deferring to their defs should work.  At least I 
can't see how it would not.

I'll take your word for it -- I haven't thought deeply about this.









In principle the thing is sound but I'd like to be able to pass in a

bitmap of

known maybe-undefined/must-defined SSA names to have a cache for
multiple invocations from within a pass (like this unswitching case).

So that means keeping another bitmap for things positively identified
as
defined, then saving it for later invocations.


We eventually need a tristate here for maximum caching.  And as the result 
depends on the dominating BB of the postdom region the savings might be 
questionable.

Possibly.

Jeff


Re: [PATCH v2] combine: Improve change_zero_ext, call simplify_set afterwards.

2016-12-21 Thread Segher Boessenkool
On Wed, Dec 21, 2016 at 01:58:18PM +0100, Georg-Johann Lay wrote:
> $ avr-gcc 
> /gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c -S 
> -O1 -mmcu=avr4 -S -v
> 
> /gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c: In 
> function 'yasm_lc3b__parse_insn':
> /gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1: 
> error: insn does not satisfy its constraints:
>  }
>  ^
> (jump_insn 58 98 59 9 (set (pc)
> (if_then_else (eq (and:HI (reg:HI 31 r31)
> (const_int 1 [0x1]))
> (const_int 0 [0]))
> (label_ref 70)
> (pc))) 
> "/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11 
> 415 {*sbrx_and_branchhi}
>  (int_list:REG_BR_PROB 375 (nil))
>  -> 70)


> Combine comes up with the following insn:
> (jump_insn 58 57 59 7 (set (pc)
> (if_then_else (eq (and:HI (subreg:HI (mem:QI (reg/v/f:HI 75 [ 
> operands ]) [1 *operands_17(D)+0 S1 A8]) 0)
> (const_int 1 [0x1]))
> (const_int 0 [0]))
> (label_ref 70)
> (pc))) 
> "/home/georg/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11
>  
> 415 {*sbrx_and_branchhi}
>  (int_list:REG_BR_PROB 375 (nil))
>  -> 70)
> 
> which cannot be correct because avr.md::*sbrx_and_branchhi reads:
> 
> (define_insn "*sbrx_and_branch"
>   [(set (pc)
> (if_then_else
>  (match_operator 0 "eqne_operator"
>  [(and:QISI
>(match_operand:QISI 1 "register_operand" "r")
>(match_operand:QISI 2 "single_one_operand" "n"))
>   (const_int 0)])
>  (label_ref (match_operand 3 "" ""))
>  (pc)))]
>   "" { ... } ...)
> 
> Hence we have a memory operand (subreg of mem) where only a register is 
> allowed.  Reg alloc then reloads the mem:QI into R31, but R31 is the
> last hard reg, i.e. R31 cannot hold HImode.

If you don't have instruction scheduling, subregs of mem are allowed (and
are counted as registers).  Combine asks recog, and it thinks this is fine.

Why does reload use r31 here?  Why does it think that is valid for HImode?


Segher


Re: [PR c++/78572] handle array self references in intializers

2016-12-21 Thread Jakub Jelinek
On Wed, Dec 21, 2016 at 12:29:37PM -0500, Aldy Hernandez wrote:
> > > int array[10] = { array[3]=5, 0x111, 0x222, 0x333 };
> > > 
> > > (gdb) x/4x 
> > > 0x601040 :   0x0005  0x0111  0x0222
> > > 0x0005
> > > 
> > > That is, the array[3]=5 overwrites the last 0x333.  I would expect
> > > that...
> > 
> > That may be wrong. Using the 4431 draft, [8.5.4]/4 says 'Within the
> > initializer-list of a braced-init-list, the initializer-clauses,
> > including any that result from pack expansions (14.5.3), are evaluated
> > in the order in which they appear.'
> > It goes on to mention that there are sequence points, plus that the
> > ordering holds regardless of the semantics of initialization.
> > 
> > So by my reading, the effects of the above list are:
> > a) assign 5 to array[3]
> > b) assign 5 to array[0] -- consume the first initializer
> > c) assign 0x111 to array[1] -- consume the second initializer
> > d) assign 0x222 to array[2] -- consume the third initializer
> > e) assign 0x333 to array[3] -- consume the fourth initializer,
> >overwrite #a
> 
> H, I see.
> 
> Since designated initializers in arrays are not supported in the C++ FE,
> should we sorry() on array initializers using self-reference like the
> above?  Surely it's better than ICEing.  Or, would you prefer we implement
> the above semantics in the draft for GCC7?

The side-effects are unrelated to designated initializers, only [3] = 5
is a designated initializer, array[3] = 5 is not, it is just an arbitrary
expression evaluated at that point to compute the corresponding element.
We do support designated initializers in C++, but only in a useless way
(require that the designators appear in order without any gaps).

Jakub


Re: [PR c++/78572] handle array self references in intializers

2016-12-21 Thread Aldy Hernandez

On 12/20/2016 03:14 PM, Nathan Sidwell wrote:

On 12/20/2016 01:52 PM, Aldy Hernandez wrote:


int array[10] = { array[3]=5, 0x111, 0x222, 0x333 };

(gdb) x/4x 
0x601040 :   0x0005  0x0111  0x0222
0x0005

That is, the array[3]=5 overwrites the last 0x333.  I would expect
that...


That may be wrong. Using the 4431 draft, [8.5.4]/4 says 'Within the
initializer-list of a braced-init-list, the initializer-clauses,
including any that result from pack expansions (14.5.3), are evaluated
in the order in which they appear.'
It goes on to mention that there are sequence points, plus that the
ordering holds regardless of the semantics of initialization.

So by my reading, the effects of the above list are:
a) assign 5 to array[3]
b) assign 5 to array[0] -- consume the first initializer
c) assign 0x111 to array[1] -- consume the second initializer
d) assign 0x222 to array[2] -- consume the third initializer
e) assign 0x333 to array[3] -- consume the fourth initializer,
   overwrite #a


H, I see.

Since designated initializers in arrays are not supported in the C++ FE, 
should we sorry() on array initializers using self-reference like the
above?  Surely it's better than ICEing.  Or, would you prefer we
implement the above semantics in the draft for GCC7?
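
For reference, steps a) through e) listed above can be replayed as ordinary assignments.  This is only a sketch of that reading of the draft wording, written in C with an invented function name (replay_initializer); it does not claim to show what any compiler release actually emits.

```c
#include <assert.h>

/* Replays steps a) to e) in order.  The array is zeroed first to model
   static storage, which is an assumption of this sketch.  */
static void
replay_initializer (int array[10])
{
  for (int i = 0; i < 10; i++)
    array[i] = 0;

  int first = (array[3] = 5);   /* a) side effect of the first clause.  */
  array[0] = first;             /* b) first initializer has the value 5.  */
  array[1] = 0x111;             /* c) */
  array[2] = 0x222;             /* d) */
  array[3] = 0x333;             /* e) overwrites the 5 stored in step a).  */
}
```

Under this ordering the array ends up holding 5, 0x111, 0x222, 0x333, which differs from the gdb dump quoted above, where element 3 still holds 5.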


Aldy


[PATCH v4] add -fprolog-pad=N,M option

2016-12-21 Thread Torsten Duwe
On Tue, Dec 20, 2016 at 04:34:02PM +0300, Maxim Kuvyrkov wrote:
> 
> Hi Bernd, thanks for reviewing this!
> 
> Regarding the usefulness of this feature, it has been discussed here (2 years 
> ago): 
> http://gcc.gcc.gnu.narkive.com/JfWUDn8Y/rfc-kernel-livepatching-support-in-gcc
>  .
> 
> Kernel live-patching is of interest to several major distros, and it is already 
> supported by GCC via architecture-specific implementations (x86, s390, 
> sparc).  The -fprolog-pad= option is an architecture-neutral equivalent that 
> can be used for porting kernel live-patching to new architectures.
> 
> Existing support for kernel live-patching in x86, s390 and sparc backends can 
> be redirected to use -fprolog-pad= functionality and, thus, simplified.

Yes, indeed.

Hopefully I caught all the style problems, plus the functional change
that I no longer fiddle with the attributes by hand. I was really hoping
for more feedback on the functionality...

To be done: your default_print_prolog_pad really nicely handles >90% of
all cases, but what if a) the target platform does not allow arbitrary
section names or b) it has very distinct ideas about code sizes and offsets
around the prologue? (All target CPUs have an insn called "nop" which
does what we expect, AFAICS ;-) gcc should error out somehow meaningfully,
IMHO.

Torsten

2016-12-21  Torsten Duwe : 

 * c-family/c-attribs.c : introduce prolog_pad attribute and create
   a handler for it.

 * lto/lto-lang.c : Likewise.

 * common.opt : introduce -fprolog-pad command line option
   and its variables prolog_nop_pad_size and prolog_nop_pad_entry.

 * doc/extend.texi : document prolog_pad attribute.

 * doc/invoke.texi : document -fprolog-pad command line option.

 * opts.c (OPT_fprolog_pad_): add parser.

 * doc/tm.texi.in (TARGET_ASM_PRINT_PROLOG_PAD): new target hook

 * doc/tm.texi : Likewise.

 * target.def (print_prolog_pad): Likewise.

 * targhooks.h (default_print_prolog_pad): new function.

 * targhooks.c (default_print_prolog_pad): Likewise.

 * testsuite/c-c++-common/attribute-prolog_pad-1.c : New test.

 * toplev.c (process_options): Switch off IPA-RA if
   prolog pads are being generated.

 * varasm.c (assemble_start_function): look at prolog-pad command
   line switch and function attributes and maybe generate nop
   pads by calling print_prolog_pad.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index f5adade..21d0386 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -139,6 +139,7 @@ static tree handle_bnd_variable_size_attribute (tree *, 
tree, tree, int, bool *)
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
 static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
+static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *);
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -345,6 +346,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_bnd_instrument, false },
   { "fallthrough",   0, 0, false, false, false,
  handle_fallthrough_attribute, false },
+  { "prolog_pad",1, 2, true, false, false,
+ handle_prolog_pad_attribute, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -3173,3 +3176,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, 
int,
   *no_add_attrs = true;
   return NULL_TREE;
 }
+
+static tree
+handle_prolog_pad_attribute (tree *, tree, tree, int,
+bool *)
+{
+  return NULL_TREE;
+}
diff --git a/gcc/common.opt b/gcc/common.opt
index de06844..0317a86 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
 Variable
 int flag_debug_asm
 
+; If we should generate NOP pads before each function prologue
+Variable
+HOST_WIDE_INT prolog_nop_pad_size
+
+; And how far the asm entry point is into this pad
+Variable
+HOST_WIDE_INT prolog_nop_pad_entry
 
 ; Balance between GNAT encodings and standard DWARF to emit.
 Variable
@@ -2019,6 +2026,10 @@ fprofile-reorder-functions
 Common Report Var(flag_profile_reorder_functions)
 Enable function reordering that improves code placement.
 
+fprolog-pad=
+Common Joined Optimization
+Pad NOPs before each function prolog
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1f303bc..a09851a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3076,6 +3076,17 @@ that affect more than one function.
 This attribute should be used for debugging purposes only.  It is not
 suitable in production code.
 
+@item prolog_pad
+@cindex 

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-21 Thread Yuri Rumyantsev
Sorry,

I attached the wrong test; fixed here.

2016-12-21 13:12 GMT+03:00 Yuri Rumyantsev :
> Hi Richard,
>
> I accidentally found a bug in my patch related to epilogue
> vectorization without masking: the label needs to be put before the
> initialization.
>
> Could you please review it and integrate it into trunk?  A test case is also attached.
>
>
> Thanks in advance.
> Yuri.
>
> ChangeLog:
> 2016-12-21  Yuri Rumyantsev  
>
> * tree-vectorizer.c (vectorize_loops): Put label before initialization
> of loop_vectorized_call.
>
> gcc/testsuite/
>
> * gcc.dg/vect/vect-tail-nomask-2.c: New test.
>
> 2016-12-13 16:59 GMT+03:00 Richard Biener :
>> On Mon, 12 Dec 2016, Yuri Rumyantsev wrote:
>>
>>> Richard,
>>>
>>> Could you please review cost model patch before to include it to
>>> epilogue masking patch and add masking cost estimation as you
>>> requested.
>>
>> That's just the middle-end / target changes.  I was not 100% happy
>> with them but well, the vectorizer cost modeling needs work
>> (aka another rewrite).
>>
>> From below...
>>
>>> Thanks.
>>>
>>> Patch and ChangeLog are attached.
>>>
>>> 2016-12-12 15:47 GMT+03:00 Yuri Rumyantsev :
>>> > Hi Richard,
>>> >
>>> > You asked me about performance of spec2006 on AVX2 machine with new 
>>> > feature.
>>> >
>>> > I tried the following on Haswell using original patch designed by Ilya.
>>> > 1. Masking low trip count loops  only 6 benchmarks are affected and
>>> > performance is almost the same
>>> > 464.h264ref     63.9000  64.0000  +0.15%
>>> > 416.gamess      42.9000  42.9000  +0%
>>> > 435.gromacs     32.8000  32.7000  -0.30%
>>> > 447.dealII      68.5000  68.3000  -0.29%
>>> > 453.povray      61.9000  62.1000  +0.32%
>>> > 454.calculix    39.8000  39.8000  +0%
>>> > 465.tonto       29.9000  29.9000  +0%
>>> >
>>> > 2. epilogue vectorization without masking (use less vf) (3 benchmarks
>>> > are not affected)
>>> > 400.perlbench   47.2000  46.5000  -1.48%
>>> > 401.bzip2       29.9000  29.9000  +0%
>>> > 403.gcc         41.8000  41.6000  -0.47%
>>> > 456.hmmer       32.0000  32.0000  +0%
>>> > 462.libquantum  81.5000  82.0000  +0.61%
>>> > 464.h264ref     65.0000  65.5000  +0.76%
>>> > 471.omnetpp     27.8000  28.2000  +1.43%
>>> > 473.astar       28.7000  28.6000  -0.34%
>>> > 483.xalancbmk   48.7000  48.6000  -0.20%
>>> > 410.bwaves      95.3000  95.3000  +0%
>>> > 416.gamess      42.9000  42.8000  -0.23%
>>> > 433.milc        38.8000  38.8000  +0%
>>> > 434.zeusmp      51.7000  51.4000  -0.58%
>>> > 435.gromacs     32.8000  32.8000  +0%
>>> > 436.cactusADM   85.0000  83.0000  -2.35%
>>> > 437.leslie3d    55.5000  55.5000  +0%
>>> > 444.namd        31.3000  31.3000  +0%
>>> > 447.dealII      68.7000  68.9000  +0.29%
>>> > 450.soplex      47.3000  47.4000  +0.21%
>>> > 453.povray      62.1000  61.4000  -1.12%
>>> > 454.calculix    39.7000  39.3000  -1.00%
>>> > 459.GemsFDTD    44.9000  45.0000  +0.22%
>>> > 465.tonto       29.8000  29.8000  +0%
>>> > 481.wrf         51.0000  51.2000  +0.39%
>>> > 482.sphinx3     69.8000  71.2000  +2.00%
>>
>> I see 471.omnetpp and 482.sphinx3 are in a similar ballpark and it
>> would be nice to catch the relevant case(s) with a cost model for
>> epilogue vectorization without masking first (to get rid of
>> --param vect-epilogues-nomask).
>>
>> As said elsewhere any non-conservative cost modeling (if the
>> number of scalar iterations is not statically constant) might
>> require versioning of the loop into a non-vectorized,
>> short-trip vectorized and regular vectorized case (the Intel
>> compiler does way more aggressive versioning IIRC).
>>
>> Richard.
>>
>>> > 3. epilogue vectorization using masking (4 benchmarks are not affected):
>>> > 400.perlbench   47.5000  46.8000  -1.47%
>>> > 401.bzip2       30.0000  29.9000  -0.33%
>>> > 403.gcc         42.3000  42.3000  +0%
>>> > 445.gobmk       32.1000  32.8000  +2.18%
>>> > 456.hmmer       32.0000  32.0000  +0%
>>> > 458.sjeng       36.1000  35.5000  -1.66%
>>> > 462.libquantum  81.1000  81.1000  +0%
>>> > 464.h264ref     65.4000  65.0000  -0.61%
>>> > 483.xalancbmk   49.4000  49.3000  -0.20%
>>> > 410.bwaves      95.9000  95.5000  -0.41%
>>> > 416.gamess      42.8000  42.6000  -0.46%
>>> > 433.milc        38.8000  39.1000  +0.77%
>>> > 434.zeusmp      52.1000  51.3000  -1.53%
>>> > 435.gromacs     32.9000  32.9000  +0%
>>> > 436.cactusADM   78.8000  85.3000  +8.24%
>>> > 437.leslie3d    55.4000  55.4000  +0%
>>> > 444.namd        31.3000  31.3000  +0%
>>> > 447.dealII      69.0000  69.2000  +0.28%
>>> > 450.soplex      47.7000  47.6000  -0.20%
>>> > 453.povray      62.2000  61.7000  -0.80%
>>> > 454.calculix    39.7000  38.2000  -3.77%
>>> > 459.GemsFDTD    44.9000  45.0000  +0.22%
>>> > 465.tonto       29.8000  29.9000  +0.33%
>>> > 481.wrf         51.2000  51.6000

[committed] Disallow explicit or implicit OpenMP mapping of assumed-size arrays (PR fortran/78866)

2016-12-21 Thread Jakub Jelinek
Hi!

At least in my reading of the standard OpenMP 4.[05] does not disallow
explicit or implicit mapping of assumed-size arrays, but it is IMNSHO a
defect in the standard, it is something that can't be really supported
because the compiler does not know the size of the assumed size array.
What works and is supported is explicit mapping of array section with
the upper bound specified, then you give it the size through the array
section.  ifort also rejects it.  I've raised an OpenMP ticket for this.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk,
queued for 6.4.

2016-12-21  Jakub Jelinek  

PR fortran/78866
* openmp.c (resolve_omp_clauses): Diagnose assumed size arrays in
OpenMP map, to and from clauses.
* trans-openmp.c: Include diagnostic-core.h, temporarily redefining
GCC_DIAG_STYLE to __gcc_tdiag__.
(gfc_omp_finish_clause): Diagnose implicitly mapped assumed size
arrays.

* gfortran.dg/gomp/map-1.f90: Add expected error.
* gfortran.dg/gomp/pr78866-1.f90: New test.
* gfortran.dg/gomp/pr78866-2.f90: New test.

--- gcc/fortran/openmp.c.jj 2016-11-10 12:34:12.0 +0100
+++ gcc/fortran/openmp.c  2016-12-21 10:36:25.535619047 +0100
@@ -4382,6 +4382,11 @@ resolve_omp_clauses (gfc_code *code, gfc
else
  resolve_oacc_data_clauses (n->sym, n->where, name);
  }
+   else if (list != OMP_CLAUSE_DEPEND
+&& n->sym->as
+&& n->sym->as->type == AS_ASSUMED_SIZE)
+ gfc_error ("Assumed size array %qs in %s clause at %L",
+n->sym->name, name, &n->where);
if (list == OMP_LIST_MAP && !openacc)
  switch (code->op)
{
--- gcc/fortran/trans-openmp.c.jj   2016-12-15 10:26:19.0 +0100
+++ gcc/fortran/trans-openmp.c  2016-12-21 13:11:34.131670305 +0100
@@ -38,6 +38,11 @@ along with GCC; see the file COPYING3.
 #include "gomp-constants.h"
 #include "omp-general.h"
 #include "omp-low.h"
+#undef GCC_DIAG_STYLE
+#define GCC_DIAG_STYLE __gcc_tdiag__
+#include "diagnostic-core.h"
+#undef GCC_DIAG_STYLE
+#define GCC_DIAG_STYLE __gcc_gfc__
 
 int ompws_flags;
 
@@ -1039,6 +1044,21 @@ gfc_omp_finish_clause (tree c, gimple_se
 return;
 
   tree decl = OMP_CLAUSE_DECL (c);
+
+  /* Assumed-size arrays can't be mapped implicitly, they have to be
+ mapped explicitly using array sections.  */
+  if (TREE_CODE (decl) == PARM_DECL
+  && GFC_ARRAY_TYPE_P (TREE_TYPE (decl))
+  && GFC_TYPE_ARRAY_AKIND (TREE_TYPE (decl)) == GFC_ARRAY_UNKNOWN
+  && GFC_TYPE_ARRAY_UBOUND (TREE_TYPE (decl),
+   GFC_TYPE_ARRAY_RANK (TREE_TYPE (decl)) - 1)
+== NULL)
+{
+  error_at (OMP_CLAUSE_LOCATION (c),
+   "implicit mapping of assumed size array %qD", decl);
+  return;
+}
+
   tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE;
   if (POINTER_TYPE_P (TREE_TYPE (decl)))
 {
--- gcc/testsuite/gfortran.dg/gomp/map-1.f90.jj  2015-01-15 23:39:06.0 +0100
+++ gcc/testsuite/gfortran.dg/gomp/map-1.f90  2016-12-21 13:29:03.333952262 +0100
@@ -70,7 +70,7 @@ subroutine test(aas)
   ! { dg-error "Rightmost upper bound of assumed size array section not specified" "" { target *-*-* } 68 }
   ! { dg-error "'aas' in MAP clause at \\\(1\\\) is not a proper array section" "" { target *-*-* } 68 }
 
-  !$omp target map(aas) ! { dg-error "The upper bound in the last dimension must appear" "" { xfail *-*-* } }
+  !$omp target map(aas) ! { dg-error "Assumed size array" }
   !$omp end target
 
   !$omp target map(aas(5:7))
--- gcc/testsuite/gfortran.dg/gomp/pr78866-1.f90.jj  2016-12-21 11:16:06.202498432 +0100
+++ gcc/testsuite/gfortran.dg/gomp/pr78866-1.f90  2016-12-21 11:12:14.0 +0100
@@ -0,0 +1,19 @@
+! PR fortran/78866
+! { dg-do compile }
+
+subroutine pr78866(x)
+  integer :: x(*)
+!$omp target map(x)! { dg-error "Assumed size array" }
+  x(1) = 1
+!$omp end target
+!$omp target data map(tofrom: x)   ! { dg-error "Assumed size array" }
+!$omp target update to(x)  ! { dg-error "Assumed size array" }
+!$omp target update from(x)! { dg-error "Assumed size array" }
+!$omp end target data
+!$omp target map(x(:23))   ! { dg-bogus "Assumed size array" }
+  x(1) = 1
+!$omp end target
+!$omp target map(x(:)) ! { dg-error "upper bound of assumed size array section" }
+  x(1) = 1 ! { dg-error "not a proper array section" "" { target *-*-* } .-1 }
+!$omp end target
+end
--- gcc/testsuite/gfortran.dg/gomp/pr78866-2.f90.jj  2016-12-21 13:23:21.100447198 +0100
+++ gcc/testsuite/gfortran.dg/gomp/pr78866-2.f90  2016-12-21 13:14:52.0 +0100
@@ -0,0 +1,9 @@
+! PR fortran/78866
+! { dg-do compile }
+

Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE

2016-12-21 Thread Jeff Law

On 12/21/2016 06:43 AM, Trevor Saunders wrote:
>> So a few interesting things have to be dealt with if we want to make this
>> change.  I already mentioned the need to bias based on ref->offset so that
>> the range of bytes we're tracking is represented 0..size.
>>
>> While we know the length of the potential dead store we don't know the
>> length of the subsequent stores that we're hoping make the original a dead
>> store.  Thus when we start clearing LIVE_BYTES based on those subsequent
>> stores, we have to normalize those against the ref->offset of the original
>> store.
>>
>> What's even more fun is sizing.  Those subsequent stores may be considerably
>> larger.  Which means that a bitmap_clear_range call has to be a hell of a
>> lot more careful when working with sbitmaps (we just happily stomp all over
>> memory in that case) whereas with a bitmap the right things will "just
>> happen".
>>
>> On the positive side, since we've normalized the potential dead store's byte
>> range to 0..size, it means computing trims is easier because we inherently
>> know how many bits were originally set.  So compute_trims becomes trivial
>> and we can simplify trim_complex_store a bit as well.
>>
>> And, of course, we don't have a bitmap_{clear,set}_range or a
>> bitmap_count_bits implementation for sbitmaps.
>>
>> It's all a bit of "ugh", but should be manageable.
>
> yeah, but that seems like enough work that you could reasonably stick
> with bitmap instead.

Well, the conversion is done :-)  It was largely as I expected, with the
devil being in the normalization details, which are well isolated.

> p.s. sorry I've been falling behind lately.

It happens.  You might want to peek briefly at Aldy's auto_bitmap class
as part of the 71691 patches.  It looks like a fairly straightforward
conversion to auto_*.  I'm sure there's all kinds of places we could use it.

Jeff



[PATCH, v2, rs6000] pr65479 Add -fasynchronous-unwind-tables when the -fsanitize=address option is seen

2016-12-21 Thread Bill Seurer
[PATCH, v2, rs6000] pr65479 Add -fasynchronous-unwind-tables when the 
-fsanitize=address option is seen.

All feedback from the earlier version has been taken into account now.

This patch adds the -fasynchronous-unwind-tables option to compilations when
the -fsanitize=address option is seen but not if any
-fasynchronous-unwind-tables options were already specified.
-fasynchronous-unwind-tables causes a full stack trace to be produced when
the sanitizer detects an error.  Without the full trace several of the asan
test cases fail on powerpc.
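
The "default it on unless the user already said something" rule described above can be sketched outside of GCC as follows. The struct and function names are hypothetical stand-ins for the compiler's option state (they are not the real global_options types); only the shape of the check mirrors the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the current option values.  */
struct opts
{
  bool sanitize_address;      /* -fsanitize=address was given */
  int async_unwind_tables;    /* current value of the unwind-tables flag */
};

/* Hypothetical stand-in recording which options the user set explicitly,
   whether positively or negatively.  */
struct opts_set
{
  bool async_unwind_tables;
};

/* Turn unwind tables on for asan, but never override an explicit
   -fasynchronous-unwind-tables or -fno-asynchronous-unwind-tables.  */
void
override_defaults (struct opts *o, const struct opts_set *set)
{
  if (o->sanitize_address && !set->async_unwind_tables)
    o->async_unwind_tables = 1;
}
```

Keeping a separate "was it set" record is what lets an explicit -fno-asynchronous-unwind-tables survive: the value 0 alone cannot distinguish "user asked for off" from "nobody asked".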

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65479 for more information.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu,
powerpc64be-unknown-linux-gnu, and x86_64-pc-linux-gnu with no regressions.
Is this ok for trunk?

[gcc]

2016-12-21  Bill Seurer  

PR sanitizer/65479
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add 
-fasynchronous-unwind-tables option when -fsanitize=address is
specified.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 243830)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3858,6 +3858,13 @@ rs6000_option_override_internal (bool global_init_
   && !global_options_set.x_flag_ira_loop_pressure)
 flag_ira_loop_pressure = 1;
 
+  /* -fsanitize=address needs to turn on -fasynchronous-unwind-tables in order
+ for tracebacks to be complete but not if any -fasynchronous-unwind-tables
+ options were already specified.  */
+  if (flag_sanitize & SANITIZE_USER_ADDRESS
+  && !global_options_set.x_flag_asynchronous_unwind_tables)
+flag_asynchronous_unwind_tables = 1;
+
   /* Set the pointer size.  */
   if (TARGET_64BIT)
 {



Re: [PATCH v2] combine: Improve change_zero_ext, call simplify_set afterwards.

2016-12-21 Thread Georg-Johann Lay

On 21.12.2016 15:42, Dominik Vogt wrote:

On Wed, Dec 21, 2016 at 01:58:18PM +0100, Georg-Johann Lay wrote:

On 12.12.2016 17:54, Segher Boessenkool wrote:

On Mon, Dec 12, 2016 at 05:46:02PM +0100, Dominik Vogt wrote:

Patch with these changes and a fix because of not handling
VOIDmode attached.  Bootstrapped and regression tested on s390 and
s390x.


Okay for trunk.

When did you see VOIDmode, btw?  It wasn't on a const_int I hope?


Segher



* combine.c (change_zero_ext): Handle mode expanding zero_extracts.


This was committed as r243578 and triggered (amongst other similar
test suite ICE failures):

$ avr-gcc
/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c
-S -O1 -mmcu=avr4 -S -v

/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:
In function 'yasm_lc3b__parse_insn':
/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1:
error: insn does not satisfy its constraints:
 }
 ^
(jump_insn 58 98 59 9 (set (pc)
(if_then_else (eq (and:HI (reg:HI 31 r31)
(const_int 1 [0x1]))
(const_int 0 [0]))
(label_ref 70)
(pc))) 
"/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11
415 {*sbrx_and_branchhi}
 (int_list:REG_BR_PROB 375 (nil))
 -> 70)


Does that mean that, on avr, this is valid:

  (zero_extract:HI (reg:QI 31) ...)

but this is not:

  (subreg:HI (reg:QI 31))

?


What exactly do you mean by valid?  The second is just a paradoxical
subreg, dunno if a BE can disallow paradoxical subregs.

As the insn has a HImode register_operand, the subreg might be valid
before reload, the extract is certainly not a register_operand.

I don't know how exactly reload handles paradoxical subregs...
presumably reloading into a valid HImode register and applying
signedness as appropriate for the extension.

Re-quoting the insn...


which cannot be correct because avr.md::*sbrx_and_branchhi reads:

(define_insn "*sbrx_and_branch"
  [(set (pc)
(if_then_else
 (match_operator 0 "eqne_operator"
 [(and:QISI
   (match_operand:QISI 1 "register_operand" "r")
   (match_operand:QISI 2 "single_one_operand" "n"))
  (const_int 0)])
 (label_ref (match_operand 3 "" ""))
 (pc)))]
  "" { ... } ...)


...there is nothing involving (explicit) extracts or paradoxical regs.

FYI, the dump prior to .combine (.init-regs) reads as follows,
no extracts or paradoxical subregs around...


(note 54 53 55 7 [bb 7] NOTE_INSN_BASIC_BLOCK)
(insn 55 54 56 7 (set (reg:HI 79 [ *operands_17(D) ])
(mem:HI (reg/v/f:HI 75 [ operands ]) [1 *operands_17(D)+0 S2 
A8])) "pr26833.c":11 68 {*movhi}

 (nil))
(insn 56 55 57 7 (parallel [
(set (reg:HI 78)
(and:HI (reg:HI 79 [ *operands_17(D) ])
(const_int 1 [0x1])))
(clobber (scratch:QI))
]) "pr26833.c":11 251 {andhi3}
 (expr_list:REG_DEAD (reg:HI 79 [ *operands_17(D) ])
(nil)))
(insn 57 56 58 7 (parallel [
(set (cc0)
(compare (reg:HI 78)
(const_int 0 [0])))
(clobber (scratch:QI))
]) "pr26833.c":11 398 {cmphi3}
 (expr_list:REG_DEAD (reg:HI 78)
(nil)))
(jump_insn 58 57 59 7 (set (pc)
(if_then_else (eq (cc0)
(const_int 0 [0]))
(label_ref 70)
(pc))) "pr26833.c":11 418 {branch}
 (int_list:REG_BR_PROB 375 (nil))
 -> 70)

If combine wants to throw away unneeded portions of regs 78 / 79
and apply some kind of demotion, then it could use plain QImode,
at least in the present case where a similar QI version of the
insn is available.

Johann


Hence we have a memory operand (subreg of mem) where only a
register is allowed.  Reg alloc then reloads the mem:QI into R31,
but R31 is the
last hard reg, i.e. R31 cannot hold HImode.

Johann


Ciao

Dominik ^_^  ^_^





Re: [PATCH, ARM] Further improve stack usage in sha512, part 2 (PR 77308)

2016-12-21 Thread Wilco Dijkstra
Bernd Edlinger wrote:
> On 12/20/16 16:09, Wilco Dijkstra wrote:
> > As a result of your patches a few patterns are unused now. All the Thumb-2
> > iordi_notdi* patterns cannot be used anymore. Also I think arm_cmpdi_zero
> > never gets used - a DI mode compare with zero is always split into ORR
> > during expand.
>
> I did not change anything for -mthumb -mfpu=neon for instance.
> Do you think that iordi_notdi* is never used also for that
> configuration?

With -mfpu=vfp or -msoft-float, these patterns cannot be used as logical 
operations are expanded before combine. Interestingly with -mfpu=neon ARM uses 
the orndi3_neon patterns (which are inefficient for ARM and probably should be 
disabled) but Thumb-2 uses the iordi_notdi patterns... So removing these 
reduces the number of patterns while we will still generate orn for Thumb-2.

> And if the arm_cmpdi_zero is never expanded, isn't it already
> unused before my patch?

It appears to be, so we don't need to fix it now. However when improving the 
expansion of comparisons it does trigger. For example x == 3 expands currently 
into 3 instructions:

        cmp     r1, #0
        itt     eq
        cmpeq   r0, #3

Tweaking arm_select_cc_mode uses arm_cmpdi_zero, and when expanded early we 
generate this:

        eor     r0, r0, #3
        orrs    r0, r0, r1

Using sub rather than eor would be even better of course.
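
For readers who want to convince themselves that the shorter sequences compute the same condition, here is a small C sketch of the three formulations of a 64-bit "x == 3" on its 32-bit halves (lo corresponds to r0 and hi to r1 in the assembly above). The function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare both halves separately, as in the cmp/itt/cmpeq sequence.  */
bool
eq3_cmp (uint32_t lo, uint32_t hi)
{
  return hi == 0 && lo == 3;
}

/* The eor/orrs form: all bits of (lo ^ 3) and hi must be zero.  */
bool
eq3_eor (uint32_t lo, uint32_t hi)
{
  return ((lo ^ 3u) | hi) == 0;
}

/* The sub variant: lo - 3 is zero (modulo 2^32) exactly when lo == 3.  */
bool
eq3_sub (uint32_t lo, uint32_t hi)
{
  return ((lo - 3u) | hi) == 0;
}
```

All three agree for every input; the last two fold the test into a single flag-setting ORR.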

Wilco


Re: [PATCH] Do not suggest -fsanitize=all (PR driver/78863).

2016-12-21 Thread Martin Liška
On 12/21/2016 11:28 AM, Jakub Jelinek wrote:
> On Wed, Dec 21, 2016 at 11:20:33AM +0100, Martin Liška wrote:
>> I like your approach!
>> make check -k -j10 RUNTESTFLAGS="dg.exp=spellcheck-options-*" works fine.
>>
>> May I install the patch after it survives proper regression tests?
> 
> Ok.
> 
> Also, only related, seems we have misspelling candidates for cases like
> -fsanitiz=ell
> but not for -fsanitize=ell
> (i.e. when the option is actually correct, just the argument to it (or part
> of it) is misspelled).  It would need to be done probably in
> parse_sanitizer_options when we diagnose it:
>   if (! found && complain)
> error_at (loc, "unrecognized argument to -fsanitize%s= option: %q.*s",
>   code == OPT_fsanitize_ ? "" : "-recover", (int) len, p);
> go through sanitizer_opts again in that case, add candidates (that are
> valid for the particular option), and if there is a hint, add the hint to
> this message.
> 
>   Jakub
> 

These look very similar to what I reported in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78877.
I've just added your case to the PR.
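
To make Jakub's suggestion quoted above concrete, here is a rough sketch of a hint lookup for a misspelled -fsanitize= argument. It is not GCC's spellcheck code: the candidate list is a small hand-picked subset, the function names are hypothetical, and the cutoff of two edits is an arbitrary assumption. As in the quoted snippet, the argument is passed as a pointer plus length since it is not NUL-terminated.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ARG 32  /* assumed upper bound on candidate length */

/* Plain Levenshtein distance between S (SLEN chars) and the
   NUL-terminated string T, using two rolling rows.  */
static size_t
edit_distance (const char *s, size_t slen, const char *t)
{
  size_t tlen = strlen (t);
  size_t prev[MAX_ARG + 1], cur[MAX_ARG + 1];

  if (tlen > MAX_ARG)
    return (size_t) -1;
  for (size_t j = 0; j <= tlen; j++)
    prev[j] = j;
  for (size_t i = 1; i <= slen; i++)
    {
      cur[0] = i;
      for (size_t j = 1; j <= tlen; j++)
        {
          size_t best = prev[j - 1] + (s[i - 1] != t[j - 1]); /* substitute */
          if (prev[j] + 1 < best)
            best = prev[j] + 1;                               /* delete */
          if (cur[j - 1] + 1 < best)
            best = cur[j - 1] + 1;                            /* insert */
          cur[j] = best;
        }
      memcpy (prev, cur, (tlen + 1) * sizeof prev[0]);
    }
  return prev[tlen];
}

/* Return the closest valid argument, or NULL if nothing is within two
   edits.  "all" is deliberately absent: it must not be suggested.  */
const char *
sanitizer_hint (const char *arg, size_t len)
{
  static const char *const candidates[]
    = { "address", "leak", "null", "thread", "undefined" };
  const char *best = NULL;
  size_t best_dist = 3;

  for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++)
    {
      size_t d = edit_distance (arg, len, candidates[i]);
      if (d < best_dist)
        {
          best_dist = d;
          best = candidates[i];
        }
    }
  return best;
}
```

In the compiler the candidate set would come from walking sanitizer_opts again and keeping only the entries valid for the option being parsed, as the quoted text describes.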

I'm going to install the patch.

M.


Re: [Ada] Fix PR ada/78845

2016-12-21 Thread Arnaud Charlet
> > Yes, please resend an updated patch.
> 
> The function Ada.Numerics.Generic_Real_Arrays.Inverse is required
> (ARM G.3.1(72)) to return a matrix with the bounds of the dimension
> indices swapped, i.e. result'Range(1) == input'Range(2) and vice versa.
> The present code gets result'Range(1) correct, but result'Range(2)
> always starts at 1.
> 
> Of course, many users would always use ranges starting at 1 and wouldn't see a
> problem.
> 
> The same applies to Ada.Numerics.Complex_Real_Arrays.Inverse (ARM
> G.3.2(140)).

Updated patch OK as well.

> 2016-12-21  Simon Wright
> 
>   PR ada/78845
>   * a-ngcoar.adb (Inverse): call Unit_Matrix with First_1 set to
>   A'First(2) and vice versa.
>   * a-ngrear.adb (Inverse): likewise.


Re: [PATCH v2] combine: Improve change_zero_ext, call simplify_set afterwards.

2016-12-21 Thread Dominik Vogt
On Wed, Dec 21, 2016 at 01:58:18PM +0100, Georg-Johann Lay wrote:
> On 12.12.2016 17:54, Segher Boessenkool wrote:
> >On Mon, Dec 12, 2016 at 05:46:02PM +0100, Dominik Vogt wrote:
> >>Patch with these changes and a fix because of not handling
> >>VOIDmode attached.  Bootstrapped and regression tested on s390 and
> >>s390x.
> >
> >Okay for trunk.
> >
> >When did you see VOIDmode, btw?  It wasn't on a const_int I hope?
> >
> >
> >Segher
> >
> >
> >>* combine.c (change_zero_ext): Handle mode expanding zero_extracts.
> 
> This was committed as r243578 and triggered (amongst other similar
> test suite ICE failures):
> 
> $ avr-gcc
> /gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c
> -S -O1 -mmcu=avr4 -S -v
> 
> /gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:
> In function 'yasm_lc3b__parse_insn':
> /gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1:
> error: insn does not satisfy its constraints:
>  }
>  ^
> (jump_insn 58 98 59 9 (set (pc)
> (if_then_else (eq (and:HI (reg:HI 31 r31)
> (const_int 1 [0x1]))
> (const_int 0 [0]))
> (label_ref 70)
> (pc))) 
> "/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11
> 415 {*sbrx_and_branchhi}
>  (int_list:REG_BR_PROB 375 (nil))
>  -> 70)

Does that mean that, on avr, this is valid:

  (zero_extract:HI (reg:QI 31) ...)

but this is not:

  (subreg:HI (reg:QI 31))

?

> /gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1:
> internal compiler error: in extract_constrain_insn, at recog.c:2213
> 0x9836a3 _fatal_insn(char const*, rtx_def const*, char const*, int,
> char const*)
>   ../../../gcc.gnu.org/trunk/gcc/rtl-error.c:108
> 0x9836cf _fatal_insn_not_found(rtx_def const*, char const*, int,
> char const*)
>   ../../../gcc.gnu.org/trunk/gcc/rtl-error.c:119
> 0x95abdd extract_constrain_insn(rtx_insn*)
>   ../../../gcc.gnu.org/trunk/gcc/recog.c:2213
> 0x939105 reload_cse_simplify_operands
>   ../../../gcc.gnu.org/trunk/gcc/postreload.c:391
> 0x939ce5 reload_cse_simplify
>   ../../../gcc.gnu.org/trunk/gcc/postreload.c:179
> 0x939ce5 reload_cse_regs_1
>   ../../../gcc.gnu.org/trunk/gcc/postreload.c:218
> 0x93b96b reload_cse_regs
>   ../../../gcc.gnu.org/trunk/gcc/postreload.c:64
> 0x93b96b execute
>   ../../../gcc.gnu.org/trunk/gcc/postreload.c:2342
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
> 
> 
> Target: avr
> Configured with: ../../gcc.gnu.org/trunk/configure --target=avr
> --prefix=/local/gnu/install/gcc-7 --disable-shared --disable-nls
> --with-dwarf2 --enable-target-optspace=yes --with-gnu-as
> --with-gnu-ld --enable-checking=release --enable-languages=c,c++
> 
> 
> Combine comes up with the following insn:
> (jump_insn 58 57 59 7 (set (pc)
> (if_then_else (eq (and:HI (subreg:HI (mem:QI (reg/v/f:HI 75
> [ operands ]) [1 *operands_17(D)+0 S1 A8]) 0)
> (const_int 1 [0x1]))
> (const_int 0 [0]))
> (label_ref 70)
> (pc))) 
> "/home/georg/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11
> 415 {*sbrx_and_branchhi}
>  (int_list:REG_BR_PROB 375 (nil))
>  -> 70)
> 
> which cannot be correct because avr.md::*sbrx_and_branchhi reads:
> 
> (define_insn "*sbrx_and_branch"
>   [(set (pc)
> (if_then_else
>  (match_operator 0 "eqne_operator"
>  [(and:QISI
>(match_operand:QISI 1 "register_operand" "r")
>(match_operand:QISI 2 "single_one_operand" "n"))
>   (const_int 0)])
>  (label_ref (match_operand 3 "" ""))
>  (pc)))]
>   "" { ... } ...)
> 
> Hence we have a memory operand (subreg of mem) where only a
> register is allowed.  Reg alloc then reloads the mem:QI into R31,
> but R31 is the
> last hard reg, i.e. R31 cannot hold HImode.
> 
> Johann
> 
> 
> 


Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



[committed] nvptx: adjust testcase for 'shared' attribute

2016-12-21 Thread Alexander Monakov
Hi,

I have applied the following testsuite patch to fix one scan-assembler failure
in the testcase for the 'shared' attribute caused by backend change to enable
-fno-common by default.

Alexander

* gcc.target/nvptx/decl-shared.c (v_common): Add 'common' attribute to
explicitly request the desired storage class.

Index: gcc.target/nvptx/decl-shared.c
===
--- gcc.target/nvptx/decl-shared.c  (revision 243855)
+++ gcc.target/nvptx/decl-shared.c  (working copy)
@@ -1,5 +1,5 @@
 static int v_internal __attribute__((shared,used));
-int v_common __attribute__((shared));
+int v_common __attribute__((shared,common));
 int v_extdef __attribute__((shared,nocommon));
 extern int v_extdecl __attribute__((shared));




[patch,testsuite,committed] ad PR testsuite/51641: Adjust some test to less endowed targets

2016-12-21 Thread Georg-Johann Lay

http://gcc.gnu.org/r243854

Applied this patch as obvious.  It adds some additional salt for
targets with reduced resources, such as requiring size32plus.

Johann

gcc/testsuite/
PR testsuite/52641
* gcc.dg/builtin-object-size-16.c (ia0, ia1, ia9): Handle case
where neither short nor int has a size of 4; use long.
* gcc.dg/builtin-object-size-17.c: Same.
* gcc.dg/builtin-stringop-chk-1.c (test2) : Use int32_t
for components as 4 components are supposed to occupy 16 bytes.
* gcc.dg/pr78408-1.c: Require target size32plus.
* gcc.dg/pr78408-2.c: Same.
* gcc.dg/tree-ssa/pr78428.c: Require target int32plus.
* gcc.dg/tree-ssa/tailcall-7.c: Require target trampolines.

Index: gcc.dg/builtin-object-size-16.c
===
--- gcc.dg/builtin-object-size-16.c	(revision 243852)
+++ gcc.dg/builtin-object-size-16.c	(working copy)
@@ -69,6 +69,10 @@ static short ia9[9];
 extern int ia0[0];
 static int ia1[1];
 static int ia9[9];
+#elif __SIZEOF_LONG__ == 4
+extern long ia0[0];
+static long ia1[1];
+static long ia9[9];
 #endif
 
 static char a2x2[2][2];
Index: gcc.dg/builtin-object-size-17.c
===
--- gcc.dg/builtin-object-size-17.c	(revision 243852)
+++ gcc.dg/builtin-object-size-17.c	(working copy)
@@ -64,6 +64,10 @@ static short ia9[9];
 extern int ia0[0];
 static int ia1[1];
 static int ia9[9];
+#elif __SIZEOF_LONG__ == 4
+extern long ia0[0];
+static long ia1[1];
+static long ia9[9];
 #endif
 
 static char a2x2[2][2];
Index: gcc.dg/builtin-stringop-chk-1.c
===
--- gcc.dg/builtin-stringop-chk-1.c	(revision 243852)
+++ gcc.dg/builtin-stringop-chk-1.c	(working copy)
@@ -105,7 +105,7 @@ test2 (const H h)
   unsigned char buf[21];
   memset (buf + 16, 0, 8); /* { dg-warning "writing 8 " "memset" } */
 
-  typedef struct { int i, j, k, l; } S;
+  typedef struct { __INT32_TYPE__ i, j, k, l; } S;
   S *s[3];
   memset (s, 0, sizeof (S) * 3); /* { dg-warning "writing 48 " "memset" } */
 
Index: gcc.dg/pr78408-1.c
===
--- gcc.dg/pr78408-1.c	(revision 243852)
+++ gcc.dg/pr78408-1.c	(working copy)
@@ -1,5 +1,5 @@
 /* PR c/78408 */
-/* { dg-do compile } */
+/* { dg-do compile { target size32plus } } */
 /* { dg-options "-O2 -fdump-tree-fab1-details" } */
 /* { dg-final { scan-tree-dump-times "after previous" 17 "fab1" } } */
 
Index: gcc.dg/pr78408-2.c
===
--- gcc.dg/pr78408-2.c	(revision 243852)
+++ gcc.dg/pr78408-2.c	(working copy)
@@ -1,5 +1,5 @@
 /* PR c/78408 */
-/* { dg-do compile } */
+/* { dg-do compile { target size32plus } } */
 /* { dg-options "-O2 -fdump-tree-fab1-details" } */
 /* { dg-final { scan-tree-dump-not "after previous" "fab1" } } */
 
Index: gcc.dg/tree-ssa/pr78428.c
===
--- gcc.dg/tree-ssa/pr78428.c	(revision 243852)
+++ gcc.dg/tree-ssa/pr78428.c	(working copy)
@@ -1,6 +1,6 @@
 /* PR tree-optimization/78428.  */
 /* { dg-options "-O2" } */
-/* { dg-do run } */
+/* { dg-do run { target int32plus } } */
 
 struct S0
 {
Index: gcc.dg/tree-ssa/tailcall-7.c
===
--- gcc.dg/tree-ssa/tailcall-7.c	(revision 243852)
+++ gcc.dg/tree-ssa/tailcall-7.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target trampolines } } */
 /* { dg-options "-O2 -fdump-tree-tailc-details" } */
 
 struct s { int x; };


Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE

2016-12-21 Thread Trevor Saunders
On Sat, Dec 17, 2016 at 01:19:41AM -0700, Jeff Law wrote:
> On 12/16/2016 12:29 AM, Trevor Saunders wrote:
> > On Thu, Dec 15, 2016 at 06:54:43PM -0700, Jeff Law wrote:
> > >unsigned cnt = 0;
> > > +  bitmap live_bytes = NULL;
> > > +  bitmap orig_live_bytes = NULL;
> > > 
> > >*use_stmt = NULL;
> > > 
> > > +  /* REF is a memory write.  Go ahead and get its base, size, extent
> > > + information and encode the bytes written into LIVE_BYTES.  We can
> > > + handle any case where we have a known base and maximum size.
> > > +
> > > + However, experimentation has shown that bit level tracking is not
> > > + useful in practice, so we only track at the byte level.
> > > +
> > > + Furthermore, experimentation has shown that 99% of the cases
> > > + that require byte tracking are 64 bytes or less.  */
> > > +  if (valid_ao_ref_for_dse (ref)
> > > +  && (ref->max_size / BITS_PER_UNIT
> > > +   <= PARAM_VALUE (PARAM_DSE_MAX_OBJECT_SIZE)))
> > > +{
> > > +  live_bytes = BITMAP_ALLOC (NULL);
> > > +  orig_live_bytes = BITMAP_ALLOC (NULL);
> > > +  bitmap_set_range (live_bytes,
> > > + ref->offset / BITS_PER_UNIT,
> > > + ref->max_size / BITS_PER_UNIT);
> > > +  bitmap_copy (orig_live_bytes, live_bytes);
> > 
> > would it maybe make sense to use sbitmaps since the length is known to
> > be short, and doesn't change after allocation?
> So a few interesting things have to be dealt with if we want to make this change.
> I already mentioned the need to bias based on ref->offset so that the range
> of bytes we're tracking is represented 0..size.
> 
> While we know the length of the potential dead store we don't know the
> length of the subsequent stores that we're hoping make the original a dead
> store.  Thus when we start clearing LIVE_BYTES based on those subsequent
> stores, we have to normalize those against the ref->offset of the original
> store.
> 
> What's even more fun is sizing.  Those subsequent stores may be considerably
> larger.  Which means that a bitmap_clear_range call has to be a hell of a
> lot more careful when working with sbitmaps (we just happily stomp all over
> memory in that case) whereas with a bitmap the right things will "just happen".
> 
> On the positive side, since we've normalized the potential dead store's byte
> range to 0..size, it means computing trims is easier because we inherently
> know how many bits were originally set.  So compute_trims becomes trivial
> and we can simplify trim_complex_store a bit as well.
> 
> And, of course, we don't have a bitmap_{clear,set}_range or a
> bitmap_count_bits implementation for sbitmaps.
> 
> 
> It's all a bit of "ugh", but should be manageable.

yeah, but that seems like enough work that you could reasonably stick
with bitmap instead.
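
The normalization and clamping issue in the quoted text can be shown with a toy fixed-size liveness map. This is a sketch only: it does not use GCC's bitmap or sbitmap APIs, all names are hypothetical, and the 64-byte limit simply echoes the "64 bytes or less" remark in the quoted patch comment.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_TRACKED 64

/* Liveness of the bytes written by one candidate dead store.  Bit I
   stands for byte BASE_OFFSET + I, i.e. the range is biased to 0..SIZE.  */
struct live_bytes
{
  long base_offset;
  unsigned size;
  uint64_t bits;
};

/* Mask with COUNT bits set starting at bit START.  Requires COUNT >= 1
   and START + COUNT <= 64.  */
static uint64_t
mask_range (unsigned start, unsigned count)
{
  uint64_t m = count == 64 ? ~UINT64_C (0) : (UINT64_C (1) << count) - 1;
  return m << start;
}

/* Start tracking a store of SIZE bytes at OFFSET; fail if too large.  */
bool
live_bytes_init (struct live_bytes *lb, long offset, unsigned size)
{
  if (size == 0 || size > MAX_TRACKED)
    return false;
  lb->base_offset = offset;
  lb->size = size;
  lb->bits = mask_range (0, size);
  return true;
}

/* A later store overwrites ST_SIZE bytes at ST_OFFSET.  Normalize it
   against the tracked store and clamp it to 0..SIZE, so a much larger
   later store cannot run off the end of the fixed-size map.  */
void
live_bytes_clear (struct live_bytes *lb, long st_offset, long st_size)
{
  long lo = st_offset - lb->base_offset;
  long hi = lo + st_size;

  if (lo < 0)
    lo = 0;
  if (hi > (long) lb->size)
    hi = (long) lb->size;
  if (lo < hi)
    lb->bits &= ~mask_range ((unsigned) lo, (unsigned) (hi - lo));
}

/* The original store is fully dead once no tracked byte is live.  */
bool
live_bytes_dead_p (const struct live_bytes *lb)
{
  return lb->bits == 0;
}
```

The clamping in live_bytes_clear is the extra care a fixed-size map needs; a growable bitmap would tolerate an out-of-range clear without it.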

Trev

p.s. sorry I've been falling behind lately.

> 
> Jeff


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Andre Vehreschild
> Now when I think about this some more, I have a vague recollection
> that a long time ago it used to be something like that.  The problem
> is that MIN_EXPR will of course be
> NON-CONSTANT, so the memcpy call can't be inlined. Hence it was
> changed to two separate __builtin_memmove() calls to have better
> opportunity to inline. So probably a no-go to change it back. :(

I don't get that. From the former only one of the memmove's could have been
inlined assuming that only CONSTANT sizes are inlined. The other one had a
NON-CONSTANT as long as pointer following and constant propagation was not
effective together. In our case the "NON-CONSTANT branch" would have been used,
which is resolved by constant propagation (of the size of constant memory p
points to). I assume the warning is triggered, because dead-code elimination
has not removed the else part. 

Following this thought the MIN_EXPR would be propagated to 5 and the inliner
can do its magic. Albeit it may be that now some other optimization level will
trigger a warning, because some part has not been removed/constant replaced.
What do you think of that?
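
To make the two shapes under discussion easier to compare, here is a sketch in C. It assumes the code in question is a Fortran-style character assignment (copy, then pad the destination with blanks); that padding step and the function names are assumptions for illustration and are not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Shape A: a single memmove whose length is a MIN of both lengths.
   The length is only constant if the MIN itself folds to a constant.  */
void
assign_min (char *dst, size_t dlen, const char *src, size_t slen)
{
  size_t n = dlen < slen ? dlen : slen;

  memmove (dst, src, n);
  memset (dst + n, ' ', dlen - n);
}

/* Shape B: two branches with one memmove each.  In each branch the
   length is one of the original operands, so it is constant whenever
   that operand is.  */
void
assign_branch (char *dst, size_t dlen, const char *src, size_t slen)
{
  if (slen < dlen)
    {
      memmove (dst, src, slen);
      memset (dst + slen, ' ', dlen - slen);
    }
  else
    memmove (dst, src, dlen);
}
```

Both functions produce the same result; they differ only in what the optimizer sees as the length argument before constant propagation has run.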

-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: Fix for libstdc++-v3's error_constants.h for MinGW-W64

2016-12-21 Thread Jonathan Wakely

On 16/12/16 20:23 +0000, Jonathan Wakely wrote:
> On 16/12/16 16:28 +0300, niXman wrote:
>> Jonathan Wakely 2016-12-16 16:04:
>>
>>> I don't think this is suitable for the branches, but could be applied
>>> to trunk (as the patch was posted during stage 1, but I missed it).
>>
>> Ok.
>>
>>> Does this require a particular version of MinGW-w64?
>>
>> Yes, at the moment MinGW-W64 trunk is required. (MinGW-W64-v6)
>>
>>> Does it work for older versions?
>>
>> No.
>
> OK, how about this instead then?
>
> This unconditionally enables all the error codes supported by
> mingw-w64 v5, and uses configure checks to detect the ones that are
> only available on trunk. This means it will still work for v5, but if
> you use trunk you get them all.


I'm committing this. It should work for mingw-w64 v3.0.0 and later,
but some constants are only defined for mingw-w64 trunk.
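
The pattern used by the patch (some enumerators defined unconditionally, others only when the platform provides the errno value) can be sketched in plain C as below. This is an illustration only: the enum and function names are made up, and the sketch tests the errno macros directly, whereas the real header relies on _GLIBCXX_HAVE_* macros produced by configure.

```c
#include <errno.h>

enum errc_sketch
{
  /* EDOM and EILSEQ are required by ISO C, so no guard is needed.  */
  errc_argument_out_of_domain = EDOM,
  errc_illegal_byte_sequence = EILSEQ,
  /* These two are optional; define the enumerator only if the C
     library provides the errno value.  */
#ifdef EBADMSG
  errc_bad_message = EBADMSG,
#endif
#ifdef EIDRM
  errc_identifier_removed = EIDRM,
#endif
  errc_sketch_end = 0
};

/* Report whether this build has the optional bad_message constant.  */
int
errc_has_bad_message (void)
{
#ifdef EBADMSG
  return 1;
#else
  return 0;
#endif
}
```

Code that uses one of the optional enumerators needs the same guard, which is why the testsuite change in the ChangeLog wraps the no_message check in _GLIBCXX_HAVE_ENOMSG.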

Tested x86_64-linux, committed to trunk.


commit 5615ca1afbe84adb8335c7b6e44b4619ceac129d
Author: Jonathan Wakely 
Date:   Wed Dec 21 12:44:14 2016 +0000

PR 71444 define more error constants for mingw-w64

	PR libstdc++/71444
	* config/os/mingw32-w64/error_constants.h
	(address_family_not_supported, address_in_use, address_not_available)
	(already_connected, connection_aborted, connection_already_in_progress)
	(connection_refused, connection_reset, cross_device_link)
	(destination_address_required, host_unreachable, message_size)
	(network_down, network_reset, network_unreachable, no_buffer_space)
	(no_protocol_option, not_a_socket, not_connected, operation_canceled)
	(operation_in_progress, operation_not_supported, protocol_error)
	(protocol_not_supported, too_many_links, too_many_symbolic_link_levels)
	(value_too_large, wrong_protocol_type): Define.
	(bad_message, identifier_removed, no_link, no_message_available)
	(no_message, no_stream_resources, not_a_stream, owner_dead)
	(state_not_recoverable, stream_timeout, text_file_busy): Define
	conditionally.
	* testsuite/19_diagnostics/headers/system_error/errc_std_c++0x.cc:
	Guard test for no_message with _GLIBCXX_HAVE_ENOMSG.

diff --git a/libstdc++-v3/config/os/mingw32-w64/error_constants.h b/libstdc++-v3/config/os/mingw32-w64/error_constants.h
index 5cbf63c..f100373 100644
--- a/libstdc++-v3/config/os/mingw32-w64/error_constants.h
+++ b/libstdc++-v3/config/os/mingw32-w64/error_constants.h
@@ -41,22 +41,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 // replaced by Winsock WSA-prefixed equivalents.
   enum class errc
 {
-//address_family_not_supported = 		EAFNOSUPPORT,
-//address_in_use = EADDRINUSE,
-//address_not_available = 			EADDRNOTAVAIL,
-//already_connected = 			EISCONN,
+  address_family_not_supported = 		EAFNOSUPPORT,
+  address_in_use = EADDRINUSE,
+  address_not_available = 			EADDRNOTAVAIL,
+  already_connected = 			EISCONN,
   argument_list_too_long = 			E2BIG,
   argument_out_of_domain = 			EDOM,
   bad_address = EFAULT,
   bad_file_descriptor = 			EBADF,
-//bad_message = EBADMSG,
+#ifdef _GLIBCXX_HAVE_EBADMSG
+  bad_message = EBADMSG,
+#endif
   broken_pipe = EPIPE,
-//connection_aborted = 			ECONNABORTED,
-//connection_already_in_progress = 		EALREADY,
-//connection_refused = 			ECONNREFUSED,
-//connection_reset = 			ECONNRESET,
-//cross_device_link = 			EXDEV,
-//destination_address_required = 		EDESTADDRREQ,
+  connection_aborted = 			ECONNABORTED,
+  connection_already_in_progress = 		EALREADY,
+  connection_refused = 			ECONNREFUSED,
+  connection_reset = 			ECONNRESET,
+  cross_device_link = 			EXDEV,
+  destination_address_required = 		EDESTADDRREQ,
   device_or_resource_busy = 		EBUSY,
   directory_not_empty = 			ENOTEMPTY,
   executable_format_error = 		ENOEXEC,
@@ -64,8 +66,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   file_too_large = EFBIG,
   filename_too_long = 			ENAMETOOLONG,
   function_not_supported = 			ENOSYS,
-//host_unreachable = 			EHOSTUNREACH,
-//identifier_removed = 			EIDRM,
+  host_unreachable = 			EHOSTUNREACH,
+#ifdef _GLIBCXX_HAVE_EIDRM
+  identifier_removed = 			EIDRM,
+#endif
   illegal_byte_sequence = 			EILSEQ,
   inappropriate_io_control_operation = 	ENOTTY,
   interrupted = EINTR,
@@ -73,67 +77,84 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   invalid_seek = ESPIPE,
   io_error = EIO,
   is_a_directory = EISDIR,
-//message_size = EMSGSIZE,
-//network_down = ENETDOWN,
-//network_reset = ENETRESET,
-//network_unreachable = 			ENETUNREACH,
-//no_buffer_space = 			ENOBUFS,
+  message_size = EMSGSIZE,
+  network_down = ENETDOWN,
+  network_reset = ENETRESET,
+  network_unreachable = 			ENETUNREACH,
+  no_buffer_space = 			ENOBUFS,
 #ifdef _GLIBCXX_HAVE_ECHILD
   no_child_process = 			ECHILD,
 #endif
-//no_link = ENOLINK,
+#ifdef 

Re: [PATCH v2] combine: Improve change_zero_ext, call simplify_set afterwards.

2016-12-21 Thread Georg-Johann Lay

On 12.12.2016 17:54, Segher Boessenkool wrote:

On Mon, Dec 12, 2016 at 05:46:02PM +0100, Dominik Vogt wrote:

Patch with these changes and a fix because of not handling
VOIDmode attached.  Bootstrapped and regression tested on s390 and
s390x.


Okay for trunk.

When did you see VOIDmode, btw?  It wasn't on a const_int I hope?


Segher



* combine.c (change_zero_ext): Handle mode expanding zero_extracts.


This was committed as r243578 and triggered (amongst other similar
test suite ICE failures):

$ avr-gcc 
/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c -S 
-O1 -mmcu=avr4 -S -v


/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c: In 
function 'yasm_lc3b__parse_insn':
/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1: 
error: insn does not satisfy its constraints:

 }
 ^
(jump_insn 58 98 59 9 (set (pc)
(if_then_else (eq (and:HI (reg:HI 31 r31)
(const_int 1 [0x1]))
(const_int 0 [0]))
(label_ref 70)
(pc))) 
"/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11 
415 {*sbrx_and_branchhi}

 (int_list:REG_BR_PROB 375 (nil))
 -> 70)
/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1: 
internal compiler error: in extract_constrain_insn, at recog.c:2213
0x9836a3 _fatal_insn(char const*, rtx_def const*, char const*, int, char 
const*)

../../../gcc.gnu.org/trunk/gcc/rtl-error.c:108
0x9836cf _fatal_insn_not_found(rtx_def const*, char const*, int, char 
const*)

../../../gcc.gnu.org/trunk/gcc/rtl-error.c:119
0x95abdd extract_constrain_insn(rtx_insn*)
../../../gcc.gnu.org/trunk/gcc/recog.c:2213
0x939105 reload_cse_simplify_operands
../../../gcc.gnu.org/trunk/gcc/postreload.c:391
0x939ce5 reload_cse_simplify
../../../gcc.gnu.org/trunk/gcc/postreload.c:179
0x939ce5 reload_cse_regs_1
../../../gcc.gnu.org/trunk/gcc/postreload.c:218
0x93b96b reload_cse_regs
../../../gcc.gnu.org/trunk/gcc/postreload.c:64
0x93b96b execute
../../../gcc.gnu.org/trunk/gcc/postreload.c:2342
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.


Target: avr
Configured with: ../../gcc.gnu.org/trunk/configure --target=avr 
--prefix=/local/gnu/install/gcc-7 --disable-shared --disable-nls 
--with-dwarf2 --enable-target-optspace=yes --with-gnu-as --with-gnu-ld 
--enable-checking=release --enable-languages=c,c++



Combine comes up with the following insn:
(jump_insn 58 57 59 7 (set (pc)
(if_then_else (eq (and:HI (subreg:HI (mem:QI (reg/v/f:HI 75 [ 
operands ]) [1 *operands_17(D)+0 S1 A8]) 0)

(const_int 1 [0x1]))
(const_int 0 [0]))
(label_ref 70)
(pc))) 
"/home/georg/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11 
415 {*sbrx_and_branchhi}

 (int_list:REG_BR_PROB 375 (nil))
 -> 70)

which cannot be correct because avr.md::*sbrx_and_branchhi reads:

(define_insn "*sbrx_and_branch"
  [(set (pc)
(if_then_else
 (match_operator 0 "eqne_operator"
 [(and:QISI
   (match_operand:QISI 1 "register_operand" "r")
   (match_operand:QISI 2 "single_one_operand" "n"))
  (const_int 0)])
 (label_ref (match_operand 3 "" ""))
 (pc)))]
  "" { ... } ...)

Hence we have a memory operand (subreg of mem) where only a register is 
allowed.  Reg alloc then reloads the mem:QI into R31, but R31 is the
last hard reg, i.e. R31 cannot hold HImode.
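
To illustrate the last point, here is a minimal sketch, assuming the usual AVR layout of 32 8-bit hard registers R0..R31 in which a multi-byte value occupies consecutive registers. The helper name `regno_can_hold` is hypothetical; it is not the avr backend's actual hook and it ignores any further constraints the backend may impose.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 32 general 8-bit hard registers, R0..R31.  */
#define LAST_HARD_REG 31

/* Hypothetical helper: can a value of SIZE bytes start in register REGNO?
   A multi-byte value occupies REGNO .. REGNO + SIZE - 1, so the last
   byte must still fall inside the register file.  */
static bool
regno_can_hold (int regno, int size)
{
  assert (regno >= 0 && size > 0);
  return regno + size - 1 <= LAST_HARD_REG;
}
```

With this model a 1-byte (QImode) value fits in R31, while a 2-byte (HImode) value starting at R31 would need a nonexistent R32.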

Johann




Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Janne Blomqvist
On Wed, Dec 21, 2016 at 1:50 PM, Andre Vehreschild  wrote:
>> Here p is the character variable, and _p is the charlen. My guess is
>> that the problem is that with -O1 it sees that the second memmove
>> would overflow p, but it doesn't realize that branch is never taken.
>> Cranking up the optimization level to -O2 and beyond makes it realize
>> it, and thus the warning disappears.
>>
>> Perhaps one could rewrite that to something like
>>
>> __builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1 sz: 1},
>> MIN_EXPR<(unsigned long) D.3598,8>);
>> if ((unsigned long) D.3598 > 8)
>>   {
>>   __builtin_memset ((void*) *p + 8, 32, D.3599);
>>   }
>
> That looks interesting. It assumes though, that D.3598 will *never* be
> negative. Because when it is negative 8 characters (cast to unsigned makes the
> negative number huge) will be copied, while in the former code memmove will
> reject the copying of a negative number of bytes. Therefore I propose to omit
> the cast in the MIN_EXPR and make the constant 8 signed, too. That should
> comply and mimic the former behavior more closely. What do you think? Who's
> going to try?

Now that I think about this some more, I have a vague recollection
that a long time ago it used to be something like that.  The problem
is that MIN_EXPR will of course be NON-CONSTANT, so the memcpy call
can't be inlined. Hence it was changed to two separate
__builtin_memmove() calls to have better opportunity to inline. So
probably a no-go to change it back. :(

-- 
Janne Blomqvist


Re: [PATCH][ARM] Updating testcase unsigned-extend-2.c

2016-12-21 Thread Kyrill Tkachov

Hi Andre,

On 21/06/16 15:16, Andre Vieira (lists) wrote:

Hello,

After some changes to GCC this test no longer tests the desired code
generation behavior. The generated assembly is better than it used to
be, but it has become too smart. I add an extra parameter to make sure
GCC can't optimize away the loop.

Tested for arm-none-eabi-gcc with a Cortex-M3 target.

Is this OK?


Ok if this passes on an A-profile target as well (-march=armv7-a or some such).
Sorry for missing this.
Thanks,
Kyrill


Cheers,
Andre

gcc/ChangeLog
2016-06-21  Andre Vieira  

 * gcc.target/arm/unsigned-extend-2.c: Update testcase.




[Ping 2][PATCH][ARM] Updating testcase unsigned-extend-2.c

2016-12-21 Thread Andre Vieira (lists)
On 12/12/16 14:20, Andre Vieira (lists) wrote:
> On 21/06/16 15:16, Andre Vieira (lists) wrote:
>> Hello,
>>
>> After some changes to GCC this test no longer tests the desired code
>> generation behavior. The generated assembly is better than it used to
>> be, but it has become too smart. I add an extra parameter to make sure
>> GCC can't optimize away the loop.
>>
>> Tested for arm-none-eabi-gcc with a Cortex-M3 target.
>>
>> Is this OK?
>>
>> Cheers,
>> Andre
>>
>> gcc/ChangeLog
>> 2016-06-21  Andre Vieira  
>>
>> * gcc.target/arm/unsigned-extend-2.c: Update testcase.
>>
> 
> Ping.
> 
Ping.


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Andre Vehreschild
> Here p is the character variable, and _p is the charlen. My guess is
> that the problem is that with -O1 it sees that the second memmove
> would overflow p, but it doesn't realize that branch is never taken.
> Cranking up the optimization level to -O2 and beyond makes it realize
> it, and thus the warning disappears.
> 
> Perhaps one could rewrite that to something like
> 
> __builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1 sz: 1},
> MIN_EXPR<(unsigned long) D.3598,8>);
> if ((unsigned long) D.3598 > 8)
>   {
>   __builtin_memset ((void*) *p + 8, 32, D.3599);
>   }

That looks interesting. It assumes though, that D.3598 will *never* be
negative. Because when it is negative 8 characters (cast to unsigned makes the
negative number huge) will be copied, while in the former code memmove will
reject the copying of a negative number of bytes. Therefore I propose to omit
the cast in the MIN_EXPR and make the constant 8 signed, too. That should
comply and mimic the former behavior more closely. What do you think? Who's
going to try?
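
The signedness concern can be sketched in plain C. This only models the arithmetic of the clamp; the function names are made up for illustration, and what a caller would then do with a negative length is left open.

```c
#include <assert.h>
#include <stddef.h>

/* Clamp as in the proposed MIN_EXPR with the cast to unsigned:
   a negative LEN wraps to a huge value, so the minimum is 8.  */
static size_t
clamp_unsigned (long len)
{
  unsigned long ulen = (unsigned long) len;
  return ulen < 8 ? ulen : 8;
}

/* Signed variant: a negative LEN stays negative and is the minimum.  */
static long
clamp_signed (long len)
{
  return len < 8 ? len : 8;
}
```

For a length of -1 the unsigned clamp yields 8 (eight characters would be copied), whereas the signed clamp yields -1.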

-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


C++ patch ping

2016-12-21 Thread Jakub Jelinek
Hi!

I'd like to ping the PR77830 fix for out of bounds constexpr stores:
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01319.html

Jakub


RE: [PATCH, testsuite] MIPS: Relax instruction order check in msa-builtins.c.

2016-12-21 Thread Toma Tabacu
> Catherine Moore writes:
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/mips/msa-builtins.c (dg-final): Tweak regex for the
> > 32-bit
> > insert.d case.
> 
> Please change to:
>   * gcc.target/mips-msa-builtins.c (msa_insert_d): Tweak expected
> output.
> 
> Okay with that change.
> Thanks,
> Catherine
> 

Committed as r243848.

Regards,
Toma


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Janne Blomqvist
On Wed, Dec 21, 2016 at 12:33 PM, Andre Vehreschild  wrote:
> Hi Janne,
>
>> But yes, I'm still seeing the warning messages with -O1 for
>> allocate_deferred_char_scalar_1.f03. AFAICT it's a bogus warning, but
>> I don't know how to get rid of it..
>
> No, that warning isn't at all bogus. The warning in fact is astonishingly
> precise. When I remember correctly, then the warning complains about trying to
> put a string of length 8 into memory of length 5. There is never a memory
> access error at runtime, because the code generated ensures that only 5 chars
> are copied, but I am impressed by the analysis done by some intermediate step
> of gcc. It figures, that memory is available for 5 characters only derefing a
> "static/constant" pointer and then learning that initially 8 chars are to be
> copied. I already tried to fix this by only generating code to copy the 5
> characters and make this knowledge available to the gimplifier, but I failed to
> deref the pointer and get the information statically. So IMHO the warning is
> not bogus. It is absolutely correct and it is quite sophisticated to learn all
> the necessary facts, but I didn't find a way to get this done in the front-end.
> We might be able to prevent the warning when there is a chance to add some hook
> into the middle stages of the compiler, telling it, that everything is fine.
> But I have no idea what is possible and available there.

I suspect it's complaining about (from the -fdump-tree-original):

  {
integer(kind=8) D.3598;
unsigned long D.3599;

D.3598 = *_p;
D.3599 = (unsigned long) D.3598 + 18446744073709551608;
if (D.3598 != 0)
  {
if ((unsigned long) D.3598 <= 8)
  {
__builtin_memmove ((void *) *p, (void *)
&"12345679"[1]{lb: 1 sz: 1}, (unsigned long) D.3598);
  }
else
  {
__builtin_memmove ((void *) *p, (void *)
&"12345679"[1]{lb: 1 sz: 1}, 8);
__builtin_memset ((void *) *p + 8, 32, D.3599);
  }
  }
  }

Here p is the character variable, and _p is the charlen. My guess is
that the problem is that with -O1 it sees that the second memmove
would overflow p, but it doesn't realize that branch is never taken.
Cranking up the optimization level to -O2 and beyond makes it realize
it, and thus the warning disappears.

Perhaps one could rewrite that to something like

__builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1 sz: 1},
MIN_EXPR<(unsigned long) D.3598,8>);
if ((unsigned long) D.3598 > 8)
  {
  __builtin_memset ((void*) *p + 8, 32, D.3599);
  }


?
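
For comparison, the two shapes can be written as ordinary C. This is a sketch under assumptions, not the code gfortran emits: `dst` and `len` stand in for `*p` and `D.3598`, the source is taken to be exactly 8 bytes, and `len - 8` plays the role of `D.3599` (the dump's `+ 18446744073709551608` is `- 8` modulo 2^64).

```c
#include <assert.h>
#include <string.h>

/* Branchy form, following the dump: copy LEN bytes if the destination is
   no longer than the 8-byte source, else copy 8 and pad with blanks.  */
static void
assign_branchy (char *dst, size_t len, const char *src8)
{
  if (len != 0)
    {
      if (len <= 8)
        memmove (dst, src8, len);
      else
        {
          memmove (dst, src8, 8);
          memset (dst + 8, ' ', len - 8);
        }
    }
}

/* Proposed form: one memmove with a clamped length, then optional pad.  */
static void
assign_min (char *dst, size_t len, const char *src8)
{
  memmove (dst, src8, len < 8 ? len : 8);
  if (len > 8)
    memset (dst + 8, ' ', len - 8);
}
```

Both functions write the same bytes for any destination length; the difference is only that the second passes a non-constant length to its single memmove call.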



-- 
Janne Blomqvist


Re: [libgfortran, patch] Remove unused headers

2016-12-21 Thread FX
> Ok. Watch out for regressions on other targets though. I have more
> than once caused regressions by using some function that happened to
> be included in the existing headers on glibc, but needed some other
> header on other targets.

Committed: https://gcc.gnu.org/viewcvs/gcc?view=revision=243844
I’ve tried to choose based on standards and not glibc, but yes, I will watch 
out for regressions.


> Yes, sounds reasonable to include stdlib.h, yes.

I have committed a patch handling stdlib.h: 
https://gcc.gnu.org/viewcvs/gcc?view=revision=243846

I have also committed a patch removing some unused limits.h in minloc/maxloc, 
limits.h and errno.h from runtime/minimal.c, and some includes of headers 
already pulled in from libgfortran.h (math.h from norm and parity, sys/types.h 
from random.c). Commit message: 
https://gcc.gnu.org/viewcvs/gcc?view=revision=243847

I have not touched caf/* files, as duplication of includes there seems 
intentional.

FX

Re: [PATCH] Don't bootstrap libmpx unless needed

2016-12-21 Thread Richard Biener
On December 20, 2016 7:59:11 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>Similarly how we deal with bootstrapping libsanitizer only when
>doing bootstrap-{a,u}san bootstrap, this avoids bootstrapping libmpx
>if we don't need it for bootstrapping.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

>2016-12-20  Jakub Jelinek  
>
>   * configure.ac: Don't bootstrap libmpx unless --with-build-config
>   includes bootstrap-mpx.
>   * configure: Regenerated.
>
>--- configure.ac.jj1   2016-12-01 23:24:53.0 +0100
>+++ configure.ac   2016-12-20 10:50:08.715213438 +0100
>@@ -2643,9 +2643,14 @@ if echo " ${target_configdirs} " | grep
>   bootstrap_target_libs=${bootstrap_target_libs}target-libvtv,
> fi
> 
>-# If we are building libmpx, bootstrap it.
>+# If we are building libmpx and $BUILD_CONFIG contains bootstrap-mpx,
>+# bootstrap it.
>if echo " ${target_configdirs} " | grep " libmpx " > /dev/null 2>&1;
>then
>-  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
>+  case "$BUILD_CONFIG" in
>+*bootstrap-mpx* )
>+  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
>+  ;;
>+  esac
> fi
> 
> # Determine whether gdb needs tk/tcl or not.
>--- configure.jj1  2016-12-02 00:15:10.0 +0100
>+++ configure  2016-12-20 10:50:22.503034682 +0100
>@@ -7057,9 +7057,14 @@ if echo " ${target_configdirs} " | grep
>   bootstrap_target_libs=${bootstrap_target_libs}target-libvtv,
> fi
> 
>-# If we are building libmpx, bootstrap it.
>+# If we are building libmpx and $BUILD_CONFIG contains bootstrap-mpx,
>+# bootstrap it.
>if echo " ${target_configdirs} " | grep " libmpx " > /dev/null 2>&1;
>then
>-  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
>+  case "$BUILD_CONFIG" in
>+*bootstrap-mpx* )
>+  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
>+  ;;
>+  esac
> fi
> 
> # Determine whether gdb needs tk/tcl or not.
>
>   Jakub




Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Andre Vehreschild
Hi Janne,

> But yes, I'm still seeing the warning messages with -O1 for
> allocate_deferred_char_scalar_1.f03. AFAICT it's a bogus warning, but
> I don't know how to get rid of it..

No, that warning isn't bogus at all. In fact, the warning is astonishingly
precise. If I remember correctly, it complains about trying to put a string
of length 8 into memory of length 5. There is never a memory access error at
runtime, because the generated code ensures that only 5 chars are copied, but
I am impressed by the analysis done by some intermediate step of gcc. It
figures out that memory is available for only 5 characters by dereferencing a
"static/constant" pointer, and then learns that initially 8 chars are to be
copied. I already tried to fix this by generating code that copies only the 5
characters and making this knowledge available to the gimplifier, but I failed
to dereference the pointer and get the information statically. So IMHO the
warning is not bogus. It is absolutely correct, and learning all the necessary
facts is quite sophisticated, but I didn't find a way to get this done in the
front-end. We might be able to prevent the warning if there is a chance to add
some hook into the middle stages of the compiler, telling it that everything
is fine. But I have no idea what is possible and available there.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] Do not suggest -fsanitize=all (PR driver/78863).

2016-12-21 Thread Jakub Jelinek
On Wed, Dec 21, 2016 at 11:20:33AM +0100, Martin Liška wrote:
> I like your approach!
> make check -k -j10 RUNTESTFLAGS="dg.exp=spellcheck-options-*" works fine.
> 
> May I install the patch after it survives proper regression tests?

Ok.

Also, only related, seems we have misspelling candidates for cases like
-fsanitiz=ell
but not for -fsanitize=ell
(i.e. when the option is actually correct, just the argument to it (or part
of it) is misspelled).  It would need to be done probably in
parse_sanitizer_options when we diagnose it:
  if (! found && complain)
error_at (loc, "unrecognized argument to -fsanitize%s= option: %q.*s",
  code == OPT_fsanitize_ ? "" : "-recover", (int) len, p);
go through sanitizer_opts again in that case, add candidates (that are
valid for the particular option), and if there is a hint, add the hint to
this message.
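
A rough sketch of that idea in standalone C follows. Everything here is an assumption made for illustration: the candidate list is a small made-up subset (deliberately without "all", since only valid arguments should be offered for the positive option), the names `edit_distance` and `suggest_arg` are hypothetical, and a plain Levenshtein distance stands in for GCC's own spellchecking support.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of argument names; not GCC's sanitizer_opts table.  */
static const char *const candidates[]
  = { "address", "thread", "leak", "undefined", NULL };

/* Plain Levenshtein distance between S and the first TLEN bytes of T,
   using a single rolling row; assumes both lengths are below 64.  */
static size_t
edit_distance (const char *s, const char *t, size_t tlen)
{
  size_t slen = strlen (s);
  size_t row[64];
  assert (slen < 64 && tlen < 64);
  for (size_t j = 0; j <= tlen; j++)
    row[j] = j;
  for (size_t i = 1; i <= slen; i++)
    {
      size_t diag = row[0];	/* Distance for (i-1, j-1).  */
      row[0] = i;
      for (size_t j = 1; j <= tlen; j++)
	{
	  size_t up = row[j];	/* Distance for (i-1, j).  */
	  size_t best = diag + (s[i - 1] != t[j - 1]);
	  if (up + 1 < best)
	    best = up + 1;
	  if (row[j - 1] + 1 < best)
	    best = row[j - 1] + 1;
	  diag = up;
	  row[j] = best;
	}
    }
  return row[tlen];
}

/* Return the candidate closest to the LEN-byte argument P, or NULL if
   nothing is within MAX_DIST edits.  */
static const char *
suggest_arg (const char *p, size_t len, size_t max_dist)
{
  const char *best = NULL;
  size_t best_dist = max_dist + 1;
  for (size_t i = 0; candidates[i] != NULL; i++)
    {
      size_t d = edit_distance (candidates[i], p, len);
      if (d < best_dist)
	{
	  best_dist = d;
	  best = candidates[i];
	}
    }
  return best;
}
```

The (pointer, length) interface mirrors the `(int) len, p` pair already passed to the diagnostic, so a hint could be computed for one comma-separated piece of the argument without copying it.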

Jakub


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Janne Blomqvist
On Wed, Dec 21, 2016 at 12:05 PM, Andre Vehreschild  wrote:
> Hi all,
>
> so I have learned that proposing to write "speaking code" is not very well
> taken.

If you want to make a patch introducing gfc_size_t_zero_node, go
ahead, at least I won't object.  I don't think
build_zero_cst(size_type_node) is that terrible myself, but I don't
have any hard opinions on this.

> Anyway, there is a patch (in two parts) hanging about changing the character
> length from int to size_t. It looks ok to me, but I do not have the privilege
> to ok it. Furthermore, I am still not convinced that we can't do anything about
> the failing testcase allocate_deferred_char_scalar_1. So how do we proceed?

I have just verified that my fix for PR 78867 fixes the -flto failures
Dominique noticed. I have some other minor cleanup to do to the
charlen->size_t patch, and then I'll resubmit it.

But yes, I'm still seeing the warning messages with -O1 for
allocate_deferred_char_scalar_1.f03. AFAICT it's a bogus warning, but
I don't know how to get rid of it..


-- 
Janne Blomqvist


Re: [PATCH] Do not suggest -fsanitize=all (PR driver/78863).

2016-12-21 Thread Martin Liška
On 12/21/2016 11:00 AM, Jakub Jelinek wrote:
> On Wed, Dec 21, 2016 at 10:34:13AM +0100, Martin Liška wrote:
>> As mentioned in the PR, we should not suggest option that is not allowed.
>> Fixed by explicit removal of suggestions that are not acceptable.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
> 
> Wouldn't it be better not to register those? Like (untested):
> 
> --- gcc/gcc.c.jj  2016-11-14 19:57:10.0 +0100
> +++ gcc/gcc.c 2016-12-21 10:58:29.739873850 +0100
> @@ -7733,6 +7733,17 @@ driver::build_option_suggestions (void)
> {
>   for (int j = 0; sanitizer_opts[j].name != NULL; ++j)
> {
> + struct cl_option optb;
> + /* -fsanitize=all is not valid, only -fno-sanitize=all.
> +So don't register the positive misspelling candidates
> +for it.  */
> + if (sanitizer_opts[j].flag == ~0U && i == OPT_fsanitize_)
> +   {
> + optb = *option;
> + optb.opt_text = opt_text = "-fno-sanitize=";
> + optb.cl_reject_negative = true;
> + option = 
> +   }
>   /* Get one arg at a time e.g. "-fsanitize=address".  */
>   char *with_arg = concat (opt_text,
>sanitizer_opts[j].name,
> 
> 
>   Jakub
> 

I like your approach!
make check -k -j10 RUNTESTFLAGS="dg.exp=spellcheck-options-*" works fine.

May I install the patch after it survives proper regression tests?
Thanks,
Martin
From 2533c19b4bbd2d9900b043973b504be07343d05c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 20 Dec 2016 12:16:02 +0100
Subject: [PATCH] Do not suggest -fsanitize=all (PR driver/78863).

gcc/ChangeLog:

2016-12-20  Jakub Jelinek  
	Martin Liska  

	PR driver/78863
	* gcc.c (driver::build_option_suggestions): Do not add
	-fsanitize=all as a suggestion candidate.

gcc/testsuite/ChangeLog:

2016-12-20  Martin Liska  

	PR driver/78863
	* gcc.dg/spellcheck-options-13.c: New test.
---
 gcc/gcc.c| 11 +++
 gcc/testsuite/gcc.dg/spellcheck-options-13.c |  5 +
 2 files changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-options-13.c

diff --git a/gcc/gcc.c b/gcc/gcc.c
index f78acd68606..69089484340 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -7733,6 +7733,17 @@ driver::build_option_suggestions (void)
 	  {
 	for (int j = 0; sanitizer_opts[j].name != NULL; ++j)
 	  {
+		struct cl_option optb;
+		/* -fsanitize=all is not valid, only -fno-sanitize=all.
+		   So don't register the positive misspelling candidates
+		   for it.  */
+		if (sanitizer_opts[j].flag == ~0U && i == OPT_fsanitize_)
+		  {
+		optb = *option;
+		optb.opt_text = opt_text = "-fno-sanitize=";
+		optb.cl_reject_negative = true;
+		option = 
+		  }
 		/* Get one arg at a time e.g. "-fsanitize=address".  */
 		char *with_arg = concat (opt_text,
 	 sanitizer_opts[j].name,
diff --git a/gcc/testsuite/gcc.dg/spellcheck-options-13.c b/gcc/testsuite/gcc.dg/spellcheck-options-13.c
new file mode 100644
index 000..19b63af565b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/spellcheck-options-13.c
@@ -0,0 +1,5 @@
+/* PR driver/78863.  */
+
+/* { dg-do compile } */
+/* { dg-options "-fsanitize" } */
+/* { dg-error "unrecognized command line option .-fsanitize..$" "" { target *-*-* } 0 } */
-- 
2.11.0



Re: [i386] New lea patterns for PR71321

2016-12-21 Thread Uros Bizjak
On Tue, Dec 20, 2016 at 10:29 PM, Bernd Schmidt  wrote:
> The problem here is that we don't have complete coverage of lea patterns for
> HImode/QImode: the combiner can't recognize a (plus (ashift reg 2) reg)
> pattern it builds.
>
> My first idea was to canonicalize ASHIFT by constant inside PLUS to MULT.
> The docs say that this is done only inside a MEM context, but that seems
> misguided considering the existence of lea patterns. Maybe that's something
> to revisit for gcc-8. In the meantime, the patch below is more like a
> minimal fix, and it seems to produce better results at the moment anyway.

Actually, as mentioned in ix86_decompose_address, we have to handle
ASHIFT anyway, since combine is quite creative when creating various
combinations of PLUS, MULT and ASHIFT in the LEA context. But since
ASHIFT handling is relatively minor functional addition, we just
handle everything in the decomposition, with the side effect that
combine sometimes doesn't canonicalize the combinations involving
ASHIFT.
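
As a simplified illustration of handling both forms in one place, here is a sketch in standalone C. It is not ix86_decompose_address; the struct and function names are hypothetical, and the only hardware fact relied on is that lea accepts index scales of 1, 2, 4 and 8.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified address parts: base + index * scale.  */
struct addr_parts
{
  unsigned base;
  unsigned index;
  unsigned scale;
};

/* Accept an index given either as a shift count (ASHIFT form) or as a
   multiplier (MULT form) and normalize it to a scale factor.  */
static bool
decompose (unsigned base, unsigned index, unsigned amount, bool is_shift,
	   struct addr_parts *out)
{
  unsigned scale;
  if (is_shift)
    {
      if (amount > 3)
	return false;
      scale = 1u << amount;
    }
  else
    scale = amount;
  if (scale != 1 && scale != 2 && scale != 4 && scale != 8)
    return false;
  out->base = base;
  out->index = index;
  out->scale = scale;
  return true;
}
```

A shift by 2 and a multiplication by 4 normalize to the same parts, which is why the decomposition can accept either form regardless of which one combine happens to produce.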

> Bootstrapped and tested on x86_64-linux. Ok?

OK.

Thanks,
Uros.


Re: [libgfortran, patch] Remove unused headers

2016-12-21 Thread Janne Blomqvist
On Wed, Dec 21, 2016 at 12:05 PM, FX  wrote:
> The attached patch removes some unused headers from libgfortran sources. It 
> removes 320 assert.h, 31 stdlib.h, 12 string.h, 4 errno.h, and the occasional 
> math.h, unistd.h, and limits.h.
>
> I have audited all files manually, checking for function calls. Patch was 
> bootstrapped and regtested on x86_64-apple-darwin16.3.0.
> OK to commit?

Ok. Watch out for regressions on other targets though. I have more
than once caused regressions by using some function that happened to
be included in the existing headers on glibc, but needed some other
header on other targets.


> PS: after that patch, the most often used header (not included in 
> libgfortran.h) is stdlib.h, used in 351 out of 436 library source files. 
> Maybe that warrants including it in libgfortran.h? (next are string.h with 
> 129 includes and limits.h with 113)

Yes, sounds reasonable to include stdlib.h, yes.

I have occasionally thought about splitting libgfortran.h, it's
annoying when one does some minor change affecting maybe 1 or 2 .o
files, but the entire libgfortran has to be rebuilt.. But beyond being
annoyed, I haven't done anything further about it.



-- 
Janne Blomqvist


Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-21 Thread Yuri Rumyantsev
Hi Richard,

I accidentally found a bug in my patch related to epilogue
vectorization without masking: the label needs to be put before the
initialization.

Could you please review it and integrate it into trunk? A test case is also attached.


Thanks in advance.
Yuri.

ChangeLog:
2016-12-21  Yuri Rumyantsev  

* tree-vectorizer.c (vectorize_loops): Put label before initialization
of loop_vectorized_call.

gcc/testsuite/

* gcc.dg/vect/vect-tail-nomask-2.c: New test.

2016-12-13 16:59 GMT+03:00 Richard Biener :
> On Mon, 12 Dec 2016, Yuri Rumyantsev wrote:
>
>> Richard,
>>
>> Could you please review cost model patch before to include it to
>> epilogue masking patch and add masking cost estimation as you
>> requested.
>
> That's just the middle-end / target changes.  I was not 100% happy
> with them but well, the vectorizer cost modeling needs work
> (aka another rewrite).
>
> From below...
>
>> Thanks.
>>
>> Patch and ChangeLog are attached.
>>
>> 2016-12-12 15:47 GMT+03:00 Yuri Rumyantsev :
>> > Hi Richard,
>> >
>> > You asked me about performance of spec2006 on AVX2 machine with new 
>> > feature.
>> >
>> > I tried the following on Haswell using original patch designed by Ilya.
>> > 1. Masking low trip count loops  only 6 benchmarks are affected and
>> > performance is almost the same
>> > 464.h264ref     63.9000  64.0000  +0.15%
>> > 416.gamess      42.9000  42.9000  +0%
>> > 435.gromacs     32.8000  32.7000  -0.30%
>> > 447.dealII      68.5000  68.3000  -0.29%
>> > 453.povray      61.9000  62.1000  +0.32%
>> > 454.calculix    39.8000  39.8000  +0%
>> > 465.tonto       29.9000  29.9000  +0%
>> >
>> > 2. epilogue vectorization without masking (use less vf) (3 benchmarks
>> > are not affected)
>> > 400.perlbench   47.2000  46.5000  -1.48%
>> > 401.bzip2       29.9000  29.9000  +0%
>> > 403.gcc         41.8000  41.6000  -0.47%
>> > 456.hmmer       32.0000  32.0000  +0%
>> > 462.libquantum  81.5000  82.0000  +0.61%
>> > 464.h264ref     65.0000  65.5000  +0.76%
>> > 471.omnetpp     27.8000  28.2000  +1.43%
>> > 473.astar       28.7000  28.6000  -0.34%
>> > 483.xalancbmk   48.7000  48.6000  -0.20%
>> > 410.bwaves      95.3000  95.3000  +0%
>> > 416.gamess      42.9000  42.8000  -0.23%
>> > 433.milc        38.8000  38.8000  +0%
>> > 434.zeusmp      51.7000  51.4000  -0.58%
>> > 435.gromacs     32.8000  32.8000  +0%
>> > 436.cactusADM   85.0000  83.0000  -2.35%
>> > 437.leslie3d    55.5000  55.5000  +0%
>> > 444.namd        31.3000  31.3000  +0%
>> > 447.dealII      68.7000  68.9000  +0.29%
>> > 450.soplex      47.3000  47.4000  +0.21%
>> > 453.povray      62.1000  61.4000  -1.12%
>> > 454.calculix    39.7000  39.3000  -1.00%
>> > 459.GemsFDTD    44.9000  45.0000  +0.22%
>> > 465.tonto       29.8000  29.8000  +0%
>> > 481.wrf         51.0000  51.2000  +0.39%
>> > 482.sphinx3     69.8000  71.2000  +2.00%
>
> I see 471.omnetpp and 482.sphinx3 are in a similar ballpark and it
> would be nice to catch the relevant case(s) with a cost model for
> epilogue vectorization without masking first (to get rid of
> --param vect-epilogues-nomask).
>
> As said elsewhere any non-conservative cost modeling (if the
> number of scalar iterations is not statically constant) might
> require versioning of the loop into a non-vectorized,
> short-trip vectorized and regular vectorized case (the Intel
> compiler does way more aggressive versioning IIRC).
>
> Richard.
>
>> > 3. epilogue vectorization using masking (4 benchmarks are not affected):
>> > 400.perlbench   47.5000  46.8000  -1.47%
>> > 401.bzip2       30.0000  29.9000  -0.33%
>> > 403.gcc         42.3000  42.3000  +0%
>> > 445.gobmk       32.1000  32.8000  +2.18%
>> > 456.hmmer       32.0000  32.0000  +0%
>> > 458.sjeng       36.1000  35.5000  -1.66%
>> > 462.libquantum  81.1000  81.1000  +0%
>> > 464.h264ref     65.4000  65.0000  -0.61%
>> > 483.xalancbmk   49.4000  49.3000  -0.20%
>> > 410.bwaves      95.9000  95.5000  -0.41%
>> > 416.gamess      42.8000  42.6000  -0.46%
>> > 433.milc        38.8000  39.1000  +0.77%
>> > 434.zeusmp      52.1000  51.3000  -1.53%
>> > 435.gromacs     32.9000  32.9000  +0%
>> > 436.cactusADM   78.8000  85.3000  +8.24%
>> > 437.leslie3d    55.4000  55.4000  +0%
>> > 444.namd        31.3000  31.3000  +0%
>> > 447.dealII      69.0000  69.2000  +0.28%
>> > 450.soplex      47.7000  47.6000  -0.20%
>> > 453.povray      62.2000  61.7000  -0.80%
>> > 454.calculix    39.7000  38.2000  -3.77%
>> > 459.GemsFDTD    44.9000  45.0000  +0.22%
>> > 465.tonto       29.8000  29.9000  +0.33%
>> > 481.wrf         51.2000  51.6000  +0.78%
>> > 482.sphinx3     70.3000  65.4000  -6.97%
>> >
>> > There is a good speed-up for 436, but there is a significant slowdown on 482
>> > and 454.
>> >
>> > So in general we don't have any advantage for AVX2.
>> >
>> > Best regards.
>> > 

[libgfortran, patch] Remove unused headers

2016-12-21 Thread FX
The attached patch removes some unused headers from libgfortran sources. It 
removes 320 assert.h, 31 stdlib.h, 12 string.h, 4 errno.h, and the occasional 
math.h, unistd.h, and limits.h.

I have audited all files manually, checking for function calls. Patch was 
bootstrapped and regtested on x86_64-apple-darwin16.3.0.
OK to commit?

FX


PS: after that patch, the most often used header (not included in 
libgfortran.h) is stdlib.h, used in 351 out of 436 library source files. Maybe 
that warrants including it in libgfortran.h? (next are string.h with 129 
includes and limits.h with 113)




headers.ChangeLog
Description: Binary data


headers.diff
Description: Binary data


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-21 Thread Andre Vehreschild
Hi all,

so I have learned that proposing to write "speaking code" is not very well
taken.

Anyway, there is a patch (in two parts) hanging about changing the character
length from int to size_t. It looks ok to me, but I do not have the privilege
to ok it. Furthermore, I am still not convinced that we can't do anything about
the failing testcase allocate_deferred_char_scalar_1. So how do we proceed?

- Andre

On Tue, 20 Dec 2016 17:08:54 +0100
Jakub Jelinek  wrote:

> On Tue, Dec 20, 2016 at 05:04:54PM +0100, Andre Vehreschild wrote:
> > Well, then how about:
> > 
> > #define gfc_size_t_zero_node build_int_cst (size_type_node, 0)
> > 
> > We can't get any faster and for new and old gfortran-hackers one
> > identifier's meaning is faster to grasp than two's.  
> 
> Such macros can cause various maintenance issues, so I'm not in favor of
> that.  But if you as fortran maintainers want it, I won't object strongly.
> 
>   Jakub


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] Do not suggest -fsanitize=all (PR driver/78863).

2016-12-21 Thread Jakub Jelinek
On Wed, Dec 21, 2016 at 10:34:13AM +0100, Martin Liška wrote:
> As mentioned in the PR, we should not suggest option that is not allowed.
> Fixed by explicit removal of suggestions that are not acceptable.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?

Wouldn't it be better not to register those? Like (untested):

--- gcc/gcc.c.jj2016-11-14 19:57:10.0 +0100
+++ gcc/gcc.c   2016-12-21 10:58:29.739873850 +0100
@@ -7733,6 +7733,17 @@ driver::build_option_suggestions (void)
  {
for (int j = 0; sanitizer_opts[j].name != NULL; ++j)
  {
+   struct cl_option optb;
+   /* -fsanitize=all is not valid, only -fno-sanitize=all.
+  So don't register the positive misspelling candidates
+  for it.  */
+   if (sanitizer_opts[j].flag == ~0U && i == OPT_fsanitize_)
+ {
+   optb = *option;
+   optb.opt_text = opt_text = "-fno-sanitize=";
+   optb.cl_reject_negative = true;
+   option = 
+ }
/* Get one arg at a time e.g. "-fsanitize=address".  */
char *with_arg = concat (opt_text,
 sanitizer_opts[j].name,


Jakub


Re: [PATCH v4] Run tests only if the machine supports the instruction set.

2016-12-21 Thread Dominik Vogt
On Tue, Dec 20, 2016 at 10:32:26PM +0100, Jakub Jelinek wrote:
> On Tue, Dec 20, 2016 at 10:26:13PM +0100, Dominik Vogt wrote:
> > On Tue, Dec 20, 2016 at 11:57:52AM -0800, Mike Stump wrote:
> > > On Dec 20, 2016, at 6:10 AM, Dominik Vogt  wrote:
> > > > Right, it gets called even more often than one would think, and
> > > > even with empty torture_current_options.  The attached new patch
> > > > (v3) removes -Ox options and superfluous whitespace and caches that
> > > > between calls if it's not empty.  There's another, permanent cache
> > > > for calls without any flags.  With proper ordering of the torture
> > > > options, the test program is built only a couple of times.
> > > 
> > > Seems fine to me, but most other cases use the postfix _hw.  Any
> > > reason not to use _hw (and not _runnable) on these?  If not,
> > > could you please use _hw instead.
> > 
> > No specific reason other than lack of imagination.  "s390_hw" is a
> > bit too generic in my eyes -> the new names are:
> > 
> > v4:
> > 
> >   * Renamed "s390_runnable" to "s390_useable_hw".
> >   * Renamed "z900_runnable" to "s390_z900_hw",
> > Renamed "z10_runnable" to "s390_z10_hw",
> > etc.
> 
> Grepping for _hw in target-supports.exp reveals that usually the
> effective target predicates are called _hw or _hw_available,
> __hw only if it is too ambiguous (e.g. alpha_max_hw or
> ppc_float128_hw_available).  So I think z900_hw, z10_hw etc. is good
> enough (as long as it does not clash with some other target isa name),
> s390_usable_hw or s390_hw_available is fine.

Okay.  We usually prefix everything with "s390_" on S/390, so I'd
say we don't make an exception here - even if there are no
potential naming collisions.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



[libgfortran, committed] Include <strings.h> when using strncasecmp

2016-12-21 Thread FX
The POSIX function strncasecmp() is used in three libgfortran files. It is 
located in <strings.h>, but these files include <string.h> (singular vs. 
plural). Apparently most implementations (linux, darwin, …) are lenient and 
allow that, but not mingw32, causing PR 70311 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70311).

The attached patch, bootstrapped and regtested on x86_64-apple-darwin16.3.0, 
fixes the issue by including the proper header.
Committed as revision 243843.

FX
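
A minimal sketch of the portable pattern, for readers who hit the same failure elsewhere. The helper name prefix_equal_nocase is hypothetical and is not a libgfortran function; the only point shown is that the POSIX declaration of strncasecmp() comes from <strings.h>.

```c
#include <stddef.h>
/* POSIX declares strncasecmp in <strings.h> (plural), not <string.h>.  */
#include <strings.h>

/* Return nonzero if the first N characters of A and B are equal when
   letter case is ignored.  Hypothetical helper for illustration.  */
static int
prefix_equal_nocase (const char *a, const char *b, size_t n)
{
  return strncasecmp (a, b, n) == 0;
}
```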



strings.diff
Description: Binary data


Re: [PATCH] Remove unused libgfortran functions

2016-12-21 Thread FX
I followed up with the following patch, committed as revision 243841, removing 
the old _gfortran_ftell and renaming the modern _gfortran_ftell2. It was 
bootstrapped and regtested on x86_64-apple-darwin16.3.0.

This is the last item from https://gcc.gnu.org/wiki/LibgfortranAbiCleanup 
which I feel comfortable addressing.

FX



ftell.ChangeLog
Description: Binary data


ftell.diff
Description: Binary data


Re: [PATCH][tree-ssa-address] Use simplify_gen_binary in gen_addr_rtx

2016-12-21 Thread Kyrill Tkachov


On 20/12/16 17:30, Richard Biener wrote:

On December 20, 2016 5:01:19 PM GMT+01:00, Kyrill Tkachov wrote:

Hi all,

The testcase in this patch generates bogus assembly for arm with -O1
-mfloat-abi=soft:
strd r4, [#0, r3]

This is due to non-canonical RTL being generated during expansion:
(set (mem:DI (plus:SI (const_int 0 [0])
   (reg/f:SI 153)) [0 MEM[symbol: a, index: _26, offset: 0B]+0 S8 A64])
 (reg:DI 154))

Note the (plus (const_int 0) (reg)). This is being generated in
gen_addr_rtx in tree-ssa-address.c
where it creates an explicit PLUS rtx through gen_rtx_PLUS, which
doesn't try to canonicalise its arguments
or simplify. The correct thing to do is to use simplify_gen_binary that
will handle all this properly.

But it has to match up the validity check which passes down exactly the same 
RTL(?)  Or does this stem from propagation simplifying a MEM after IVOPTs?


You mean TARGET_LEGITIMATE_ADDRESS_P? Yes, it gets passed on to that, but the 
arm implementation of that doesn't try to handle non-canonical RTL, and 
(plus (const0_rtx) (reg)) is not canonical.
Or do you mean some other check?

Thanks,
Kyrill


I didn't change the other gen_rtx_PLUS calls in this function as their
results are later used in XEXP operations
that seem to rely on a PLUS expression being explicitly produced, but
this particular call doesn't, so it's okay
to change it. With this patch the sane assembly is generated:
 strd r4, [r3]

Bootstrapped and tested on arm-none-linux-gnueabihf, x86_64,
aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-12-20  Kyrylo Tkachov  

* tree-ssa-address.c (gen_addr_rtx): Use simplify_gen_binary to add
 *addr to act_elem.

2016-12-20  Kyrylo Tkachov  

 * gcc.dg/20161219.c: New test.
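
To make the difference between building an explicit PLUS and building through a simplifier concrete, here is a toy model. It is a sketch under stated assumptions and not GCC's RTL: toy_expr and toy_simplify_plus are hypothetical names, and only two rules are modelled (a constant operand goes second, and x + 0 is just x). There is no constant folding.

```c
#include <stddef.h>

enum toy_code { TOY_REG, TOY_CONST, TOY_PLUS };

/* Toy expression node; not GCC's rtx.  */
struct toy_expr
{
  enum toy_code code;
  long value;                 /* register number or constant value */
  const struct toy_expr *op0; /* operands, used only for TOY_PLUS */
  const struct toy_expr *op1;
};

/* Build OP0 + OP1 with two simplifications applied.  Returns one of the
   operands when no PLUS is needed, otherwise fills in and returns OUT.  */
static const struct toy_expr *
toy_simplify_plus (struct toy_expr *out, const struct toy_expr *op0,
                   const struct toy_expr *op1)
{
  /* Canonical order in this toy: a constant goes second.  */
  if (op0->code == TOY_CONST && op1->code != TOY_CONST)
    {
      const struct toy_expr *tmp = op0;
      op0 = op1;
      op1 = tmp;
    }

  /* x + 0 is just x, so no PLUS node is created at all.  */
  if (op1->code == TOY_CONST && op1->value == 0)
    return op0;

  out->code = TOY_PLUS;
  out->value = 0;
  out->op0 = op0;
  out->op1 = op1;
  return out;
}
```

With a zero constant and a register as inputs the result is the bare register, which is the shape the address check in the discussion above expects.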






[PATCH] Do not suggest -fsanitize=all (PR driver/78863).

2016-12-21 Thread Martin Liška
As mentioned in the PR, we should not suggest an option that is not allowed.
Fixed by explicit removal of suggestions that are not acceptable.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
From 1a2d5614e9a0515659f50b457ef031c1f80f4a7c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 20 Dec 2016 12:16:02 +0100
Subject: [PATCH] Do not suggest -fsanitize=all (PR driver/78863).

gcc/ChangeLog:

2016-12-20  Martin Liska  

	PR driver/78863
	* gcc.c (driver::build_option_suggestions): Call
	remove_misspelling_candidate for -fsanitize=all.
	* opts-common.c (remove_misspelling_candidate): New function.
	* opts.h (remove_misspelling_candidate): Likewise.

gcc/testsuite/ChangeLog:

2016-12-20  Martin Liska  

	PR driver/78863
	* gcc.dg/spellcheck-options-13.c: New test.
---
 gcc/gcc.c|  4 
 gcc/opts-common.c| 17 +
 gcc/opts.h   |  2 ++
 gcc/testsuite/gcc.dg/spellcheck-options-13.c |  5 +
 4 files changed, 28 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-options-13.c

diff --git a/gcc/gcc.c b/gcc/gcc.c
index f78acd68606..1240e8a176b 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -7748,6 +7748,10 @@ driver::build_option_suggestions (void)
 	  break;
 	}
 }
+
+  /* PR driver/78863: skip -fsanitize=all.  */
+  remove_misspelling_candidate (m_option_suggestions, "fsanitize=all");
+  remove_misspelling_candidate (m_option_suggestions, "-sanitize=all");
 }
 
 /* Helper function for driver::handle_unrecognized_options.
diff --git a/gcc/opts-common.c b/gcc/opts-common.c
index e9d1c20a1f3..d5d81de8a5f 100644
--- a/gcc/opts-common.c
+++ b/gcc/opts-common.c
@@ -413,6 +413,23 @@ add_misspelling_candidates (auto_vec<char *> *candidates,
 }
 }
 
+/* Helper function for gcc.c's driver which removes OPT_TEXT from
+   list of CANDIDATES.  */
+
+void
+remove_misspelling_candidate (auto_vec<char *> *candidates,
+			  const char *opt_text)
+{
+  for (unsigned i = 0; i < candidates->length (); i++)
+{
+  if (strcmp ((*candidates)[i], opt_text) == 0)
+	{
+	  candidates->ordered_remove (i);
+	  return;
+	}
+}
+}
+
 /* Decode the switch beginning at ARGV for the language indicated by
LANG_MASK (including CL_COMMON and CL_TARGET if applicable), into
the structure *DECODED.  Returns the number of switches
diff --git a/gcc/opts.h b/gcc/opts.h
index b3e64353c8a..052aa54cee4 100644
--- a/gcc/opts.h
+++ b/gcc/opts.h
@@ -420,6 +420,8 @@ extern const struct sanitizer_opts_s
 extern void add_misspelling_candidates (auto_vec<char *> *candidates,
 	const struct cl_option *option,
 	const char *base_option);
+extern void remove_misspelling_candidate (auto_vec<char *> *candidates,
+	  const char *opt_text);
 extern const char *candidates_list_and_hint (const char *arg, char *&str,
 	 const auto_vec <const char *> &
 	 candidates);
diff --git a/gcc/testsuite/gcc.dg/spellcheck-options-13.c b/gcc/testsuite/gcc.dg/spellcheck-options-13.c
new file mode 100644
index 000..19b63af565b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/spellcheck-options-13.c
@@ -0,0 +1,5 @@
+/* PR driver/78863.  */
+
+/* { dg-do compile } */
+/* { dg-options "-fsanitize" } */
+/* { dg-error "unrecognized command line option .-fsanitize..$" "" { target *-*-* } 0 } */
-- 
2.11.0
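
For readers without the GCC vec API at hand, the removal step in the patch above can be sketched in plain C. This is an assumption-laden analogue, not the patch itself: remove_first_match is a hypothetical name, and a plain array of strings stands in for the candidates vector. It drops the first exact match and shifts the tail down so the order of the remaining entries is kept.

```c
#include <stddef.h>
#include <string.h>

/* Remove the first entry equal to TEXT from LIST, which holds LEN
   entries, keeping the remaining entries in their original order.
   Returns the new number of entries.  Hypothetical helper.  */
static size_t
remove_first_match (const char **list, size_t len, const char *text)
{
  for (size_t i = 0; i < len; i++)
    if (strcmp (list[i], text) == 0)
      {
        /* Shift the tail down by one slot; moves zero bytes when the
           match is the last entry.  */
        memmove (&list[i], &list[i + 1],
                 (len - i - 1) * sizeof list[0]);
        return len - 1;
      }
  return len;
}
```

Like the function in the patch, this stops at the first match, so a list with duplicate entries would need repeated calls.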



Re: [PATCH] PR 78867 Function returning string ICE with -flto

2016-12-21 Thread FX
> 2016-12-21  Janne Blomqvist  
> 
>   PR fortran/78867
>   * trans-expr.c (gfc_conv_procedure_call): Emit DECL_EXPR also for
>   non-pointer character results.
> 
> testsuite ChangeLog:
> 
> 2016-12-21  Janne Blomqvist  
> 
>   PR fortran/78867
>   * gfortran.dg/string_length_4.f90: New test.

OK



[PATCH] PR 78867 Function returning string ICE with -flto

2016-12-21 Thread Janne Blomqvist
The fix for PR 78757 was slightly too cautious, and covered only the
case of functions returning pointers to characters. By moving the
block above the if statement the DECL_EXPR is created also for
functions returning non-pointer characters.

Regtested on x86_64-pc-linux-gnu, Ok for trunk?

fortran ChangeLog:

2016-12-21  Janne Blomqvist  

PR fortran/78867
* trans-expr.c (gfc_conv_procedure_call): Emit DECL_EXPR also for
non-pointer character results.

testsuite ChangeLog:

2016-12-21  Janne Blomqvist  

PR fortran/78867
* gfortran.dg/string_length_4.f90: New test.
---
 gcc/fortran/trans-expr.c  | 26 +-
 gcc/testsuite/gfortran.dg/string_length_4.f90 | 16 
 2 files changed, 29 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/string_length_4.f90

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 823c96a..6ebdc8b 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -6002,6 +6002,19 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  type = gfc_get_character_type (ts.kind, ts.u.cl);
  type = build_pointer_type (type);
 
+ /* Emit a DECL_EXPR for the VLA type.  */
+ tmp = TREE_TYPE (type);
+ if (TYPE_SIZE (tmp)
+ && TREE_CODE (TYPE_SIZE (tmp)) != INTEGER_CST)
+   {
+ tmp = build_decl (input_location, TYPE_DECL, NULL_TREE, tmp);
+ DECL_ARTIFICIAL (tmp) = 1;
+ DECL_IGNORED_P (tmp) = 1;
+ tmp = fold_build1_loc (input_location, DECL_EXPR,
+TREE_TYPE (tmp), tmp);
+ gfc_add_expr_to_block (&se->pre, tmp);
+   }
+
  /* Return an address to a char[0:len-1]* temporary for
 character pointers.  */
  if ((!comp && (sym->attr.pointer || sym->attr.allocatable))
@@ -6009,19 +6022,6 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
{
  var = gfc_create_var (type, "pstr");
 
- /* Emit a DECL_EXPR for the VLA type.  */
- tmp = TREE_TYPE (type);
- if (TYPE_SIZE (tmp)
- && TREE_CODE (TYPE_SIZE (tmp)) != INTEGER_CST)
-   {
- tmp = build_decl (input_location, TYPE_DECL, NULL_TREE, tmp);
- DECL_ARTIFICIAL (tmp) = 1;
- DECL_IGNORED_P (tmp) = 1;
- tmp = fold_build1_loc (input_location, DECL_EXPR,
-TREE_TYPE (tmp), tmp);
- gfc_add_expr_to_block (&se->pre, tmp);
-   }
-
  if ((!comp && sym->attr.allocatable)
  || (comp && comp->attr.allocatable))
{
diff --git a/gcc/testsuite/gfortran.dg/string_length_4.f90 
b/gcc/testsuite/gfortran.dg/string_length_4.f90
new file mode 100644
index 000..759066b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/string_length_4.f90
@@ -0,0 +1,16 @@
+! { dg-do compile }
+! { dg-require-effective-target lto }
+! { dg-options "-flto" }
+! PR 78867, test case adapted from gfortran.dg/string_length_1.f90
+program pr78867
+  if (len(bar(2_8)) /= 2) call abort
+contains
+
+  function bar(i)
+integer(8), intent(in) :: i
+character(len=i) :: bar
+  
+bar = ""
+  end function bar
+
+end program pr78867
-- 
2.7.4



Re: [PATCH] Speed-up use-after-scope (re-writing to SSA) (version 2)

2016-12-21 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 12:26:41PM +0100, Martin Liška wrote:
> Ok, llvm folks are unwilling to accept the new API function, thus I've
> decided to come up with the approach suggested by Jakub. Briefly, when
> expanding the ASAN_POISON internal function, we create a new variable
> (with the same name as the original one). The variable is poisoned at
> the location of the ASAN_POISON and all usages just call ASAN_CHECK,
> which would trigger a use-after-scope run-time error. A situation where
> ASAN_POISON has a LHS is very rare and is very likely to be a bug. Thus
> the suggested, not super-optimized approach should not be problematic.

Do you have a testcase for the case where there is a write to the var after
poison that is then made non-addressable?  use-after-scope-9.c only covers
the read.

> I'm not sure about the introduction of 'create_var' function, maybe we
> would need some refactoring. Thoughts?

It doesn't belong to gimple-expr.c and the name is way too generic, we have
many create var functions already.  And this one is very specialized.

> 2016-12-19  Martin Liska  
> 
>   * asan.c (asan_expand_poison_ifn): New function.
>   * asan.h (asan_expand_poison_ifn):  Likewise.

Too many spaces.

> +  tree shadow_var = create_var (TREE_TYPE (poisoned_var),
> + IDENTIFIER_POINTER (DECL_NAME (var_decl)));

For the shadow var creation, IMHO you should
1) use a hash table, once you add a shadow variable for a certain variable
   for the first time, reuse it for all the other cases; you can have many
   ASAN_POISON () calls for the same underlying variable
2) as I said, use just a function in sanopt.c for this,
   create_asan_shadow_var or whatever
3) I think you just want to do copy_node, plus roughly what
   copy_decl_for_dup_finish does (and set DECL_ARTIFICIAL and
   DECL_IGNORED_P) - except that you don't have copy_body_data
   so you can't use it directly (well, you could create copy_body_data
   just for that purpose and set src_fn and dst_fn to current_function_decl
   and the rest to NULL)

I'd really like to see the storing to poisoned var becoming non-addressable
in action (if it can ever happen, so it isn't just theoretical) to look at
what it does.

Jakub
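
Point 1 above (create the shadow variable once per underlying variable, then reuse it) can be sketched as a small get-or-create table. This is only an illustration of the lookup pattern under stated assumptions: shadow_map and get_or_create_shadow are hypothetical names, an integer id stands in for the shadow decl, and a fixed-size open-addressing table stands in for whatever hash table the real code would use.

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_SLOTS 64  /* fixed capacity, enough for a sketch */

struct shadow_entry
{
  const void *var;  /* key: the original variable, never NULL */
  int shadow_id;    /* value: stand-in for the shadow variable */
};

/* Must be zero-initialized before first use.  */
struct shadow_map
{
  struct shadow_entry slots[SHADOW_SLOTS];
  int next_id;
};

/* Return the shadow id for VAR, creating one on first request.
   Returns -1 if the table is full.  */
static int
get_or_create_shadow (struct shadow_map *map, const void *var)
{
  size_t h = ((uintptr_t) var / sizeof (void *)) % SHADOW_SLOTS;

  for (size_t probe = 0; probe < SHADOW_SLOTS; probe++)
    {
      struct shadow_entry *e = &map->slots[(h + probe) % SHADOW_SLOTS];
      if (e->var == var)
        return e->shadow_id;          /* seen before: reuse */
      if (e->var == NULL)
        {
          e->var = var;               /* first time: create and record */
          e->shadow_id = ++map->next_id;
          return e->shadow_id;
        }
    }
  return -1;
}
```

Many poison sites for the same variable then map to a single shadow, which is the property asked for in the review.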


Re: [Ada] Fix PR ada/78845

2016-12-21 Thread Arnaud Charlet
> >> 2016-12-17  Simon Wright  
> >> 
> >>PR ada/78845
> >>* a-ngrear.adb (Inverse): call Unit_Matrix with First_1 set to
> >>A'First(2)
> >>and vice versa.
> > 
> > Patch is OK, thanks.
> 
> The same problem exists in Ada.Numerics.Generic_Complex_Arrays
> (a-congar.adb). Should I update the PR and this patch?

Yes, please resend an updated patch.

Arno