[PATCH] Adapt a couple of scalar comparison match.pd optimizations for vector comparisons against uniform vectors (PR target/88152)

2018-11-28 Thread Jakub Jelinek
Hi!

The following patch adapts a couple of scalar comparison against INTEGER_CST
optimizations to vector comparison against uniform_vector_p VECTOR_CST.

The PR was specifically asking for the a > INT_MAX, a >= INT_MAX etc.
to (signed) a < 0, the first two hunks are prerequsites of that though
in order not to have to duplicate everything for the boundary values +/- 1.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-29  Jakub Jelinek  

PR target/88152
* match.pd: For lt/le/gt/ge against unifoprm vector VECTOR_CST,
perform similar simplifications like for scalar comparisons.

* g++.dg/tree-ssa/pr88152.C: New test.

--- gcc/match.pd.jj 2018-11-14 17:42:53.0 +0100
+++ gcc/match.pd2018-11-28 16:57:28.377978794 +0100
@@ -3109,14 +3109,29 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (simplify
   (cmp @0 INTEGER_CST@1)
   (if (tree_int_cst_sgn (@1) == -1)
-   (acmp @0 { wide_int_to_tree (TREE_TYPE (@1), wi::to_wide (@1) + 1); }
+   (acmp @0 { wide_int_to_tree (TREE_TYPE (@1), wi::to_wide (@1) + 1); })))
+ (simplify
+  (cmp @0 VECTOR_CST@1)
+  (with { tree cst = uniform_vector_p (@1); }
+   (if (cst && TREE_CODE (cst) == INTEGER_CST && tree_int_cst_sgn (cst) == -1)
+(acmp @0 { build_vector_from_val (TREE_TYPE (@1),
+ wide_int_to_tree (TREE_TYPE (cst),
+   wi::to_wide (cst)
+   + 1)); })
 (for cmp  (ge lt)
  acmp (gt le)
  (simplify
   (cmp @0 INTEGER_CST@1)
   (if (tree_int_cst_sgn (@1) == 1)
-   (acmp @0 { wide_int_to_tree (TREE_TYPE (@1), wi::to_wide (@1) - 1); }
-
+   (acmp @0 { wide_int_to_tree (TREE_TYPE (@1), wi::to_wide (@1) - 1); })))
+ (simplify
+  (cmp @0 VECTOR_CST@1)
+  (with { tree cst = uniform_vector_p (@1); }
+   (if (cst && TREE_CODE (cst) == INTEGER_CST && tree_int_cst_sgn (cst) == 1)
+(acmp @0 { build_vector_from_val (TREE_TYPE (@1),
+ wide_int_to_tree (TREE_TYPE (cst),
+   wi::to_wide (cst)
+   - 1)); })
 
 /* We can simplify a logical negation of a comparison to the
inverted comparison.  As we cannot compute an expression
@@ -3993,7 +4008,84 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(with { tree st = signed_type_for (arg1_type); }
 (if (cmp == LE_EXPR)
 (ge (convert:st @0) { build_zero_cst (st); })
-(lt (convert:st @0) { build_zero_cst (st); }))
+(lt (convert:st @0) { build_zero_cst (st); })
+ /* And the same for vector comparisons against uniform vector csts.  */
+ (simplify
+  (cmp (convert?@2 @0) VECTOR_CST@1)
+  (if (VECTOR_TYPE_P (TREE_TYPE (@1))
+   && INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (@1)))
+   && uniform_vector_p (@1)
+   && tree_nop_conversion_p (TREE_TYPE (@2), TREE_TYPE (@0)))
+   (with
+{
+  tree arg1_type = TREE_TYPE (TREE_TYPE (@1));
+  tree cst = uniform_vector_p (@1);
+  unsigned int prec = TYPE_PRECISION (arg1_type);
+  wide_int max = wi::max_value (arg1_type);
+  wide_int signed_max = wi::max_value (prec, SIGNED);
+  wide_int min = wi::min_value (arg1_type);
+}
+(switch
+ (if (wi::to_wide (cst) == max)
+  (switch
+   (if (cmp == GT_EXPR)
+   { constant_boolean_node (false, type); })
+   (if (cmp == GE_EXPR)
+   (eq @2 @1))
+   (if (cmp == LE_EXPR)
+   { constant_boolean_node (true, type); })
+   (if (cmp == LT_EXPR)
+   (ne @2 @1
+ (if (wi::to_wide (cst) == min)
+  (switch
+   (if (cmp == LT_EXPR)
+{ constant_boolean_node (false, type); })
+   (if (cmp == LE_EXPR)
+(eq @2 @1))
+   (if (cmp == GE_EXPR)
+{ constant_boolean_node (true, type); })
+   (if (cmp == GT_EXPR)
+(ne @2 @1
+ (if (wi::to_wide (cst) == max - 1)
+  (switch
+   (if (cmp == GT_EXPR)
+   (eq @2 { build_vector_from_val (type,
+   wide_int_to_tree (TREE_TYPE (cst),
+ wi::to_wide (cst)
+ + 1)); }))
+   (if (cmp == LE_EXPR)
+   (ne @2 { build_vector_from_val (type,
+   wide_int_to_tree (TREE_TYPE (cst),
+ wi::to_wide (cst)
+ + 1)); }
+ (if (wi::to_wide (cst) == min + 1)
+  (switch
+   (if (cmp == GE_EXPR)
+(ne @2 { build_vector_from_val (type,
+   wide_int_to_tree (TREE_TYPE (cst),
+ wi::to_wide (cst)
+ - 1)); }))
+   (if (cmp == 

[Bug c++/87750] [8/9 Regression] Failed compilation / parsing of template member call after 'using' declaration

2018-11-28 Thread aoliva at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87750

Alexandre Oliva  changed:

   What|Removed |Added

 CC||aoliva at gcc dot gnu.org

--- Comment #5 from Alexandre Oliva  ---
The problem is that, in the presence of a template-dependent USING_DECL,
overloaded functions are dropped from the bindings and from lookup results: we
get only the USING_DECL in the value binding, and from the absence of template
functions in an overload set, we decide the looked up name cannot be a template
function in cp_parser_template_name.

Re: [C++ PATCH] Fix clone_body (PR c++/86669)

2018-11-28 Thread Jakub Jelinek
On Wed, Nov 28, 2018 at 09:42:47AM +0100, Jakub Jelinek wrote:
> Bootstrapped/regtested on x86_64-linux and i686-linux (earlier version
> without the insert_decl_map call), ok for trunk if it passes another
> bootstrap/regtest in the current form?

FYI, bootstrapped/regtested successfully on both.

> 2018-11-28  Jakub Jelinek  
> 
>   PR c++/86669
>   * optimize.c (clone_body_copy_decl): New function.
>   (clone_body): Use it for base cdtors.  Remap here only
>   DECL_INITIAL of decls that don't have DECL_CONTEXT of the
>   new clone.
> 
>   * g++.dg/cpp0x/initlist105.C: New test.
>   * g++.dg/cpp0x/initlist106.C: New test.
>   * g++.dg/other/pr86669.C: New test.

Jakub


[Bug c++/87539] [8/9 Regression] internal compiler error when compiling project with Os optimization flag

2018-11-28 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87539

--- Comment #4 from Jakub Jelinek  ---
Author: jakub
Date: Thu Nov 29 07:42:52 2018
New Revision: 266611

URL: https://gcc.gnu.org/viewcvs?rev=266611=gcc=rev
Log:
PR c++/87539
* g++.dg/cpp0x/pr87539.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/cpp0x/pr87539.C
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug ipa/88256] New: [7/8/9 Regression] ICE: Segmentation fault (in make_ssa_name_fn)

2018-11-28 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88256

Bug ID: 88256
   Summary: [7/8/9 Regression] ICE: Segmentation fault (in
make_ssa_name_fn)
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

g++-9.0.0-alpha20181125 snapshot (r266441), 8.2, 7.3 ICE when compiling
gcc/testsuite/gcc.c-torture/compile/pr84305.c at any optimization level w/
-fipa-cp:

% g++-9.0.0-alpha20181125 -O1 -fipa-cp -c
gcc/testsuite/gcc.c-torture/compile/pr84305.c
during IPA pass: materialize-all-clones
gcc/testsuite/gcc.c-torture/compile/pr84305.c: In function 'void f2(int)':
gcc/testsuite/gcc.c-torture/compile/pr84305.c:4:26: internal compiler error:
Segmentation fault
4 | void f1 (void) { f2 (a); }
  |  ^
0xf07ebf crash_signal
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/toplev.c:326
0x11136c9 make_ssa_name_fn(function*, tree_node*, gimple*, unsigned int)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-ssanames.c:268
0xf82a04 make_ssa_name
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-ssanames.h:114
0xf82a04 remap_ssa_name
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:245
0xf87d97 copy_tree_body_r(tree_node**, int*, void*)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:1118
0x11b5782 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*, tree_node*
(*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*,
hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree.c:11892
0x11b5ead walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*, tree_node*
(*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*,
hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree.c:12214
0x11b5ead walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*, tree_node*
(*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*,
hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree.c:12214
0xf820e5 remap_type_1
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:578
0xf81021 remap_type(tree_node*, copy_body_data*)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:607
0xf8914b remap_gimple_op_r
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:999
0x11b5782 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*, tree_node*
(*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*,
hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree.c:11892
0x11b5ead walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*, tree_node*
(*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*,
hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree.c:12214
0x11b5ead walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*, tree_node*
(*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*,
hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree.c:12214
0xc8ca3d walk_gimple_op(gimple*, tree_node* (*)(tree_node**, int*, void*),
walk_stmt_info*)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/gimple-walk.c:202
0xf83541 remap_gimple_stmt
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:1778
0xf84536 copy_bb
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:1828
0xf85ddd copy_cfg_body
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:2720
0xf85ddd copy_body
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:2962
0xf8e3ea tree_function_versioning(tree_node*, tree_node*, vec*, bool, bitmap_head*, bool, bitmap_head*, basic_block_def*)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/tree-inline.c:5994

[Bug c++/88258] New: [9 Regression] Infinite loop emitting diagnostics in the C++ front-end

2018-11-28 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88258

Bug ID: 88258
   Summary: [9 Regression] Infinite loop emitting diagnostics in
the C++ front-end
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

g++-9.0.0-alpha20181125 snapshot (r266441) enters infinite loop when compiling
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c w/ -fopenmp:

% timeout 0.0072 g++-9.0.0-alpha20181125 -fopenmp -c
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c: In function 'void f1()':
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:8:3: error: '_Atomic' was not declared in
this scope
8 |   _Atomic int i = 0, k[4];
  |   ^~~
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:10:3: error: 'k' was not declared in this
scope
   10 |   k[0] = 0;
  |   ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:14:37: error: 'i' has not been declared
   14 |   #pragma omp parallel reduction (+:i)  /* { dg-error "'_Atomic' 'i' in
'reduction' clause" } */
  | ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:15:5: error: 'i' was not declared in this
scope
   15 | i++;
  | ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:16:39: error: '_Atomic' does not name a
type
   16 |   #pragma omp declare reduction (foo: _Atomic int: omp_out += omp_in)
initializer (omp_priv = omp_orig * 0) /* { dg-error "'_Atomic' qualified type
in '#pragma omp declare reduction'" } */
  |   ^~~
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:16:46: error: expected ':' before 'int'
   16 |   #pragma omp declare reduction (foo: _Atomic int: omp_out += omp_in)
initializer (omp_priv = omp_orig * 0) /* { dg-error "'_Atomic' qualified type
in '#pragma omp declare reduction'" } */
  |  ^~~~
  |  :
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:20:39: error: 'i' has not been declared
   20 |   #pragma omp parallel reduction (bar:i) /* { dg-error "'_Atomic' 'i'
in 'reduction' clause" } */
  |   ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:21:5: error: 'i' was not declared in this
scope
   21 | i++;
  | ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c: At global scope:
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:29:18: error: expected ',' or '...'
before 'p'
   29 | f2 (int *_Atomic p)
  |  ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c: In function 'void f2(int*)':
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:31:29: error: 'p' has not been declared
   31 |   #pragma omp simd aligned (p : 16)  /* { dg-error "'_Atomic' 'p' in
'aligned' clause" } */
  | ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:33:5: error: 'p' was not declared in this
scope
   33 | p[i]++;
  | ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c: At global scope:
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:36:1: error: '_Atomic' does not name a
type
   36 | _Atomic int x;
  | ^~~
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:39:13: error: variable or field 'f3'
declared void
   39 | f3 (_Atomic int *p)
  | ^~~
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:39:5: error: '_Atomic' was not declared
in this scope
   39 | f3 (_Atomic int *p)
  | ^~~
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:42:27: error: expected unqualified-id
before end of line
   42 |   #pragma omp atomic write
  |   ^
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:42:27: error: expected unqualified-id
before end of line
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:42:27: error: expected unqualified-id
before end of line
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:42:27: error: expected unqualified-id
before end of line
gcc/testsuite/gcc.dg/gomp/_Atomic-3.c:42:27: error: expected unqualified-id
before end of line

<…>

Program received signal SIGINT, Interrupt.
0x77ada988 in write () from /lib64/libc.so.6
(gdb) where
#0  0x77ada988 in write () from /lib64/libc.so.6
#1  0x77a70d0d in _IO_file_write () from /lib64/libc.so.6
#2  0x77a6ff83 in new_do_write () from /lib64/libc.so.6
#3  0x77a7154e in _IO_file_xsputn () from /lib64/libc.so.6
#4  0x77a63c15 in fputs () from /lib64/libc.so.6
#5  0x0183d0e4 in pp_write_text_to_stream (pp=pp@entry=0x247dbc0)
at
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/pretty-print.c:840
#6  0x0183d169 in pp_flush (pp=0x247dbc0)
at
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/pretty-print.c:1467
#7  pp_flush (pp=0x247dbc0) at
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/pretty-print.c:1462
#8  0x01829e8a in diagnostic_report_diagnostic 

[Bug preprocessor/88257] New: [9 Regression] ICE in linemap_position_for_line_and_column, at libcpp/line-map.c:842

2018-11-28 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88257

Bug ID: 88257
   Summary: [9 Regression] ICE in
linemap_position_for_line_and_column, at
libcpp/line-map.c:842
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

g++-9.0.0-alpha20181125 snapshot (r266441) ICEs when compiling
gcc/testsuite/gcc.dg/format/pr78304.c w/ -Wformat:

% g++-9.0.0-alpha20181125 -Wformat -c gcc/testsuite/gcc.dg/format/pr78304.c 
gcc/testsuite/gcc.dg/format/pr78304.c: In function 'void test(const char*)':
gcc/testsuite/gcc.dg/format/pr78304.c:9:37: internal compiler error: in
linemap_position_for_line_and_column, at libcpp/line-map.c:842
9 |   printf ("size: %" PRIu32 "\n", msg); /* { dg-warning "expects
argument of type" } */
  | ^
0x18631a5 linemap_position_for_line_and_column(line_maps*, line_map_ordinary
const*, unsigned int, unsigned int)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/libcpp/line-map.c:842
0x18631a5 linemap_position_for_line_and_column(line_maps*, line_map_ordinary
const*, unsigned int, unsigned int)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/libcpp/line-map.c:837
0x1844d06 get_substring_ranges_for_loc
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/input.c:1437
0x18455fb get_location_within_string(cpp_reader*, string_concat_db*, unsigned
int, cpp_ttype, int, int, int, unsigned int*)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/input.c:1490
0xa17d26 c_get_substring_location(substring_loc const&, unsigned int*)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-common.c:879
0xa4e2c5 get_corrected_substring
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:3503
0xa4e2c5 format_type_warning
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:3718
0xa4f855 check_format_types
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:3262
0xa4f855 argument_parser::check_argument_type(format_char_info const*,
length_modifier const&, tree_node*&, char const*&, bool, unsigned long&,
tree_node*&, int, char const*, char const*, unsigned int, char)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:2776
0xa52f70 argument_parser::check_argument_type(format_char_info const*,
length_modifier const&, tree_node*&, char const*&, bool, unsigned long&,
tree_node*&, int, char const*, char const*, unsigned int, char)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:2671
0xa52f70 check_format_info_main
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:2988
0xa52f70 check_format_arg
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:1740
0xa503ff check_format_info
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:1463
0xa503ff check_function_format(tree_node const*, tree_node*, int, tree_node**,
vec*)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-format.c:1120
0xa41be1 check_function_arguments(unsigned int, tree_node const*, tree_node
const*, int, tree_node**, vec*)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/c-family/c-common.c:5675
0x823f64 build_over_call
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/cp/call.c:8263
0x8314ac build_new_function_call(tree_node*, vec**, int)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/cp/call.c:4407
0x9be30d finish_call_expr(tree_node*, vec**, bool,
bool, int)
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/cp/semantics.c:2560
0x93a578 cp_parser_postfix_expression
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/cp/parser.c:7300
0x948609 cp_parser_unary_expression
   
/var/tmp/portage/sys-devel/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/cp/parser.c:8382

Re: RFA: a x86 test modification

2018-11-28 Thread Uros Bizjak
On Thu, Nov 29, 2018 at 12:43 AM Jeff Law  wrote:
>
> On 11/28/18 2:47 PM, Vladimir Makarov wrote:
> >   The patch I committed today recently for
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88207
> >
> >   creates a new regression for pr34256.c.  2 moves is expected but gcc
> > with the patch generates 3 moves.  I think now RA generates the right code.
> >
> > We have the following code before RA
> >
> > (insn 7 6 13 2 (set (reg:V2SI 88)
> > (plus:V2SI (reg:V2SI 89 [ x ])
> > (mem/c:V2SI (symbol_ref:DI ("y") [flags 0x2]   > 0x7faad08b5b40 y>) [1 y+0 S8 A64])))"mmintrin.h":306:18 1115 {*mmx_addv2si3}
> >
> >  (expr_list:REG_DEAD (reg:V2SI 89 [ x ])
> > (nil)))
> > (insn 13 7 14 2 (set (reg/i:DI 0 ax)
> > (subreg:DI (reg:V2SI 88) 0)) "pr34256.c":11:1 66 {*movdi_internal}
> >  (expr_list:REG_DEAD (reg:V2SI 88)
> >
> > The test is expected to assign mmx reg to pseudo 88 but gcc with the
> > patch assigns memory to it.  The cost of mmx to general reg move is 13,
> > while overall cost of mmx to mem and mem to general moves is 10.  So IRA
> > now chooses memory for pseudo 88 according to the minimal cost.
> >
> > Now, if we want still assign mmx reg to the pseudo 88, we should change
> > the costs in machine-dependent x86 code.  But I think it might create
> > other unexpected code generation.  As mmx is basically not used nowadays
> > the test is not important, I just propose the following patch.
> >
> > Is it ok for the trunk?

The generated code is correct, in i386.c, secondary_memory_needed, we have:

  /* ??? This is a lie.  We do have moves between mmx/general, and for
 mmx/sse2.  But by saying we need secondary memory we discourage the
 register allocator from using the mmx registers unless needed.  */
  if (MMX_CLASS_P (class1) != MMX_CLASS_P (class2))
return true;

But I noticed that for -mtune=generic we still generate direct move.
The cost of interunit MMX -> GR moves should always be higher than the
cost of indirect move (due to secondary_memory_needed), c.f. the
comment in ix86_register_move_cost:

  /* In case we require secondary memory, compute cost of the store followed
 by load.  In order to avoid bad register allocation choices, we need
 for this to be *at least* as high as the symmetric MEMORY_MOVE_COST.  */

IMO, the issue with -mtune=generic warrants some additional analysis.
I'll look at target cost calculations.

Thanks,
Uros.


[PATCH v5] Repeat jump threading after combine

2018-11-28 Thread Ilya Leoshkevich
Repost of v4: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02067.html

Bootstrapped and regtested on x86_64-redhat-linux, s390x-redhat-linux
and ppc64le-redhat-linux.

Consider the following RTL:

(insn (set (reg 65) (if_then_else (eq %cc 0) 1 0)))
(insn (parallel [(set %cc (compare (reg 65) 0)) (clobber %scratch)]))
(jump_insn (set %pc (if_then_else (ne %cc 0) (label_ref 23) %pc)))

Combine simplifies this into:

(note NOTE_INSN_DELETED)
(note NOTE_INSN_DELETED)
(jump_insn (set %pc (if_then_else (eq %cc 0) (label_ref 23) %pc)))

opening up the possibility to perform jump threading.

gcc/ChangeLog:

2018-09-19  Ilya Leoshkevich  

PR target/80080
* cfgcleanup.c (class pass_postreload_jump): New pass.
(pass_postreload_jump::execute): Likewise.
(make_pass_postreload_jump): Likewise.
* passes.def: Add pass_postreload_jump before
pass_postreload_cse.
* tree-pass.h (make_pass_postreload_jump): New pass.

gcc/testsuite/ChangeLog:

2018-09-05  Ilya Leoshkevich  

PR target/80080
* gcc.target/s390/pr80080-4.c: New test.
---
 gcc/cfgcleanup.c  | 42 +++
 gcc/passes.def|  1 +
 gcc/testsuite/gcc.target/s390/pr80080-4.c | 16 +
 gcc/tree-pass.h   |  1 +
 4 files changed, 60 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr80080-4.c

diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index 4a5dc29d14f..bc4a78889db 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -3259,6 +3259,48 @@ make_pass_jump (gcc::context *ctxt)
 
 namespace {
 
+const pass_data pass_data_postreload_jump =
+{
+  RTL_PASS, /* type */
+  "postreload_jump", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_JUMP, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_postreload_jump : public rtl_opt_pass
+{
+public:
+  pass_postreload_jump (gcc::context *ctxt)
+: rtl_opt_pass (pass_data_postreload_jump, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual unsigned int execute (function *);
+
+}; // class pass_postreload_jump
+
+unsigned int
+pass_postreload_jump::execute (function *)
+{
+  cleanup_cfg (flag_thread_jumps ? CLEANUP_THREADING : 0);
+  return 0;
+}
+
+} // anon namespace
+
+rtl_opt_pass *
+make_pass_postreload_jump (gcc::context *ctxt)
+{
+  return new pass_postreload_jump (ctxt);
+}
+
+namespace {
+
 const pass_data pass_data_jump2 =
 {
   RTL_PASS, /* type */
diff --git a/gcc/passes.def b/gcc/passes.def
index 82ad9404b9e..0079fecef32 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -458,6 +458,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_reload);
   NEXT_PASS (pass_postreload);
   PUSH_INSERT_PASSES_WITHIN (pass_postreload)
+ NEXT_PASS (pass_postreload_jump);
  NEXT_PASS (pass_postreload_cse);
  NEXT_PASS (pass_gcse2);
  NEXT_PASS (pass_split_after_reload);
diff --git a/gcc/testsuite/gcc.target/s390/pr80080-4.c 
b/gcc/testsuite/gcc.target/s390/pr80080-4.c
new file mode 100644
index 000..5fc6a558008
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/pr80080-4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-march=z196 -O2" } */
+
+extern void bar(int *mem);
+
+void foo4(int *mem)
+{
+  int oldval = 0;
+  if (!__atomic_compare_exchange_n (mem, (void *) , 1,
+   1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
+{
+  bar (mem);
+}
+}
+
+/* { dg-final { scan-assembler 
{(?n)\n\tlt\t.*\n\tjne\t(\.L\d+)\n(.*\n)*\tcs\t.*\n\tber\t%r14\n\1:\n\tjg\tbar\n}
 } } */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 2f8779ee4b8..b20d34c15e9 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -579,6 +579,7 @@ extern rtl_opt_pass *make_pass_clean_state (gcc::context 
*ctxt);
 extern rtl_opt_pass *make_pass_branch_prob (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_value_profile_transformations (gcc::context
  *ctxt);
+extern rtl_opt_pass *make_pass_postreload_jump (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_postreload_cse (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_gcse2 (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_split_after_reload (gcc::context *ctxt);
-- 
2.19.1



Re: [PATCH 5/6] [RS6000] Use standard call patterns for __tls_get_addr calls

2018-11-28 Thread Alan Modra
On Wed, Nov 28, 2018 at 07:32:50AM -0600, Segher Boessenkool wrote:
> On Wed, Nov 28, 2018 at 11:37:42AM +1030, Alan Modra wrote:
> > On Tue, Nov 27, 2018 at 10:29:29AM -0600, Segher Boessenkool wrote:
> > > Looks fine.  Thank you for the cleanup!  Okay for trunk, but please do the
> > > extra testing.
> > 
> > Huh, local testing of -mno-tls-markers showed a lack of a
> > TARGET_TLS_MARKERS check in rs6000_call_template_1.  Likely this would
> > blow up on AIX.  I'll test with the following delta.
> 
> Sounds good.

For the record, this is the patch I committed.  Besides the delta
posted previously, I managed to combine the tls_gdld_aix and
tls_gdld_sysv insns into a single insn, tls_gdld_nomark.

* config/rs6000/predicates.md (unspec_tls): New.
* config/rs6000/rs6000-protos.h (rs6000_call_template),
(rs6000_sibcall_template): Update prototype.
(rs6000_longcall_ref): Delete.
(rs6000_call_sysv): Declare.
* config/rs6000/rs6000.c (edit_tls_call_insn): New function.
(global_tlsarg): New variable.
(rs6000_legitimize_tls_address): Rewrite __tls_get_addr call
handling.
(print_operand): Extract UNSPEC_TLSGD address operand.
(rs6000_call_template, rs6000_sibcall_template): Remove arg
parameter, extract from second call operand instead.
(rs6000_longcall_ref): Make static, localize vars.
(rs6000_call_aix): Rename parameter to reflect new usage.  Take
tlsarg from global_tlsarg.  Don't create unused rtl or nop insns.
(rs6000_sibcall_aix): Rename parameter to reflect new usage.  Take
tlsarg from global_tlsarg.
(rs6000_call_sysv): New function.
* config/rs6000/rs6000.md: Adjust rs6000_call_template and
rs6000_sibcall_template throughout.
(tls_gd_aix, tls_gd_sysv, tls_gd_call_aix, tls_gd_call_sysv): Delete.
(tls_ld_aix, tls_ld_sysv, tls_ld_call_aix, tls_ld_call_sysv): Delete.
(tls_gdld_nomark): New insn.
(tls_gd): Swap operand order.  Simplify mode selection.
(tls_gd_high, tls_gd_low): Swap operand order.
(tls_ld): Remove const_int 0 vector element from UNSPEC_TLSLD.
Simplify mode selection.
(tls_ld_high, tls_ld_low): Similarly adjust UNSPEC_TLSLD.
(call, call_value): Don't assert for second call operand.
Use rs6000_call_sysv.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 5589ea19519..2c297fc45e8 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -997,6 +997,13 @@ (define_predicate "rs6000_tls_symbol_ref"
   (and (match_code "symbol_ref")
(match_test "RS6000_SYMBOL_REF_TLS_P (op)")))
 
+;; Return 1 for the UNSPEC used in TLS call operands
+(define_predicate "unspec_tls"
+  (match_code "unspec")
+{
+  return XINT (op, 1) == UNSPEC_TLSGD || XINT (op, 1) == UNSPEC_TLSLD;
+})
+
 ;; Return 1 if the operand, used inside a MEM, is a valid first argument
 ;; to CALL.  This is a SYMBOL_REF, a pseudo-register, LR or CTR.
 (define_predicate "call_operand"
diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index f1a294a3617..dd930bb2da6 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -105,8 +105,8 @@ extern int ccr_bit (rtx, int);
 extern void rs6000_output_function_entry (FILE *, const char *);
 extern void print_operand (FILE *, rtx, int);
 extern void print_operand_address (FILE *, rtx);
-extern const char *rs6000_call_template (rtx *, unsigned int, const char *);
-extern const char *rs6000_sibcall_template (rtx *, unsigned int, const char *);
+extern const char *rs6000_call_template (rtx *, unsigned int);
+extern const char *rs6000_sibcall_template (rtx *, unsigned int);
 extern const char *rs6000_indirect_call_template (rtx *, unsigned int);
 extern const char *rs6000_indirect_sibcall_template (rtx *, unsigned int);
 extern enum rtx_code rs6000_reverse_condition (machine_mode,
@@ -130,7 +130,6 @@ extern void rs6000_expand_atomic_op (enum rtx_code, rtx, 
rtx, rtx, rtx, rtx);
 extern void rs6000_emit_swdiv (rtx, rtx, rtx, bool);
 extern void rs6000_emit_swsqrt (rtx, rtx, bool);
 extern void output_toc (FILE *, rtx, int, machine_mode);
-extern rtx rs6000_longcall_ref (rtx);
 extern void rs6000_fatal_bad_address (rtx);
 extern rtx create_TOC_reference (rtx, rtx);
 extern void rs6000_split_multireg_move (rtx, rtx);
@@ -198,6 +197,7 @@ extern void rs6000_split_stack_space_check (rtx, rtx);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern void rs6000_call_aix (rtx, rtx, rtx, rtx);
 extern void rs6000_sibcall_aix (rtx, rtx, rtx, rtx);
+extern void rs6000_call_sysv (rtx, rtx, rtx, rtx);
 extern void rs6000_aix_asm_output_dwarf_table_ref (char *);
 extern void get_ppc476_thunk_name (char name[32]);
 extern bool rs6000_overloaded_builtin_p (enum rs6000_builtins);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 

Re: Fix hashtable node deallocation

2018-11-28 Thread François Dumont

I am unclear about this patch, is it accepted ?


On 11/19/18 10:19 PM, François Dumont wrote:

On 11/19/18 1:34 PM, Jonathan Wakely wrote:

On 10/11/18 22:40 +0100, François Dumont wrote:
While working on a hashtable enhancement I noticed that we are not 
using the correct method to deallocate node if the constructor 
throws in _ReuseOrAllocNode operator(). I had to introduce a new 
_M_deallocate_node_ptr for that as node value shall not be destroy 
again.


I also check other places and noticed that a __node_type destructor 
call was missing.


That's intentional. The type has a trivial destructor, so its storage
can just be reused, we don't need to destroy it.



Ok, do you want to also remove the other call to ~__node_type() then ?

Here is the updated patch and the right ChangeLog entry:

    * include/bits/hashtable_policy.h
(_Hashtable_alloc<>::_M_deallocate_node_ptr(__node_type*)): New.
    (_Hashtable_alloc<>::_M_deallocate_node(__node_type*)): Use latter.
(_ReuseOrAllocNode<>::operator<_Arg>()(_Arg&&)): Likewise.
    (_Hashtable_alloc<>::_M_allocate_node): Add ~__node_type call.
    * libstdc++-v3/testsuite/util/testsuite_allocator.h
    (CustomPointerAlloc<>::allocate(size_t, pointer)): Replace by...
    (CustomPointerAlloc<>::allocate(size_t, const_void_pointer)): 
...this.

    * testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Add
    check.

Ok to commit ?

François





[Bug target/88255] New: Thumb-1: GCC too aggressive on mul->lsl/sub/add optimization

2018-11-28 Thread husseydevin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88255

Bug ID: 88255
   Summary: Thumb-1: GCC too aggressive on mul->lsl/sub/add
optimization
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: husseydevin at gmail dot com
  Target Milestone: ---

I might be wrong, but it appears that GCC is too aggressive in its conversion
from multiplication to shift+add when targeting Thumb-1

It is true that, for example, the Cortex-M0 can have the small multiplier and a
16 cycle shift sequence would be faster. However, I was targeting arm7tdmi
(-march=armv4t -mthumb -O3 -mtune=arm7tdmi) which, if I am not mistaken, uses
one cycle for every 8 bits.

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0234b/i102180.html

However, looking in the source code, I notice that the loop is dividing by 4. I
think it might be a bug that is causing the otherwise 7 (I think) cycle
sequence in the code below to be considered as having a weight of 18 cycles.

https://github.com/gcc-mirror/gcc/blob/master/gcc/config/arm/arm.c#L8959

I could be wrong, but one of the things I noticed is that very old versions of
GCC (2.95) will not perform this many shifts, and that Clang, when given the 
transpiled output in C and targeted for the same platform, will actually
convert it back into a ldr/mul.

However, when targeting cortex-m0plus.small-multiply, it will still turn it
into multiplication.

Code example: 

  unsigned MultiplyByPrime(unsigned val)
  {
  return val * 2246822519U;
  }

  MultiplyByPrime:
 lslsr3, r0, #7 @ unsigned ret = val << 7;
 subsr3, r3, r0 @ ret -= val;
 lslsr3, r3, #5 @ ret <<= 5;
 subsr3, r3, r0 @ ret -= val;
 lslsr3, r3, #2 @ ret <<= 2;
 addsr3, r3, r0 @ ret += val;
 lslsr2, r3, #3 @ unsigned tmp = ret << 3;
 addsr3, r3, r2 @ ret += tmp;
 lslsr3, r3, #1 @ ret <<= 1;
 addsr3, r3, r0 @ ret += val;
 lslsr3, r3, #6 @ ret <<= 6;
 addsr3, r3, r0 @ ret += val;
 lslsr2, r3, #4 @ tmp = ret << 4;
 subsr3, r2, r3 @ ret = tmp - ret;
 lslsr3, r3, #3 @ ret <<= 3;
 subsr0, r3, r0 @ ret -= val;
 bx  lr @ return ret;

Re: [PATCH] make function_args_iterator a proper iterator

2018-11-28 Thread Jeff Law
On 11/20/18 8:28 AM, Martin Sebor wrote:
> On 11/20/2018 02:21 AM, Richard Biener wrote:
>> On Mon, Nov 19, 2018 at 4:36 PM Martin Sebor  wrote:
>>>
>>> On 11/19/2018 03:32 AM, Richard Biener wrote:
 On Sat, Nov 17, 2018 at 12:05 AM Martin Sebor  wrote:
>
> To encourage and simplify the adoption of iterator classes in
> GCC the attached patch turns the function_args_iterator struct
> into an (almost) proper C++ iterator class that can be used
> the same way as traditional forward iterators.
>
> The patch also replaces all of the 26 uses of the legacy
> FOREACH_FUNCTION_ARGS macro with ordinary for loops that use
> function_args_iterator directly, and also poisons both
> FOREACH_FUNCTION_ARGS and the unused FOREACH_FUNCTION_ARGS_PTR
> macros.
>
> The few dozen (hundred?) existing uses of for loops that iterate
> over function parameter types using the TREE_CHAIN() macro can
> be relatively easily modified to adopt the iterator approach over
> time.  (The patch stops of short of making this change.)
>
> Eventually, when GCC moves to more a recent C++ revision, it will
> become possible to simplify the for loops to make use of the range
> based for loop syntax along the lines of:
>
>    for (auto argtype: function_args (functype))
>  {
>    ...
>  }
>
> Tested on x86_64-linux, and (lightly) on powerpc64le-linux using
> a cross-compiler.  I'll test the changes to the other back ends
> before committing.

 This isn't stage3 material.
>>>
>>> In the response referenced below Jeff requested I make use of
>>> iterators in my patch.  This simply does what he asked for,
>>> except throughout all of GCC.
>>
>> I don't think he said you should invent new iterators - we have
>> existing ones.
> 
> The patch doesn't add a new iterator: it makes the existing
> function_args_iterator a proper iterator class with the expected
> iterator members like increment and equality operator, to make
> it usable in contexts where other iterators (and pointers) are
> expected.
The way to go would have been to just use the existing iterator in your
patch and queue a gcc-10 change to a proper iterator.


Re: [PATCH v4] Add sinh(atanh(x)) and cosh(atanh(x)) optimizations

2018-11-28 Thread Jeff Law
On 11/27/18 12:38 PM, Giuliano Augusto Faulin Belinassi wrote:
> Only do this optimization if funsafe-math and -fno-math-errno are
> enabled, as pointed in the previous iteration.
> 
> Also added one more test case to ensure that fno-math-errno is
> required for the optimization.
> 
> Special thanks for Wilco Dijsktra for all his help :-)
> 
> gcc/ChangeLog
> 2018-11-27  Giuliano Belinassi  
> 
> * match.pd (sinh (atanh (x))): New simplification rules.
> (cosh (atanh (x))): Likewise.
> 
> gcc/testsuite/ChangeLog
> 2018-11-27  Giuliano Belinassi  
> 
> * gcc.dg/sinhatanh-1.c: New test.
> * gcc.dg/sinhatanh-2.c: New test.
> * gcc.dg/sinhatanh-3.c: New test.
> 
> 
> sinhatanhv4.patch
> 
> Index: gcc/match.pd
> ===
> --- gcc/match.pd  (revision 266469)
> +++ gcc/match.pd  (working copy)
> @@ -4342,6 +4342,25 @@
>(rdiv { t_one; } (sqrts (plus (mult @0 @0) { t_one; })))
>(copysigns { t_zero; } @0))
>  
> + (if (!flag_errno_math)
You're missing the flag_unsafe_math_optimizations check AFAICT.  Right?

jeff


Re: [PATCH 1/4] introduce struct strlen_data_t into gimple-fold

2018-11-28 Thread Jeff Law
On 11/25/18 5:05 PM, Martin Sebor wrote:
> 
>> If so, then I think we need
>> to look for a better name than MAXSIZE and MAXLEN.
> 
> I find these names quite fitting and I'm not sure what might work
> better.  I renamed MAXSIZE to MAXBOUND but nothing comes to mind
> as a replacement for MAXLEN.  Please suggest something you think
> is better.
> 
>>
>>> +  /* When non-null, NONSTR refers to the declaration known to store
>>> + an unterminated constant character array, as in:
>>> + const char s[] = { 'a', 'b', 'c' };
>>> + It is used to diagnose uses of such arrays in functions such as
>>> + strlen() that expect a nul-terminated string as an argument.  */
>>> +  tree nonstr;
>> So rather than NONSTR, DECL may make more sense -- if for no other
>> reason than you don't have to think in terms of "not a string".
> 
> Done, but I think DECL is a poor choice for the reasons below.
> 
> The field is only set when the thing the object refers to is
> a character array that is not a string.  It identifies the first
> array the expression refers to that's not a terminated string
> (there could be multiple).  I can't think of anything else one
> might want to think of it as than "a declaration of an array
> that is not a string."
> 
> As a name, DECL is generic and used all over the place for any
> sort of a declaration so it's not a good choice also for that
> reason.  It's only marginally more descriptive that the pervasive
> NODE or T, but just as useless to grep for (which I have been
> relying on when working with this patch).
> 
> I have been using the name NONSTR in all contexts where
> I introduced the unterminated array handling, so renaming
> the member to DECL makes this scheme inconsistent
NONSTR requires you to think in the negative and it sounds more like a
boolean property to me, but it's actually carrying more information than
just a boolean.

I'm certainly not wed to DECL.  If you've got a better name, please
suggest one.


> 
>>> +  /* ELTSIZE is set to 1 for single byte character strings, and 2 or 4
>>> + for wide characer strings.  */
>>> +  unsigned eltsize;
>> Bernd's suggestion that we separate the input vs output paramters may be
>> a reasonable one -- I think this is the only in-parameter you're passing
>> with the structure, right?  And everything else is a pure output?  If so
>> we may be better off continuing to pass the element size as a separate
>> parameter.
> 
> I changed it in the updated patch.  I had chosen to make it
> a member to reduce the number of arguments to these functions and
> in anticipation of having them update it before returning if they
> discover that the actual element size doesn't match the expected
> size, as in:
> 
>   printf ("%ls", "narrow string");
> 
> Similarly to what I proposed here:
>   https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01321.html
> 
> I don't see what has been gained by making it an argument again.
It's the separation of inputs from outputs he was trying to achieve that
I said *may* be a reasonable one.

It's not always a good thing, nor is it always a bad thing.  If there
are subsequent patches are going to have callees changing it, then it
absolutely makes sense to include it.  But adding it to the structure
for the mere sake of reducing arguments may not.

And just to be clear, I'm not inherently against pulling stuff into
structures and classes.   Quite the opposite.

> 
>>> +  /* FLEXARRAY is true if the range of the string lengths has been
>>> + obtained from the upper bound of an array at the end of a struct.
>>> + Such an array may hold a string that's longer than its upper bound
>>> + due to it being used as a poor-man's flexible array member.  */
>>> +  bool flexarray;
>>> +
>>> +  /* Set ELTSIZE and value-initialize all other members.  */
>>> +  strlen_data_t (unsigned eltbytes)
>>> +    : minlen (), maxlen (), maxsize (), nonstr (), eltsize (eltbytes),
>>> +  flexarray () { }
>> I think if you pull ELTSIZE out and pass it as a distinct parameter,
>> then you don't need a ctor and you can have a POD.  You can then
>> initialize with memset rather than having to individually initialize
>> each field -- meaning it's easier to add more output fields in the
>> future.
> 
> Without ELTSIZE neither a ctor nor memset is necessary for
> initialization.  This works too and is the preferred style
> in C++ 98:
> 
>   c_strlen_data data = { };
I thought that didn't work somewhere.  I certainly would have preferred
that over memset when I was twiddling yours and Bernd's earlier patches.



> 
>>
>> I don't think any of the suggestions above change the behavior of the
>> patch.  Let's hold off committing though (I assume you've got a GIT
>> topic branch where you can make these changes and update the subsequent
>> patches independent of everything else...)
> 
> I do but I don't know how to make these changes without having
> to resolve all the conflicts with the intervening changes on
> trunk at each step so 

Re: question about attribute aligned for functions

2018-11-28 Thread Jeff Law
On 11/27/18 11:57 AM, Martin Sebor wrote:
> On 11/26/18 3:37 PM, Jeff Law wrote:
>> On 11/23/18 12:31 PM, Martin Sebor wrote:
>>> GCC currently accepts the declaration of f0 below but ignores
>>> the attribute.  On aarch64 (and I presume on other targets with
>>> a default function alignment greater than 1), GCC rejects f1
>>> with an error, even though it accepts -falign-functions=1
>>> without as much as a warning.
>>>
>>> Clang, on the other hand, rejects f0 with a hard error because
>>> the alignment is not a power of two, but accepts f1 and appears
>>> to honor the attribute.  It also accepts -falign-functions=1.
>>>
>>> I think diagnosing f0 with a warning is helpful because an explicit
>>> zero alignment is most likely a mistake (especially when it comes
>>> from a macro or some computation).
>>>
>>> But I don't see a good reason to reject a program that specifies
>>> a smaller alignment for a function when the default (or minimum)
>>> alignment is greater.  A smaller alignment is trivially satisfied
>>> by a greater alignment so either accepting it or dropping it seems
>>> preferable to failing with an error (either could be with or without
>>> a warning).
>>>
>>>    __attribute__ ((aligned (0))) void f0 (void);   // accepted, ignored
>>>    __attribute__ ((aligned (1))) void f1 (void);   // aarch64 error
>>>    __attribute__ ((aligned (4))) void f4 (void);   // okay
>>>
>>> Does anyone know why GCC rejects the program, or can anyone think
>>> of a reason why GCC should not behave as suggested above?
>> Note we have targets that support single byte opcodes and do not have
>> any requirements for functions starting on any boundary.  mn103 comes to
>> mind.
>>
>> However, the attribute can't be used to decrease a function's alignment,
>> so values of 0 or 1 are in practice totally uninteresting and one could
>> make an argument to warn for them.
> 
> The attribute does reduce the default alignment at least on some
> targets.  For instance, on x86_64 it lowers it from the default
> 16 to as little as 2, but it silently ignores 1.
[ ... ]
You cannot use this attribute to decrease the alignment of a function,
only to increase it.  However, when you explicitly specify a function
alignment this overrides the effect of the
@option{-falign-functions} (@pxref{Optimize Options}) option for this
function.
[ ... ]

My reading of that would be that I would get an error/warning if I even
specified an alignment attribute which decreased the alignment.

If it instead said something like

You can not rely on this attribute to decrease ...

Then current behavior (where apparently you can decrease the alignment
on some ports) makes much more sense.

I guess it ultimately depends on how one interprets that tidbit from the
docs.

>>
>> Whether or not to warn in general if the alignment attribute is smaller
>> than the default may be subject to debate.  I guess it depends on the
>> general intent that we'd find in real world codes.
> 
> I would expect real world code to care about achieving at least
> the specified alignment.
If we only allow increasing, yes.  At which point what do the values 0
or 1 realistically mean?

If we allow decreasing, then the user may be asking for a smaller
alignment to achieve better code density.


> 
> A less restrictive alignment requirement is satisfied by providing
> a more restrictive one, and (the above notwithstanding) the manual
> documents that
> 
>   You cannot use this attribute to decrease the alignment of
>   a function, only to increase it.
> 
> So I wouldn't expect real code to be relying on decreasing
> the alignment.
> 
> That said, since GCC does make it possible to decrease the default
> alignment of functions, I can think of no reason for it not to
> continue when it's possible.  I.e., on targets with underaligned
> instruction reads to be honor requests for underaligned functions.
> On strictly aligned targets I think the safe thing to do is to
> quietly ignore requests for smaller alignments by default, and
> perhaps provide a new option to request a warning (with the warning
> being disabled by default).
> 
> Do you see a problem with this approach?
ISTM we need to figure out whether or not we consider an attempt to
lower the alignment as invalid vs we'll try, but no guarantees.

jeff


Re: [PATCH] be more permissive about function alignments (PR 88208)

2018-11-28 Thread Jeff Law
On 11/27/18 9:32 PM, Martin Sebor wrote:
> The tests for the new __builtin_has_attribute function have been
> failing on a number of targets because of a couple of assumptions
> that only hold on some.
> 
> First, they expect that it's safe to apply attribute aligned with
> a smaller alignment than the target provides when GCC rejects such
> arguments.  The tests pass on i86 and elsewhere but fail on
> strictly aligned targets like aarch64 or sparc.  After some testing
> and thinking I don't think this is helpful -- I believe it's better
> to instead silently accept attributes that ask for a less restrictive
> alignment than the function ultimately ends up with (see * below).
> This is what testing shows Clang does on those targets.  The attached
> patch implements this change.
> 
> Second, the tests assume that the priority forms of the constructor
> and destructor attributes are universally supported.  That's also
> not the case, even though the manual doesn't mention that.  To
> avoid these failures the attached patch moves the priority forms
> of the attribute constructor and destructor tests into its own
> file that's compiled only for init_priority targets.
> 
> Finally, I noticed that attribute aligned accepts zero as
> an argument even though it's not a power of two as the manual
> documents as a precondition (zero is treated the same as
> the attribute without an argument).  A zero argument is likely
> to be a mistake, especially when the zero comes from macro
> expansion, that users might want to know about.  Clang rejects
> a zero with an error but I think a warning is more in line with
> established GCC practice, so the patch also implements that.
> 
> Besides x86_64-linux, I tested this change with cross-compilers
> for aarch64-linux-elf, powerpc64le-linux, and sparc-solaris2.11.
> I added tests for the changed aligned attribute for those targets
> To make the gcc.dg/builtin-has-attribute.c test pass with
> the cross-compilers I changed from a runtime test into a compile
> only one.
> 
> Martin
> 
> PS I'm not happy about duplicating the same test across all those
> targets.  It would be much nicer to have a single test somewhere
> in dg.exp #include a target-specific header with macros describing
> the target-specific parameters.
> 
> [*] See the following discussion for some background:
>   https://gcc.gnu.org/ml/gcc/2018-11/msg00127.html
> 
> gcc-88208.diff
> 
> PR testsuite/88208 - new test case c-c++-common/builtin-has-attribute-3.c in 
> r266335 has multiple excess errors
> 
> gcc/ChangeLog:
>   PR testsuite/88208
>   * doc/extend.texi (attribute constructor): Clarify.
> 
> gcc/c/ChangeLog:
> 
>   PR testsuite/88208
>   * c-decl.c (declspec_add_alignas): Adjust call to check_user_alignment.
> 
> gcc/c-family/ChangeLog:
> 
>   PR testsuite/88208
>   * c-attribs.c (common_handle_aligned_attribute): Silently avoid setting
>   alignments to values less than the target requires.
>   (has_attribute): For attribute aligned consider both the attribute
>   and the alignment bits.
>   * c-common.c (c_init_attributes): Optionally issue a warning for
>   zero alignment.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR testsuite/88208
>   * gcc.dg/attr-aligned-2.c: New test.
>   * gcc.dg/builtin-has-attribute.c: Adjust.
>   * c-c++-common/builtin-has-attribute-2.c: Same.
>   * c-c++-common/builtin-has-attribute-3.c: Same.
>   * c-c++-common/builtin-has-attribute-4.c: Same.
>   * c-c++-common/builtin-has-attribute-5.c: New test.
>   * gcc.target/aarch64/attr-aligned.c: Same.
>   * gcc.target/i386/attr-aligned.c: Same.
>   * gcc.target/powerpc/attr-aligned.c: Same.
>   * gcc.target/sparc/attr-aligned.c: Same.
> 
> Index: gcc/c/c-decl.c
> ===
> --- gcc/c/c-decl.c(revision 266521)
> +++ gcc/c/c-decl.c(working copy)
> @@ -11061,12 +11061,15 @@ struct c_declspecs *
>  declspecs_add_alignas (location_t loc,
>  struct c_declspecs *specs, tree align)
>  {
> -  int align_log;
>specs->alignas_p = true;
>specs->locations[cdw_alignas] = loc;
>if (align == error_mark_node)
>  return specs;
> -  align_log = check_user_alignment (align, false, true);
> +
> +  /* Only accept the alignment if it's valid and greater than
> + the current one.  Zere is invalid but by C11 required to be
> + silently ignored.  */
s/Zere/Zero/

OK with the nit fixed.
jeff


Re: [PATCH] Add testcase for already fixed PR c++/87539

2018-11-28 Thread Jason Merrill

On 11/28/18 3:13 AM, Jakub Jelinek wrote:

Hi!

The PR86687 fix fixed ICE on the following testcase on mingw.  Because the
PR was about fixing debug info, I think it is worth adding this testcase to
the testsuite.  The testcase is as far as creduce managed to reduce it after
almost 18 hours.  Tested on x86_64-linux and with a cross to
x86_64-w64-mingw32, ok for trunk?


OK.

Jason



Re: [PING][PATCH] correct handling of EXCESS_PRECISION_EXPR in function calls (PR 88091)

2018-11-28 Thread Jeff Law
On 11/28/18 5:05 PM, Martin Sebor wrote:
> On 11/27/18 10:47 AM, Joseph Myers wrote:
>> On Mon, 26 Nov 2018, Martin Sebor wrote:
>>
>>> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01674.html
>>
>> This patch needs an update to the comment on convert_argument to explain
>> the semantics of the new valtype parameter and how it differs from the
>> previously used TREE_TYPE (val).
> 
> I've added comments for all the function arguments.
> 
> FWIW, none of these were documented before and I hadn't really
> taken the time to study them all carefully to understand how
> each of them was being used (I just took the undocumented local
> variables and made them arguments).  So it took a bit of effort
> to figure out.  Hopefully it's close enough.
> 
> Martin
> 
> gcc-88091.diff
> 
> PR c/88091 - c-c++-common/Wconversion-real.c etc. FAIL
> 
> gcc/c/ChangeLog:
> 
>   PR c/88091
>   * c-typeck.c (convert_argument): Add a parameter.  Adjust indentation.
>   (convert_arguments): Add comments.  Pass additional argument to
>   the function above.
OK.
jeff


Re: C++ PATCH for c++/88184, ICE when treating name as template-name

2018-11-28 Thread Jason Merrill

On 11/28/18 10:48 AM, Marek Polacek wrote:

Since P0846R0 was implemented, a name will be treated as a template-name when
it is an unqualified-id followed by a < and name lookup finds either one or
more functions or finds nothing, in order to potentially cause ADL to be 
performed.

In this case, we had

   f ();

where we'd found a decl for f (not a TEMPLATE_DECL) and < follows.


From the backtrace in the PR, it seems as though we're treating f as 
non-dependent, which is wrong.  type_dependent_expression_p only looks 
at the arguments of a TEMPLATE_ID_EXPR if it has unknown_type_node, so 
we probably want to give it that type.


Jason


Re: [PATCH v4] Repeat jump threading after combine

2018-11-28 Thread Jeff Law
On 11/28/18 1:51 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Nov 27, 2018 at 05:07:11PM +0100, Ilya Leoshkevich wrote:
>> perf diff -c wdiff:1,1 shows, that there is just one function
>> (htab_traverse) that is significantly slower now:
>>
>>  2.98% 11768891764  exe[.] htab_traverse
>>  1.91%   563949986  exe[.] 
>> compute_dominance_frontiers_1
>>
>> The additional cycles consumed by this function matches the overall
>> number of additionaly consumed cycles, and the contribution of the
>> runner up (compute_dominance_frontiers_1) is 20 times smaller, so I
>> think it's really just this one function.
>>
>> However, the generated assembly is completely identical in both cases!
> 
> Ugh.  We have seen this before :-(
> 
> Thanks for investigating  I don't consider the Power degradation as really
> caused by your patch, then.
> 
>> I saw similar situations in the past, so I tried adding a nop to
>> htab_traverse:
>>
>> --- hashtab.c
>> +++ hashtab.c
>> @@ -529,6 +529,8 @@ htab_traverse (htab, callback, info)
>>   htab_trav callback;
>>   PTR info;
>>  {
>> +  __asm__ volatile("nop\n");
>> +
>>PTR *slot = htab->entries;
>>PTR *limit = slot + htab->size;
>>
>> and made a 5x re-run.  The new measurements are 227.01s and 227.44s
>> (+0.19%).  With two nops I get 227.25s and 227.29s (+0.02%), which also
>> looks like noise.
>>
>> Can this be explained by some microarchitectural quirk after all?
> 
> Two frequent branch targets that get thrown into the same bin for prediction.
> Results change based on random compiler changes, ASLR settings, phase of the
> moon, how many people in your neighbourhood have had porridge for breakfast
> this morning, etc.
FWIW, I've found the hashtable code particularly vulnerable to this kind
of performance jitter.   I've long suspected it's more related to the
data locations as I can see the jitter with the same binary running
under valgrind/cachegrind control.  ASLR being the most likely culprit
in my mind.

However, in this case it seems different -- adding a NOP is changing the
instruction stream.Could be collisions in the branch predictors or
something similar.

Ilya, can you repost the final patch?
Jeff

> 
> 
> Segher
> 



Re: [PATCH] [PR86397] set p_t_decl while canonicalizing eh specs for mangling

2018-11-28 Thread Jason Merrill

On 11/27/18 11:54 PM, Alexandre Oliva wrote:

On Nov 27, 2018, Jason Merrill  wrote:


On 11/22/18 6:40 PM, Alexandre Oliva wrote:

Mangling visits the base template function type, prior to template
resolution, and on such types, exception specifications may contain
unresolved noexcept expressions.  nothrow_spec_p is called on them
even when exception specifications are not part of function types, and
it rejects unresolved noexcept expressions if processing_template_decl
is not set.



The problem here is that the noexcept expression is unresolved even
though it isn't dependent


Yeah, but that seems to be on purpose, according to these comments, that
precede the hunk below.

   /* This isn't part of the signature, so don't bother trying to evaluate
  it until instantiation.  */



Taking out the 'flag_noexcept_type && ' subexpr fixes the problem, but
defeats the intended deferral of unnecessary computation:

diff --git a/gcc/cp/except.c b/gcc/cp/except.c
index 3449b59b3cc0..dbd233c94c3a 100644
--- a/gcc/cp/except.c
+++ b/gcc/cp/except.c
@@ -1193,7 +1193,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain)
   it until instantiation.  */
if (TREE_CODE (expr) != DEFERRED_NOEXCEPT
&& (!processing_template_decl
- || (flag_noexcept_type && !value_dependent_expression_p (expr
+ || !value_dependent_expression_p (expr)))
  {
expr = perform_implicit_conversion_flags (boolean_type_node, expr,
complain,


Let's go with this.  And remove the comment.

And the !processing_template_decl is also redundant, since that's 
checked at the top of value_dependent_expression_p.


Jason


[Bug rtl-optimization/85899] ICE in find_fallthru_edge_from, at haifa-sched.c:8059

2018-11-28 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85899

--- Comment #1 from Arseny Solokha  ---
gcc/testsuite/gcc.dg/vect/vect-cond-4.c does not need autopar:

% powerpc-e300c3-linux-gnu-gcc-9.0.0-alpha20181125 -O1 -fschedule-insns
-fsel-sched-pipelining -fsel-sched-pipelining-outer-loops
-fselective-scheduling -funroll-all-loops -fno-section-anchors -fno-tree-ch
-fno-tree-dominator-opts --param max-completely-peeled-insns=52 --param
max-unrolled-insns=29 -c gcc/testsuite/gcc.dg/vect/vect-cond-4.c
during RTL pass: sched1
gcc/testsuite/gcc.dg/vect/vect-cond-4.c: In function 'foo':
gcc/testsuite/gcc.dg/vect/vect-cond-4.c:37:1: internal compiler error: in
find_fallthru_edge_from, at haifa-sched.c:8085
   37 | }
  | ^
0x7462ba find_fallthru_edge_from(basic_block_def*)
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/haifa-sched.c:8085
0xcde34f in_fallthru_bb_p
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sel-sched.c:608
0xcde34f extract_new_fences_from
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sel-sched.c:680
0xcde34f calculate_new_fences
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sel-sched.c:7429
0xcde34f sel_sched_region_2
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sel-sched.c:7505
0xcde892 sel_sched_region_1
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sel-sched.c:7546
0xcdf17e sel_sched_region(int)
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sel-sched.c:7647
0xce01b9 run_selective_scheduling()
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sel-sched.c:7733
0xcbeef4 rest_of_handle_sched
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sched-rgn.c:3718
0xcbeef4 execute
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20181125/work/gcc-9-20181125/gcc/sched-rgn.c:3828

Re: [PATCH, rs6000] Fix PR87496: ICE in aggregate_value_p at gcc/function.c:2046

2018-11-28 Thread Peter Bergner
On 11/28/18 3:27 PM, Peter Bergner wrote:
> Ok for mainline once bootstrap and regtesting are complete and clean?
> 
> Peter
> 
> 
> gcc/
>   PR target/87496
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Disallow
>   -mabi=ieeelongdouble without both -mpopcntd and -mvsx.
> 
> gcc/testsuite/
>   PR target/87496
>   * gcc.target/powerpc/pr87496.c: New test.

FYI, bootstrap and regtesting completed with no regressions.

Peter



Re: [C++ PATCH] Fix wrong-code with generic lambda (PR c++/86943)

2018-11-28 Thread Jason Merrill

On 11/27/18 5:41 PM, Jakub Jelinek wrote:

On Tue, Nov 27, 2018 at 05:20:56PM -0500, Jason Merrill wrote:

On 11/23/18 4:15 PM, Jakub Jelinek wrote:

Hi!

On the following testcase, the call to operator () is marked as
CALL_FROM_THUNK_P and therefore genericization ignores all subtrees thereof.
Unfortunately, one of the arguments is a move ctor call which is not itself
CALL_FROM_THUNK_P and thus we need to genericize its arguments, otherwise
we pass address of a temporary which holds a reference value instead of the
reference itself.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Or should CALL_FROM_THUNK_P not be set in this call (it is set in
maybe_add_lambda_conv_op and then copied over during tsubst*).


The call with CALL_FROM_THUNK_P set should be inside the thunk whose address
we get when converting the lambda to a function pointer (on return from
foo).  Its arguments should just be the parms of that thunk, I don't know
where this temporary is coming from.


In the *.original dump this is:
<::operator() (0B, _EXPR >>
   (struct S &)  ) >;
return;

where CALL_FROM_THUNK_P is set on the call to
foo()operator() and D.2402 is the
PARM_DECL which shouldn't have the &.
In *.gimple it is:
foo()_FUN (struct S & restrict D.2402) // the arg is 
DECL_BY_REFERENCE
{
   struct S D.2406;
   struct S & D.2413;

   D.2413 = D.2402;
   S::S (, ); // this ctor wants S & arg, so it should be just 
D.2402
   try
 {
   try
 {
   foo()operator() (0B, );
 }
   finally
 {
   S::~S ();
 }
 }
   finally
 {
   D.2406 = {CLOBBER};
 }
   return;
}

CALL_FROM_THUNK_P is set on the operator() call in maybe_add_lambda_conv_op
and at that the call is:
  
 readonly
 arg:0 
 constant
 arg:0 
 constant
 arg:0 >>>
 arg:1 >>
 arg:0  chain >
 side-effects
 arg:0 
 side-effects arg:0 >>>
The rest (TARGET_EXPR, ctor etc.) appears during instantiation.


Sounds like that's the bug, then; instantiation is failing to preserve 
the pass-through nature of the call.


Jason


Re: [C++ Patch] Improve compute_array_index_type locations

2018-11-28 Thread Jason Merrill

On 11/6/18 4:01 AM, Paolo Carlini wrote:
when I improved create_array_type_for_decl I didn't notice that it calls 
compute_array_index_type as helper, which simply needs to have the 
location information propagated. Tested x86_64-linux.


This looks like it will use the declarator-id location in diagnostics 
about the array bound expression, which isn't optimal; if 'size' has a 
location, we should use that instead.


Jason


[Bug fortran/88254] New: Support construct name for CHANGE TEAM & END TEAM

2018-11-28 Thread weeks at iastate dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88254

Bug ID: 88254
   Summary: Support construct name for CHANGE TEAM & END TEAM
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: weeks at iastate dot edu
  Target Milestone: ---

The Fortran 2018 CHANGE TEAM and END TEAM statements should support a construct
name. Support is current lacking in gfortran 8.2.0; see the example below (and
also example C.6.8 from the N2146 draft of the Fortran 2018 standard).

===
$ cat team_construct_name.f90
program test_team_construct_name
   use, intrinsic :: iso_fortran_env, only: team_type
   implicit none

   type(team_type) :: team

   form team (1, team)

   team_construct_name: change team(team)
  if (.true.) exit team_construct_name
  error stop 'fail'
   end team team_construct_name

   write(*,*) 'pass'
end program
$ gfortran-mp-8 team_construct_name.f90
team_construct_name.f90:9:3:

team_construct_name: change team(team)
   1
Error: Unclassifiable statement at (1)
team_construct_name.f90:10:42:

   if (.true.) exit team_construct_name
  1
Error: Name 'team_construct_name' in EXIT statement at (1) is unknown
team_construct_name.f90:12:12:

end team team_construct_name
1
Error: Unclassifiable statement at (1)
$ gfortran-mp-8 -v 
Using built-in specs.
COLLECT_GCC=gfortran-mp-8
COLLECT_LTO_WRAPPER=/opt/local/libexec/gcc/x86_64-apple-darwin17/8.2.0/lto-wrapper
Target: x86_64-apple-darwin17
Configured with:
/opt/local/var/macports/build/_opt_bblocal_var_buildworker_ports_build_ports_lang_gcc8/gcc8/work/gcc-8.2.0/configure
--prefix=/opt/local --build=x86_64-apple-darwin17
--enable-languages=c,c++,objc,obj-c++,lto,fortran --libdir=/opt/local/lib/gcc8
--includedir=/opt/local/include/gcc8 --infodir=/opt/local/share/info
--mandir=/opt/local/share/man --datarootdir=/opt/local/share/gcc-8
--with-local-prefix=/opt/local --with-system-zlib --disable-nls
--program-suffix=-mp-8 --with-gxx-include-dir=/opt/local/include/gcc8/c++/
--with-gmp=/opt/local --with-mpfr=/opt/local --with-mpc=/opt/local
--with-isl=/opt/local --enable-stage1-checking --disable-multilib --enable-lto
--enable-libstdcxx-time --with-build-config=bootstrap-debug
--with-as=/opt/local/bin/as --with-ld=/opt/local/bin/ld
--with-ar=/opt/local/bin/ar --with-bugurl=https://trac.macports.org/newticket
--disable-tls --with-pkgversion='MacPorts gcc8 8.2.0_3'
Thread model: posix
gcc version 8.2.0 (MacPorts gcc8 8.2.0_3)
===

[Bug libstdc++/86910] std::filesystem::create_directories doesn't set error code or throw while violating postcondition.

2018-11-28 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86910

--- Comment #9 from Jonathan Wakely  ---
Author: redi
Date: Thu Nov 29 00:39:37 2018
New Revision: 266598

URL: https://gcc.gnu.org/viewcvs?rev=266598=gcc=rev
Log:
PR libstdc++/86910 fix filesystem::create_directories

Implement the proposed semantics from P1164R0, which reverts the changes
of LWG 2935. This means that failure to create a directory because a
non-directory already exists with that name will be reported as an
error.

While rewriting the function, also fix PR 87846, which is a result of
the C++17 changes to how a trailing slash on a path affects the last
component of a path.

PR libstdc++/86910
PR libstdc++/87846
* src/filesystem/ops.cc (experimental::create_directories): Report
an error when the path resolves to an existing non-directory (P1164).
* src/filesystem/std-ops.cc (create_directories): Likewise. Handle
empty filenames due to trailing slashes.
* testsuite/27_io/filesystem/operations/create_directories.cc: Test
when some component of the path exists and is not a directory. Test
trailing slashes.
* testsuite/experimental/filesystem/operations/create_directories.cc:
Likewise.

Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/src/filesystem/ops.cc
trunk/libstdc++-v3/src/filesystem/std-ops.cc
   
trunk/libstdc++-v3/testsuite/27_io/filesystem/operations/create_directories.cc
   
trunk/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc

[Bug libstdc++/87846] std::filesystem::create_directories with a path with a trailing slash does not create any directory

2018-11-28 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87846

--- Comment #2 from Jonathan Wakely  ---
Author: redi
Date: Thu Nov 29 00:39:37 2018
New Revision: 266598

URL: https://gcc.gnu.org/viewcvs?rev=266598=gcc=rev
Log:
PR libstdc++/86910 fix filesystem::create_directories

Implement the proposed semantics from P1164R0, which reverts the changes
of LWG 2935. This means that failure to create a directory because a
non-directory already exists with that name will be reported as an
error.

While rewriting the function, also fix PR 87846, which is a result of
the C++17 changes to how a trailing slash on a path affects the last
component of a path.

PR libstdc++/86910
PR libstdc++/87846
* src/filesystem/ops.cc (experimental::create_directories): Report
an error when the path resolves to an existing non-directory (P1164).
* src/filesystem/std-ops.cc (create_directories): Likewise. Handle
empty filenames due to trailing slashes.
* testsuite/27_io/filesystem/operations/create_directories.cc: Test
when some component of the path exists and is not a directory. Test
trailing slashes.
* testsuite/experimental/filesystem/operations/create_directories.cc:
Likewise.

Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/src/filesystem/ops.cc
trunk/libstdc++-v3/src/filesystem/std-ops.cc
   
trunk/libstdc++-v3/testsuite/27_io/filesystem/operations/create_directories.cc
   
trunk/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc

[PATCH] PR libstdc++/86910 fix filesystem::create_directories

2018-11-28 Thread Jonathan Wakely

Implement the proposed semantics from P1164R0, which reverts the changes
of LWG 2935. This means that failure to create a directory because a
non-directory already exists with that name will be reported as an
error.

While rewriting the function, also fix PR 87846, which is a result of
the C++17 changes to how a trailing slash on a path affects the last
component of a path.

PR libstdc++/86910
PR libstdc++/87846
* src/filesystem/ops.cc (experimental::create_directories): Report
an error when the path resolves to an existing non-directory (P1164).
* src/filesystem/std-ops.cc (create_directories): Likewise. Handle
empty filenames due to trailing slashes.
* testsuite/27_io/filesystem/operations/create_directories.cc: Test
when some component of the path exists and is not a directory. Test
trailing slashes.
* testsuite/experimental/filesystem/operations/create_directories.cc:
Likewise.

Tested powerpc64le-linux, committed to trunk.

commit 822dd7aa5c35109af9607e1fecd17724c296e0d7
Author: Jonathan Wakely 
Date:   Thu Nov 29 00:10:08 2018 +

PR libstdc++/86910 fix filesystem::create_directories

Implement the proposed semantics from P1164R0, which reverts the changes
of LWG 2935. This means that failure to create a directory because a
non-directory already exists with that name will be reported as an
error.

While rewriting the function, also fix PR 87846, which is a result of
the C++17 changes to how a trailing slash on a path affects the last
component of a path.

PR libstdc++/86910
PR libstdc++/87846
* src/filesystem/ops.cc (experimental::create_directories): Report
an error when the path resolves to an existing non-directory 
(P1164).
* src/filesystem/std-ops.cc (create_directories): Likewise. Handle
empty filenames due to trailing slashes.
* testsuite/27_io/filesystem/operations/create_directories.cc: Test
when some component of the path exists and is not a directory. Test
trailing slashes.
* 
testsuite/experimental/filesystem/operations/create_directories.cc:
Likewise.

diff --git a/libstdc++-v3/src/filesystem/ops.cc 
b/libstdc++-v3/src/filesystem/ops.cc
index 40cadbf6270..a6ae75ff734 100644
--- a/libstdc++-v3/src/filesystem/ops.cc
+++ b/libstdc++-v3/src/filesystem/ops.cc
@@ -423,6 +423,19 @@ fs::create_directories(const path& p, error_code& ec) 
noexcept
   ec = std::make_error_code(errc::invalid_argument);
   return false;
 }
+
+  file_status st = symlink_status(p, ec);
+  if (is_directory(st))
+return false;
+  else if (ec && !status_known(st))
+return false;
+  else if (exists(st))
+{
+  if (!ec)
+   ec = std::make_error_code(std::errc::not_a_directory);
+  return false;
+}
+
   std::stack missing;
   path pp = p;
 
@@ -431,24 +444,29 @@ fs::create_directories(const path& p, error_code& ec) 
noexcept
   ec.clear();
   const auto& filename = pp.filename();
   if (!is_dot(filename) && !is_dotdot(filename))
-   missing.push(pp);
-  pp.remove_filename();
+   {
+ missing.push(std::move(pp));
+ pp = missing.top().parent_path();
+   }
+  else
+   pp = pp.parent_path();
 }
 
   if (ec || missing.empty())
 return false;
 
+  bool created;
   do
 {
   const path& top = missing.top();
-  create_directory(top, ec);
-  if (ec && is_directory(top))
-   ec.clear();
+  created = create_directory(top, ec);
+  if (ec)
+   return false;
   missing.pop();
 }
-  while (!missing.empty() && !ec);
+  while (!missing.empty());
 
-  return missing.empty();
+  return created;
 }
 
 namespace
diff --git a/libstdc++-v3/src/filesystem/std-ops.cc 
b/libstdc++-v3/src/filesystem/std-ops.cc
index e266fa6d3f8..2f9a76ffaa1 100644
--- a/libstdc++-v3/src/filesystem/std-ops.cc
+++ b/libstdc++-v3/src/filesystem/std-ops.cc
@@ -646,38 +646,74 @@ fs::create_directories(const path& p, error_code& ec)
   ec = std::make_error_code(errc::invalid_argument);
   return false;
 }
+
+  file_status st = symlink_status(p, ec);
+  if (is_directory(st))
+return false;
+  else if (ec && !status_known(st))
+return false;
+  else if (exists(st))
+{
+  if (!ec)
+   ec = std::make_error_code(std::errc::not_a_directory);
+  return false;
+}
+
   std::stack missing;
   path pp = p;
 
-  while (pp.has_filename() && status(pp, ec).type() == file_type::not_found)
-{
-  ec.clear();
-  const auto& filename = pp.filename();
-  if (!is_dot(filename) && !is_dotdot(filename))
-   missing.push(pp);
-  pp = pp.parent_path();
-
-  if (missing.size() > 1000) // sanity check
-   {
- ec = std::make_error_code(std::errc::filename_too_long);
- return false;
-   }
-}
-
-  if (ec || 

Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-28 Thread Jonathan Wakely

On 28/11/18 12:12 -0500, Ed Smith-Rowland wrote:

Index: testsuite/21_strings/basic_string/erasure.cc
===
--- testsuite/21_strings/basic_string/erasure.cc(nonexistent)
+++ testsuite/21_strings/basic_string/erasure.cc(working copy)
@@ -0,0 +1,53 @@
+// { dg-do run { target c++2a } }
+


None of these new tests actually run by default, because they are
gated to only run for C++2a and the default is gnu++14. That means
they're all skipped as UNSUPPORTED.

(When I add new tests I always try to remember to check the
testsuite/libstdc++.sum file to make sure they are actually running).

The tests need an explicit -std option added via dg-options, which has
to come before the dg-do directive, otherwise the target check still
uses the default options i.e.

// { dg-options "-std=gnu++2a" }
// { dg-do run { target c++2a } }

With that added, most of them start to FAIL:

FAIL: 23_containers/deque/erasure.cc (test for excess errors)
UNRESOLVED: 23_containers/deque/erasure.cc compilation failed to produce 
executable
FAIL: 23_containers/unordered_set/erasure.cc (test for excess errors)
UNRESOLVED: 23_containers/unordered_set/erasure.cc compilation failed to 
produce executable
FAIL: 23_containers/vector/erasure.cc (test for excess errors)
UNRESOLVED: 23_containers/vector/erasure.cc compilation failed to produce 
executable
FAIL: experimental/deque/erasure.cc (test for excess errors)
UNRESOLVED: experimental/deque/erasure.cc compilation failed to produce 
executable
FAIL: experimental/forward_list/erasure.cc (test for excess errors)
UNRESOLVED: experimental/forward_list/erasure.cc compilation failed to produce 
executable
FAIL: experimental/list/erasure.cc (test for excess errors)
UNRESOLVED: experimental/list/erasure.cc compilation failed to produce 
executable
FAIL: experimental/vector/erasure.cc (test for excess errors)
UNRESOLVED: experimental/vector/erasure.cc compilation failed to produce 
executable


The errors are all like:

In file included from 
/home/jwakely/src/gcc/gcc/libstdc++-v3/testsuite/23_containers/vector/erasure.cc:21:
/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/vector: In function 
'void std::erase_if(std::vector<_Tp, _Alloc>&, _Predicate)':
/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/vector:101:
 error: 'remove_if' is not a member of 'std'; did you mean 'remove_cv'?
/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/vector: In function 
'void std::erase(std::vector<_Tp, _Alloc>&, const _Up&)':
/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/vector:109:
 error: 'remove' is not a member of 'std'; did you mean 'move'?

This is because std::remove and std::remove_if are defined in
 which is not included.

Could you please fix this ASAP?




[Bug target/88096] wrong inline AVX512F optimization

2018-11-28 Thread andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88096

Andi Kleen  changed:

   What|Removed |Added

 CC||andi-gcc at firstfloor dot org

--- Comment #1 from Andi Kleen  ---
Can you please attach a pre-processed test case of a file that shows the bug?

It's ok if it doesn't run, as long as the problem is clearly identified in the
assembler.

Then the test case could be likely minimized.

Re: [PING][PATCH] correct handling of EXCESS_PRECISION_EXPR in function calls (PR 88091)

2018-11-28 Thread Martin Sebor

On 11/27/18 10:47 AM, Joseph Myers wrote:

On Mon, 26 Nov 2018, Martin Sebor wrote:


Ping: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01674.html


This patch needs an update to the comment on convert_argument to explain
the semantics of the new valtype parameter and how it differs from the
previously used TREE_TYPE (val).


I've added comments for all the function arguments.

FWIW, none of these were documented before and I hadn't really
taken the time to study them all carefully to understand how
each of them was being used (I just took the undocumented local
variables and made them arguments).  So it took a bit of effort
to figure out.  Hopefully it's close enough.

Martin
PR c/88091 - c-c++-common/Wconversion-real.c etc. FAIL

gcc/c/ChangeLog:

	PR c/88091
	* c-typeck.c (convert_argument): Add a parameter.  Adjust indentation.
	(convert_arguments): Add comments.  Pass additional argument to
	the function above.

Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c	(revision 266594)
+++ gcc/c/c-typeck.c	(working copy)
@@ -3186,18 +3186,25 @@ c_build_function_call_vec (location_t loc, vec "
+			"rather than % due to prototype",
+			argnum, rname);
 
-	  if (INTEGRAL_TYPE_P (type)
-	  && TREE_CODE (valtype) == REAL_TYPE)
+	  /* Warn if mismatch between argument and prototype
+	 for decimal float types.  Warn of conversions with
+	 binary float types and of precision narrowing due to
+	 prototype.  */
+	  else if (type != valtype
+		   && (type == dfloat32_type_node
+		   || type == dfloat64_type_node
+		   || type == dfloat128_type_node
+		   || valtype == dfloat32_type_node
+		   || valtype == dfloat64_type_node
+		   || valtype == dfloat128_type_node)
+		   && (formal_prec
+		   <= TYPE_PRECISION (valtype)
+		   || (type == dfloat128_type_node
+			   && (valtype
+			   != dfloat64_type_node
+			   && (valtype
+   != dfloat32_type_node)))
+		   || (type == dfloat64_type_node
+			   && (valtype
+			   != dfloat32_type_node
+	warning_at (ploc, 0,
+			"passing argument %d of %qE as %qT "
+			"rather than %qT due to prototype",
+			argnum, rname, type, valtype);
+
+	}
+  /* Detect integer changing in width or signedness.
+	 These warnings are only activated with
+	 -Wtraditional-conversion, not with -Wtraditional.  */
+  else if (warn_traditional_conversion
+	   && INTEGRAL_TYPE_P (type)
+	   && INTEGRAL_TYPE_P (valtype))
+	{
+	  tree would_have_been = default_conversion (val);
+	  tree type1 = TREE_TYPE (would_have_been);
+
+	  if (val == error_mark_node)
+	/* VAL could have been of incomplete type.  */;
+	  else if (TREE_CODE (type) == ENUMERAL_TYPE
+		   && (TYPE_MAIN_VARIANT (type)
+		   == TYPE_MAIN_VARIANT (valtype)))
+	/* No warning if function asks for enum
+	   and the actual arg is that enum type.  */
+	;
+	  else if (formal_prec != TYPE_PRECISION (type1))
 	warning_at (ploc, OPT_Wtraditional_conversion,
-			"passing argument %d of %qE as integer rather "
-			"than floating due to prototype",
+			"passing argument %d of %qE "
+			"with different width due to prototype",
 			argnum, rname);
-	  if (INTEGRAL_TYPE_P (type)
-	  && TREE_CODE (valtype) == COMPLEX_TYPE)
+	  else if (TYPE_UNSIGNED (type) == TYPE_UNSIGNED (type1))
+	;
+	  /* Don't complain if the formal parameter type
+	 is an enum, because we can't tell now whether
+	 the value was an enum--even the same enum.  */
+	  else if (TREE_CODE (type) == ENUMERAL_TYPE)
+	;
+	  else if (TREE_CODE (val) == INTEGER_CST
+		   && int_fits_type_p (val, type))
+	/* Change in signedness doesn't matter
+	   if a constant value is unaffected.  */
+	;
+	  /* If the value is extended from a narrower
+	 unsigned type, it doesn't matter whether we
+	 pass it as signed or unsigned; the value
+	 certainly is the same either way.  */
+	  else if (TYPE_PRECISION (valtype) < TYPE_PRECISION (type)
+		   && TYPE_UNSIGNED (valtype))
+	;
+	  else if (TYPE_UNSIGNED (type))
 	warning_at (ploc, OPT_Wtraditional_conversion,
-			"passing argument %d of %qE as integer rather "
-			"than complex due to prototype",
+			"passing argument %d of %qE "
+			"as unsigned due to prototype",
 			argnum, rname);
-	  else if (TREE_CODE (type) == COMPLEX_TYPE
-		   && TREE_CODE (valtype) == REAL_TYPE)
+	  else
 	warning_at (ploc, OPT_Wtraditional_conversion,
-			"passing argument %d of %qE as complex rather "
-			"than floating due to prototype",
+			"passing argument %d of %qE "
+			"as signed due to prototype",
 			argnum, rname);
-	  else if (TREE_CODE (type) == REAL_TYPE
-		   && INTEGRAL_TYPE_P (valtype))
-	warning_at (ploc, OPT_Wtraditional_conversion,
-			"passing argument %d of %qE as floating rather "
-			"than integer due to prototype",
-			argnum, rname);
-	  else if (TREE_CODE (type) == COMPLEX_TYPE
-		   && INTEGRAL_TYPE_P (valtype))
-	

[Bug rtl-optimization/88253] New: Inlining of function incorrectly deletes volatile register access when using XOR in avr-gcc

2018-11-28 Thread westfw at westfw dot info
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88253

Bug ID: 88253
   Summary: Inlining of function incorrectly deletes volatile
register access when using XOR in avr-gcc
   Product: gcc
   Version: 5.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: westfw at westfw dot info
  Target Milestone: ---

Created attachment 45116
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45116=edit
.i pre-processed file from --keep-temps

Using avr-gcc (AVR_8_bit_GNU_Toolchain_3.6.1_495) 5.4.0
(bug also occurs in avr-gcc 4.9 and 8.1; does NOT occur in 4.8.1)

A rather trivial C program:

#include 
uint16_t aRead() {
uint8_t h;
uint8_t l;
l = ADCL;
h = ADCH;
return (h<<8) | l;
}

int main() {
volatile uint8_t x;
x = aRead()^42;
}

only reads ADCL when aRead() is inlined from main; the read of (volatile) ADCH
is apparently removed by an optimization step, which breaks things because that
read has side-effects at the hardware level.

 :
   0:   cf 93   pushr28
   2:   df 93   pushr29
   4:   1f 92   pushr1
   6:   cd b7   in  r28, 0x3d   ; 61
   8:   de b7   in  r29, 0x3e   ; 62
   a:   80 91 78 00 lds r24, 0x0078 ; 0x800078 <__SREG__+0x800039>
   e:   9a e2   ldi r25, 0x2A   ; 42
  10:   89 27   eor r24, r25
  12:   89 83   std Y+1, r24; 0x01

This seems to happen during or slightly before the .combine pass of
optimization; the ud_dce output dump shows both volatile accesses, and the
combine has only one.   Does not happen using inclusive-OR, or a plain store.
(yes, I know that the value of ADCH that is read is never actually used.  But
it's volatile, and has side effects, and should be read.)

General compiler command: "avr-gcc -O3 -g adctest.i"
Seems to occur regardless of the reason for inlining - happens with "-Os -fto"
or with -Os and the always_inline attribute specified.

Some discussion at
https://www.avrfreaks.net/forum/avr-gcc-optimization-xor-incorrectly-eliminates-volatile-access

[Bug c++/88103] [7/8/9 Regression] Wrong value category when conditional expression result is used as object expression

2018-11-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88103

--- Comment #3 from Marc Glisse  ---
Possibly unrelated, but PR 67795 is another example where ?: has the wrong
value category depending on where it is used (cast vs return).

>From my comment in https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00234.html ,
we were still not implementing the full standard rules after my patch, I don't
know if that happened since. Still, I am surprised if my patch caused those
errors.

Re: [RS6000] CONST_DOUBLE tidy

2018-11-28 Thread Segher Boessenkool
On Thu, Nov 29, 2018 at 10:18:40AM +1030, Alan Modra wrote:
> On Wed, Nov 28, 2018 at 12:14:55PM -0600, Segher Boessenkool wrote:
> > On Sun, Nov 25, 2018 at 10:46:28PM +1030, Alan Modra wrote:
> > > -  /* If we are using V.4 style PIC, consider all constants to be hard.  
> > > */
> > > -  if (flag_pic && DEFAULT_ABI == ABI_V4)
> > > -return 0;
> > 
> > Did you remove this part by accident?  It isn't mentioned in the changelog.
> 
> It was deliberate.  Leaving it in would have meant we had
> 
>   ...
>   if (flag_pic && DEFAULT_ABI == ABI_V4)
> return 0;
> 
>   return 0;

Oh, ha :-)  But please put it in the changelog.

Thanks,


Segher


Re: write w/o approval policy (Re: [PATCH] clarify comments for implicit_p flag for built-ins)

2018-11-28 Thread Jeff Law
On 11/28/18 11:39 AM, Martin Sebor wrote:
> On 11/28/18 6:35 AM, Richard Biener wrote:
>> On Tue, Nov 27, 2018 at 3:52 AM Martin Sebor  wrote:
>>>
>>> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01759.html
>>>
>>> If there are no objections or suggestions for tweaks I'll commit
>>> this updated comment this week.
>>
>> Please do not commit such changes w/o approval.
> 
> Since you're the second maintainer to ask me that in response
> to a patch to update comments I'd like to get some clarity here.
> 
> I have been assuming that the GCC Write access policy (quoted
> below) lets those of us with write-after-approval make a judgment
> call as to when a change is sufficiently safe to commit:
> 
>   Obvious fixes can be committed without prior approval.  Just
>   check in the fix and copy it to gcc-patches. A good test to
>   determine whether a fix is obvious: "will the person who
>   objects to my work the most be able to find a fault with my
>   fix?"  If the fix is later found to be faulty, it can always
>   be rolled back. We don't want to get overly restrictive about
>   checkin policies.
> 
>   (https://www.gnu.org/software/gcc/svnwrite.html#policies)
> 
> If we are not at liberty to make this judgment call in even
> the most innocuous cases like comments, when does this policy
> actually apply?  (It should be updated to make it clear.)
The thing is I looked at the patch and it was far from obvious what was
going on.  Thus I put it in my queue of things to dig deeper into.  I
haven't done that digging yet.

Comments are actually important.  They often describe what the code is
supposed to do, rationale, historical context, etc.  Just because we're
changing a comment doesn't mean it's inherently trivial/obvious.

I'm generally supportive of lessening friction for developers and I
welcome proposals to do that.

Jeff


Re: [RS6000] CONST_DOUBLE tidy

2018-11-28 Thread Alan Modra
On Wed, Nov 28, 2018 at 12:14:55PM -0600, Segher Boessenkool wrote:
> On Sun, Nov 25, 2018 at 10:46:28PM +1030, Alan Modra wrote:
> > -  /* If we are using V.4 style PIC, consider all constants to be hard.  */
> > -  if (flag_pic && DEFAULT_ABI == ABI_V4)
> > -return 0;
> 
> Did you remove this part by accident?  It isn't mentioned in the changelog.

It was deliberate.  Leaving it in would have meant we had

  ...
  if (flag_pic && DEFAULT_ABI == ABI_V4)
return 0;

  return 0;
}

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFA: a x86 test modification

2018-11-28 Thread Jeff Law
On 11/28/18 2:47 PM, Vladimir Makarov wrote:
>   The patch I committed today recently for
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88207
> 
>   creates a new regression for pr34256.c.  2 moves is expected but gcc
> with the patch generates 3 moves.  I think now RA generates the right code.
> 
> We have the following code before RA
> 
> (insn 7 6 13 2 (set (reg:V2SI 88)
> (plus:V2SI (reg:V2SI 89 [ x ])
>     (mem/c:V2SI (symbol_ref:DI ("y") [flags 0x2]   0x7faad08b5b40 y>) [1 y+0 S8 A64])))"mmintrin.h":306:18 1115 {*mmx_addv2si3}
> 
>  (expr_list:REG_DEAD (reg:V2SI 89 [ x ])
> (nil)))
> (insn 13 7 14 2 (set (reg/i:DI 0 ax)
> (subreg:DI (reg:V2SI 88) 0)) "pr34256.c":11:1 66 {*movdi_internal}
>  (expr_list:REG_DEAD (reg:V2SI 88)
> 
> The test is expected to assign mmx reg to pseudo 88 but gcc with the
> patch assigns memory to it.  The cost of mmx to general reg move is 13,
> while overall cost of mmx to mem and mem to general moves is 10.  So IRA
> now chooses memory for pseudo 88 according to the minimal cost.
> 
> Now, if we want still assign mmx reg to the pseudo 88, we should change
> the costs in machine-dependent x86 code.  But I think it might create
> other unexpected code generation.  As mmx is basically not used nowadays
> the test is not important, I just propose the following patch.
> 
> Is it ok for the trunk?
> 
> 
> 
> 
> pr88207-2.patch
> 
> Index: testsuite/ChangeLog
> ===
> --- testsuite/ChangeLog   (revision 266582)
> +++ testsuite/ChangeLog   (working copy)
> @@ -1,3 +1,7 @@
> +2018-11-28  Vladimir Makarov  
> +
> + * gcc.target/i386/pr34256.c: Adjust the number of expected moves.
OK.
jeff


Re: [PATCH/coding style] clarify pointers and operators

2018-11-28 Thread Martin Sebor

On 11/27/18 10:20 PM, Sandra Loosemore wrote:

On 11/26/18 10:59 AM, Martin Sebor wrote:

Martin suggested we update the Coding Conventions to describe
the expected style for function declarations with a pointer
return types, and for overloaded operators.  Below is the patch.

As an aside, regarding the space convention in casts: a crude
grep search yields about 10,000 instances of the "(type)x" kinds
of casts in GCC sources and 40,000 of the preferred "(type) x"
style with the space.  That's a consistency of only 80%.  Is
it worth documenting a preference for a convention that's so
inconsistently followed?

Martin

Index: htdocs/codingconventions.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/codingconventions.html,v
retrieving revision 1.85
diff -u -r1.85 codingconventions.html
--- htdocs/codingconventions.html    30 Sep 2018 14:38:46 -    1.85
+++ htdocs/codingconventions.html    26 Nov 2018 17:59:21 -
@@ -803,12 +803,17 @@
    - x
  
    cast
-  (foo) x
-  (foo)x
+  (type) x
+  (type)x
  
-  pointer dereference
-  *x
-  * x
+  pointer cast
+  (type *) x
+  (type*)x
+
+
+  pointer return type
+  type *f (void)
+  type* f (void)
  
  

@@ -992,6 +997,21 @@
  Rationale and Discussion
  

+
+Note: in declarations of operator functions or in invocations of
+such functions that involve the keyword operator


This sentence would be more readable with a comma inserted here (after 
operator), I think.



+the full name of the operator should be considered as including
+the keyword with no spaces in between the keyowrd and the operator


s/keyowrd/keyword/


+token.  Thus, the expected format of a declaration of an operator
+is
+    T operator== (const T  const T );
+
+and not for example


s/ for example//


+    T operator == (const T  const T );
+
+(with the space between operator and ==).
+
+


I've made these changes.

Thanks
Martin


[PATCH] Restore aarch64 support for asm ("# %a0" : : "i" (0)) (PR target/87598)

2018-11-28 Thread Jakub Jelinek
Hi!

As mentioned in the PR, aarch64 used to allow VOIDmode CONST_INTs
as %aN operands, but r255230 started ICEing on it and r257907 turned
that ICE into error (output_operand_lossage).

The following patch restores the previous behavior, by allowing such
CONST_INTs through.  They will fail aarch64_classify_address a few lines
later and so aarch64_print_address_internal will return false and either
cause output_operand_lossage there, or if it is aarch64_print_address,
let the generic code handle the constant.

Bootstrapped/regtested on aarch64-linux on GCCFarm, ok for trunk?

2018-11-29  Jakub Jelinek  

PR target/87598
* config/aarch64/aarch64.c (aarch64_print_address_internal): Don't
call output_operand_lossage on VOIDmode CONST_INTs.  After
output_operand_lossage do return false.

* gcc.target/aarch64/asm-5.c: New test.

--- gcc/config/aarch64/aarch64.c.jj 2018-11-26 22:21:24.891607602 +0100
+++ gcc/config/aarch64/aarch64.c2018-11-27 14:16:48.586358824 +0100
@@ -7635,8 +7635,14 @@ aarch64_print_address_internal (FILE *f,
   unsigned int size;
 
   /* Check all addresses are Pmode - including ILP32.  */
-  if (GET_MODE (x) != Pmode)
-output_operand_lossage ("invalid address mode");
+  if (GET_MODE (x) != Pmode
+  && (GET_MODE (x) != VOIDmode
+ || !CONST_INT_P (x)
+ || trunc_int_for_mode (INTVAL (x), Pmode) != INTVAL (x)))
+{
+  output_operand_lossage ("invalid address mode");
+  return false;
+}
 
   if (aarch64_classify_address (, x, mode, true, type))
 switch (addr.type)
--- gcc/testsuite/gcc.target/aarch64/asm-5.c.jj 2018-11-27 14:24:18.774957407 
+0100
+++ gcc/testsuite/gcc.target/aarch64/asm-5.c2018-11-27 14:19:01.290176187 
+0100
@@ -0,0 +1,8 @@
+/* PR target/87598 */
+/* { dg-do compile } */
+
+void
+foo (void)
+{
+  __asm__ ("# %a0" : : "i" (0));
+}

Jakub


Re: [RS6000] "G" and "H" constraints

2018-11-28 Thread Alan Modra
On Wed, Nov 28, 2018 at 11:55:21AM -0600, Segher Boessenkool wrote:
> >  (define_constraint "G"
> > -  "Constant that can be copied into GPR with two insns for DF/DI
> > -   and one for SF."
> > +  "Constant that can be copied into GPR with two insns for DF/DD
> > +   and one for SF/SD."
> >(and (match_code "const_double")
> > (match_test "num_insns_constant (op, mode)
> > == (mode == SFmode ? 1 : 2)")))
> 
> The code does 2 for SDmode, the comment says 1.  Which is correct?

The comment is correct.  I'll fix the code to do
"mode == SFmode || mode == SDmode".

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 1/5][libbacktrace] Factor out backtrace_vector_free

2018-11-28 Thread Ian Lance Taylor via gcc-patches
On Wed, Nov 28, 2018 at 3:15 PM, Tom de Vries  wrote:
>
> this patch factors out new function backtrace_vector_free.
>
> Bootstrapped and reg-tested on x86_64.
>
> OK for trunk?

We should only add new files if we really absolutely must, as this
package is copied around to a lot of places (e.g.,
libsanitizer/libbacktrace) and adding files here requires
modifications in all those places.

If we do have to add a new file, which I don't think we do, it should
not have a leading "backtrace-" in the name, as none of the existing
files have such a prefix.

Ian



> Thanks,
> - Tom
>
> [libbacktrace] Factor out backtrace_vector_free
>
> 2018-11-28  Tom de Vries  
>
> * Makefile.am (libbacktrace_la_SOURCES): Add backtrace-vector.
> * Makefile.in: Regenerate.
> * backtrace-vector.c: New file.
> (backtrace_vector_free): New fuction, factored out of ...
> * dwarf.c (read_line_info): ... here.
> * internal.h (backtrace_vector_free): Declare.
>
> * libbacktrace/Makefile.am (libsanitizer_libbacktrace_la_SOURCES): Add
> backtrace-vector.c.
> * libbacktrace/Makefile.in: Regenerate.
> * libbacktrace/backtrace-rename.h (backtrace_vector_free): Define to
> __asan_backtrace_vector_free.
>
> ---
>  libbacktrace/Makefile.am |  3 +-
>  libbacktrace/Makefile.in |  5 +--
>  libbacktrace/backtrace-vector.c  | 48 
> 
>  libbacktrace/dwarf.c |  4 +--
>  libbacktrace/internal.h  |  7 
>  libsanitizer/libbacktrace/Makefile.am|  1 +
>  libsanitizer/libbacktrace/Makefile.in| 12 ++-
>  libsanitizer/libbacktrace/backtrace-rename.h |  1 +
>  8 files changed, 74 insertions(+), 7 deletions(-)
>
> diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
> index 13e94f27aef..a267984e800 100644
> --- a/libbacktrace/Makefile.am
> +++ b/libbacktrace/Makefile.am
> @@ -47,7 +47,8 @@ libbacktrace_la_SOURCES = \
> posix.c \
> print.c \
> sort.c \
> -   state.c
> +   state.c \
> +   backtrace-vector.c
>
>  BACKTRACE_FILES = \
> backtrace.c \
> diff --git a/libbacktrace/Makefile.in b/libbacktrace/Makefile.in
> index 2d62ce20b9a..cc75c88c348 100644
> --- a/libbacktrace/Makefile.in
> +++ b/libbacktrace/Makefile.in
> @@ -152,7 +152,7 @@ CONFIG_CLEAN_VPATH_FILES =
>  LTLIBRARIES = $(noinst_LTLIBRARIES)
>  am__DEPENDENCIES_1 =
>  am_libbacktrace_la_OBJECTS = atomic.lo dwarf.lo fileline.lo posix.lo \
> -   print.lo sort.lo state.lo
> +   print.lo sort.lo state.lo backtrace-vector.lo
>  libbacktrace_la_OBJECTS = $(am_libbacktrace_la_OBJECTS)
>  AM_V_lt = $(am__v_lt_@AM_V@)
>  am__v_lt_ = $(am__v_lt_@AM_DEFAULT_V@)
> @@ -624,7 +624,8 @@ libbacktrace_la_SOURCES = \
> posix.c \
> print.c \
> sort.c \
> -   state.c
> +   state.c \
> +   backtrace-vector.c
>
>  BACKTRACE_FILES = \
> backtrace.c \
> diff --git a/libbacktrace/backtrace-vector.c b/libbacktrace/backtrace-vector.c
> new file mode 100644
> index 000..11dffd48184
> --- /dev/null
> +++ b/libbacktrace/backtrace-vector.c
> @@ -0,0 +1,48 @@
> +/* backtrace-vector.c -- Utility functions for backtrace_vector.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +
> +Redistribution and use in source and binary forms, with or without
> +modification, are permitted provided that the following conditions are
> +met:
> +
> +(1) Redistributions of source code must retain the above copyright
> +notice, this list of conditions and the following disclaimer.
> +
> +(2) Redistributions in binary form must reproduce the above copyright
> +notice, this list of conditions and the following disclaimer in
> +the documentation and/or other materials provided with the
> +distribution.
> +
> +(3) The name of the author may not be used to
> +endorse or promote products derived from this software without
> +specific prior written permission.
> +
> +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
> +IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
> +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
> +DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
> +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
> +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
> +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> +HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
> +STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
> +IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> +POSSIBILITY OF SUCH DAMAGE.  */
> +
> +#include "config.h"
> +
> +#include 
> +#include 
> +
> +#include "backtrace.h"
> +#include "internal.h"
> +
> +void
> + backtrace_vector_free (struct backtrace_state 

[PATCH 5/5][libbacktrace] Reduce memory usage in build_address_map

2018-11-28 Thread Tom de Vries
Hi,

In build_address_map we allocate a unit, and then look for addresses in the
unit, which we store in the addrs vector, with the elements pointing to the
unit.  However, if we cannot find addresses in the unit, the allocated unit is
not used.

Fix this by detecting if the allocated unit has been used, and reusing it
otherwise.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

[libbacktrace] Reduce memory usage in build_address_map

2018-11-28  Tom de Vries  

* dwarf.c (build_address_map): Reuse unused units.

---
 libbacktrace/dwarf.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index f843fab7529..ff97a20808c 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -1436,9 +1436,11 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
   size_t units_count;
   size_t i;
   struct unit **pu;
+  size_t prev_addrs_count;
 
   memset (>vec, 0, sizeof addrs->vec);
   addrs->count = 0;
+  prev_addrs_count = 0;
 
   /* Read through the .debug_info section.  FIXME: Should we use the
  .debug_aranges section?  gdb and addr2line don't use it, but I'm
@@ -1534,6 +1536,17 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
 
   if (unit_buf.reported_underflow)
goto fail;
+
+  if (addrs->count == prev_addrs_count)
+   {
+ --units_count;
+ units.size -= sizeof (u);
+ units.alc += sizeof (u);
+ free_abbrevs (state, >abbrevs, error_callback, data);
+ backtrace_free (state, u, sizeof *u, error_callback, data);
+   }
+  else
+   prev_addrs_count = addrs->count;
 }
   if (info.reported_underflow)
 goto fail;


[PATCH 4/5][libbacktrace] Simplify memory management in build_address_map

2018-11-28 Thread Tom de Vries
Hi,

In the main loop in build_address_map, we first read the abbrevs into a local
variable abbrevs, and then allocate the corresponding unit, after which we 
assign
the abbrevs to the unit.  This results in dedicated free-upon-failure
handling for the variable, and extra code to make sure that free-upon-failure
doesn't trigger once the unit has taken ownership of the abbrevs.

Simplify this by reversing the order of abbrev reading and unit allocation,
and eliminating the abbrevs local variable.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

[libbacktrace] Simplify memory management in build_address_map

2018-11-28  Tom de Vries  

* dwarf.c (build_address_map): Simplify by removing local variable
abbrevs.

---
 libbacktrace/dwarf.c | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 8a7e94111ac..f843fab7529 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -1432,7 +1432,6 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
   void *data, struct unit_addrs_vector *addrs)
 {
   struct dwarf_buf info;
-  struct abbrevs abbrevs;
   struct backtrace_vector units;
   size_t units_count;
   size_t i;
@@ -1457,7 +1456,6 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
   memset (, 0, sizeof units);
   units_count = 0;
 
-  memset (, 0, sizeof abbrevs);
   while (info.left > 0)
 {
   const unsigned char *unit_data_start;
@@ -1488,13 +1486,6 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
  goto fail;
}
 
-  abbrev_offset = read_offset (_buf, is_dwarf64);
-  if (!read_abbrevs (state, abbrev_offset, dwarf_abbrev, dwarf_abbrev_size,
-is_bigendian, error_callback, data, ))
-   goto fail;
-
-  addrsize = read_byte (_buf);
-
   pu = ((struct unit **)
backtrace_vector_grow (state, sizeof (struct unit *),
   error_callback, data, ));
@@ -1509,6 +1500,14 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
   *pu = u;
   ++units_count;
 
+  memset (>abbrevs, 0, sizeof u->abbrevs);
+  abbrev_offset = read_offset (_buf, is_dwarf64);
+  if (!read_abbrevs (state, abbrev_offset, dwarf_abbrev, dwarf_abbrev_size,
+is_bigendian, error_callback, data, >abbrevs))
+   goto fail;
+
+  addrsize = read_byte (_buf);
+
   u->unit_data = unit_buf.buf;
   u->unit_data_len = unit_buf.left;
   u->unit_data_offset = unit_buf.buf - unit_data_start;
@@ -1519,8 +1518,6 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
   u->comp_dir = NULL;
   u->abs_filename = NULL;
   u->lineoff = 0;
-  u->abbrevs = abbrevs;
-  memset (, 0, sizeof abbrevs);
 
   /* The actual line number mappings will be read as needed.  */
   u->lines = NULL;
@@ -1559,7 +1556,6 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
}
   backtrace_vector_free (state, , error_callback, data);
 }
-  free_abbrevs (state, , error_callback, data);
   if (addrs->count > 0)
 {
   backtrace_vector_free (state, >vec, error_callback, data);


[PATCH 3/5][libbacktrace] Fix memory leak in loop in build_address_map

2018-11-28 Thread Tom de Vries
Hi,

When failing in build_address_map, we free the unit that's currently being
handled in the loop, but the ones that already have been allocated are leaked.

Fix this by keeping track of allocated units in a vector, and releasing them
upon failure.

Also, now that we have a vector of allocated units, move the freeing upon
failure of the abbrevs associated with each unit to build_address_map, and
remove the now redundant call to free_unit_addrs_vector.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

[libbacktrace] Fix memory leak in loop in build_address_map

2018-11-28  Ian Lance Taylor  
Tom de Vries  

PR libbacktrace/88063
* dwarf.c (free_unit_addrs_vector): Remove.
(build_address_map): Keep track of allocated units in vector.  Free
allocated units and corresponding abbrevs upon failure.  Remove now
redundant call to free_unit_addrs_vector.  Free addrs vector upon
failure.  Free allocated unit vector.

---
 libbacktrace/dwarf.c | 60 +---
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index b818911f5b4..8a7e94111ac 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -923,21 +923,6 @@ add_unit_addr (struct backtrace_state *state, uintptr_t 
base_address,
   return 1;
 }
 
-/* Free a unit address vector.  */
-
-static void
-free_unit_addrs_vector (struct backtrace_state *state,
-   struct unit_addrs_vector *vec,
-   backtrace_error_callback error_callback, void *data)
-{
-  struct unit_addrs *addrs;
-  size_t i;
-
-  addrs = (struct unit_addrs *) vec->vec.base;
-  for (i = 0; i < vec->count; ++i)
-free_abbrevs (state, [i].u->abbrevs, error_callback, data);
-}
-
 /* Compare unit_addrs for qsort.  When ranges are nested, make the
smallest one sort last.  */
 
@@ -1448,6 +1433,10 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
 {
   struct dwarf_buf info;
   struct abbrevs abbrevs;
+  struct backtrace_vector units;
+  size_t units_count;
+  size_t i;
+  struct unit **pu;
 
   memset (>vec, 0, sizeof addrs->vec);
   addrs->count = 0;
@@ -1465,6 +1454,9 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
   info.data = data;
   info.reported_underflow = 0;
 
+  memset (, 0, sizeof units);
+  units_count = 0;
+
   memset (, 0, sizeof abbrevs);
   while (info.left > 0)
 {
@@ -1503,10 +1495,20 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
 
   addrsize = read_byte (_buf);
 
+  pu = ((struct unit **)
+   backtrace_vector_grow (state, sizeof (struct unit *),
+  error_callback, data, ));
+  if (pu == NULL)
+ goto fail;
+
   u = ((struct unit *)
   backtrace_alloc (state, sizeof *u, error_callback, data));
   if (u == NULL)
goto fail;
+
+  *pu = u;
+  ++units_count;
+
   u->unit_data = unit_buf.buf;
   u->unit_data_len = unit_buf.left;
   u->unit_data_offset = unit_buf.buf - unit_data_start;
@@ -1531,27 +1533,33 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
dwarf_ranges, dwarf_ranges_size,
is_bigendian, error_callback, data,
u, addrs))
-   {
- free_abbrevs (state, >abbrevs, error_callback, data);
- backtrace_free (state, u, sizeof *u, error_callback, data);
- goto fail;
-   }
+   goto fail;
 
   if (unit_buf.reported_underflow)
-   {
- free_abbrevs (state, >abbrevs, error_callback, data);
- backtrace_free (state, u, sizeof *u, error_callback, data);
- goto fail;
-   }
+   goto fail;
 }
   if (info.reported_underflow)
 goto fail;
 
+  // We only kept the list of units to free them on failure.  On
+  // success the units are retained, pointed to by the entries in
+  // addrs.
+  backtrace_vector_free (state, , error_callback, data);
+
   return 1;
 
  fail:
+  if (units_count > 0)
+{
+  pu = (struct unit **) units.base;
+  for (i = 0; i < units_count; i++)
+   {
+ free_abbrevs (state, [i]->abbrevs, error_callback, data);
+ backtrace_free (state, pu[i], sizeof **pu, error_callback, data);
+   }
+  backtrace_vector_free (state, , error_callback, data);
+}
   free_abbrevs (state, , error_callback, data);
-  free_unit_addrs_vector (state, addrs, error_callback, data);
   if (addrs->count > 0)
 {
   backtrace_vector_free (state, >vec, error_callback, data);


[PATCH 1/5][libbacktrace] Factor out backtrace_vector_free

2018-11-28 Thread Tom de Vries
Hi,

this patch factors out new function backtrace_vector_free.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

[libbacktrace] Factor out backtrace_vector_free

2018-11-28  Tom de Vries  

* Makefile.am (libbacktrace_la_SOURCES): Add backtrace-vector.
* Makefile.in: Regenerate.
* backtrace-vector.c: New file.
(backtrace_vector_free): New fuction, factored out of ...
* dwarf.c (read_line_info): ... here.
* internal.h (backtrace_vector_free): Declare.

* libbacktrace/Makefile.am (libsanitizer_libbacktrace_la_SOURCES): Add
backtrace-vector.c.
* libbacktrace/Makefile.in: Regenerate.
* libbacktrace/backtrace-rename.h (backtrace_vector_free): Define to
__asan_backtrace_vector_free.

---
 libbacktrace/Makefile.am |  3 +-
 libbacktrace/Makefile.in |  5 +--
 libbacktrace/backtrace-vector.c  | 48 
 libbacktrace/dwarf.c |  4 +--
 libbacktrace/internal.h  |  7 
 libsanitizer/libbacktrace/Makefile.am|  1 +
 libsanitizer/libbacktrace/Makefile.in| 12 ++-
 libsanitizer/libbacktrace/backtrace-rename.h |  1 +
 8 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
index 13e94f27aef..a267984e800 100644
--- a/libbacktrace/Makefile.am
+++ b/libbacktrace/Makefile.am
@@ -47,7 +47,8 @@ libbacktrace_la_SOURCES = \
posix.c \
print.c \
sort.c \
-   state.c
+   state.c \
+   backtrace-vector.c
 
 BACKTRACE_FILES = \
backtrace.c \
diff --git a/libbacktrace/Makefile.in b/libbacktrace/Makefile.in
index 2d62ce20b9a..cc75c88c348 100644
--- a/libbacktrace/Makefile.in
+++ b/libbacktrace/Makefile.in
@@ -152,7 +152,7 @@ CONFIG_CLEAN_VPATH_FILES =
 LTLIBRARIES = $(noinst_LTLIBRARIES)
 am__DEPENDENCIES_1 =
 am_libbacktrace_la_OBJECTS = atomic.lo dwarf.lo fileline.lo posix.lo \
-   print.lo sort.lo state.lo
+   print.lo sort.lo state.lo backtrace-vector.lo
 libbacktrace_la_OBJECTS = $(am_libbacktrace_la_OBJECTS)
 AM_V_lt = $(am__v_lt_@AM_V@)
 am__v_lt_ = $(am__v_lt_@AM_DEFAULT_V@)
@@ -624,7 +624,8 @@ libbacktrace_la_SOURCES = \
posix.c \
print.c \
sort.c \
-   state.c
+   state.c \
+   backtrace-vector.c
 
 BACKTRACE_FILES = \
backtrace.c \
diff --git a/libbacktrace/backtrace-vector.c b/libbacktrace/backtrace-vector.c
new file mode 100644
index 000..11dffd48184
--- /dev/null
+++ b/libbacktrace/backtrace-vector.c
@@ -0,0 +1,48 @@
+/* backtrace-vector.c -- Utility functions for backtrace_vector.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+(1) Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+
+(2) Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+
+(3) The name of the author may not be used to
+endorse or promote products derived from this software without
+specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.  */
+
+#include "config.h"
+
+#include 
+#include 
+
+#include "backtrace.h"
+#include "internal.h"
+
+void
+ backtrace_vector_free (struct backtrace_state *state,
+   struct backtrace_vector *vec,
+   backtrace_error_callback error_callback, void *data)
+{
+  vec->alc += vec->size;
+  vec->size = 0;
+  backtrace_vector_release (state, vec, error_callback, data);
+}
diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 4e93f120820..13d0aa4bcd8 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -2057,9 +2057,7 @@ read_line_info (struct backtrace_state *state, struct 
dwarf_data *ddata,
   return 1;
 
  fail:
-  vec.vec.alc += vec.vec.size;
-  vec.vec.size = 0;
-  backtrace_vector_release (state, , error_callback, data);
+  backtrace_vector_free (state, , 

[PATCH 2/5][libbacktrace] Fix memory leak in build_address_map

2018-11-28 Thread Tom de Vries
Hi,

While upon failure in build_address_map we call free_unit_addrs_vector, this
does not actually free the addrs vector, but merely the abbrevs of the units
pointed at by the elements of the addrs vector.

Fix this by adding code to build_address_map to make sure that the addrs vector
is freed upon failure.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

[libbacktrace] Fix memory leak in build_address_map

2018-11-28  Tom de Vries  

* dwarf.c (build_address_map): Free addrs vector upon failure.

---
 libbacktrace/dwarf.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 13d0aa4bcd8..b818911f5b4 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -1552,6 +1552,11 @@ build_address_map (struct backtrace_state *state, 
uintptr_t base_address,
  fail:
   free_abbrevs (state, , error_callback, data);
   free_unit_addrs_vector (state, addrs, error_callback, data);
+  if (addrs->count > 0)
+{
+  backtrace_vector_free (state, >vec, error_callback, data);
+  addrs->count = 0;
+}
   return 0;
 }
 


[Bug c/88065] [9 Regression] ICE in -Wsizeof-pointer-memaccess on an invalid strncpy

2018-11-28 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88065

Martin Sebor  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Martin Sebor  ---
Fixed via r266594.

[Bug c/87297] [9 Regression] ICE on strncpy with an undeclared argument

2018-11-28 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87297

Martin Sebor  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Martin Sebor  ---
Fixed via r266594.

Re: [PATCH] avoid error_mark_node in -Wsizeof-pointer-memaccess (PR 88065)

2018-11-28 Thread Martin Sebor

On 11/21/18 3:04 AM, Jakub Jelinek wrote:

On Tue, Nov 20, 2018 at 12:39:44AM +0100, Jakub Jelinek wrote:

On Mon, Nov 19, 2018 at 04:10:09PM -0700, Jeff Law wrote:

PR c/88065 - ICE in -Wsizeof-pointer-memaccess on an invalid strncpy

gcc/c-family/ChangeLog:

PR c/88065


Please also add
PR c/87297


Done.




* c-warn.c (sizeof_pointer_memaccess_warning): Bail if source
or destination is an error.

gcc/testsuite/ChangeLog:

PR c/88065
* gcc.dg/Wsizeof-pointer-memaccess2.c: New test.

This is probably OK.  But before final ACK, is there a point earlier
where we could/should have bailed out?


IMHO it is a good point, but it should use error_operand_p predicate instead
of == error_mark_node checks to also catch the case where the argument is
not error_mark_node, but has error_mark_node type.  And, the testcase
shouldn't be in gcc.dg, but in c-c++-common and cover also C++ testing.


Testcase proving that error_operand_p is really necessary:

/* PR c/87297 */
/* { dg-do compile } */
/* { dg-options "-Wsizeof-pointer-memaccess" } */
struct S { char a[4]; };

void
foo (struct S *p, const char *s)
{
   struct T x;  /* { dg-error "storage size|incomplete type" } */
   __builtin_strncpy (x, s, sizeof p->a);
}

Works in C, still ICEs in C++ even with the patch you've posted.


Thanks for the test case!  I didn't know about error_operand_p
or that this could happen.  I wonder how many other bugs there
are lurking in there because of it.


819   tree dstsz = TYPE_SIZE_UNIT (TREE_TYPE (d));
debug_tree (d)
  
 used decl_5 VOID huvaa.c:9:12
 align:8 warn_if_not_align:0 context 
 chain >

And, I think it is important to have these tests in c-c++-common, as the
above test shows, it behaves differently between C and C++ (C will present
error_mark_node itself rather than VAR_DECL with error_mark_node type) and
the code in question is just a helper for the FEs.


Given how easy it is to trip over the error type I have to agree.
I made the change to error_operand_p, moved the test and committed
r266594.

Martin


Re: [PATCH] PR libstdc++/67843 set shared_ptr lock policy at build-time

2018-11-28 Thread Hans-Peter Nilsson
> Date: Tue, 27 Nov 2018 23:25:55 +
> From: Jonathan Wakely 
> This resolves a longstanding issue where the lock policy for shared_ptr
> reference counting depends on compilation options when the header is
> included, so that different -march options can cause ABI
> changes.
> [...]

Thank you for pushing through and bringing this to a closure!

brgds, H-P


Re: [PATCH 5/6] [RS6000] Use standard call patterns for __tls_get_addr calls

2018-11-28 Thread Alan Modra
On Wed, Nov 28, 2018 at 07:32:50AM -0600, Segher Boessenkool wrote:
> On Wed, Nov 28, 2018 at 11:37:42AM +1030, Alan Modra wrote:
> > On Tue, Nov 27, 2018 at 10:29:29AM -0600, Segher Boessenkool wrote:
> > > Do you think we should to deprecate -mtls-markers in GCC 9?
> > 
> > Support for the TLS marker relocs was added to binutils in 2009 (git
> > commit 727fc41e077), so yes, the option is not likely to be useful
> > nowadays.
> 
> Well, but do we gain anything if it is eventually deleted?  Simpler code?

Not much.  Unfortunately we still need to keep TARGET_TLS_MARKERS (or
replace throughout with !HAVE_AS_TLS_MARKERS) for AIX.  Deprecate AIX
perhaps?  :-)

-- 
Alan Modra
Australia Development Lab, IBM


[Bug c/87297] [9 Regression] ICE on strncpy with an undeclared argument

2018-11-28 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87297

--- Comment #3 from Martin Sebor  ---
Author: msebor
Date: Wed Nov 28 23:04:09 2018
New Revision: 266594

URL: https://gcc.gnu.org/viewcvs?rev=266594=gcc=rev
Log:
PR c/88065 - ICE in -Wsizeof-pointer-memaccess on an invalid strncpy
PR c/87297 - ICE on strncpy with an undeclared argument

gcc/c-family/ChangeLog:

PR c/88065
PR c/87297
* c-warn.c (sizeof_pointer_memaccess_warning): Bail if source
or destination is an error.

gcc/testsuite/ChangeLog:

PR c/88065
PR c/87297
* c-c++-common/Wsizeof-pointer-memaccess4.c: New test.



Added:
trunk/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess4.c
Modified:
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-warn.c
trunk/gcc/testsuite/ChangeLog

[Bug c/88065] [9 Regression] ICE in -Wsizeof-pointer-memaccess on an invalid strncpy

2018-11-28 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88065

--- Comment #3 from Martin Sebor  ---
Author: msebor
Date: Wed Nov 28 23:04:09 2018
New Revision: 266594

URL: https://gcc.gnu.org/viewcvs?rev=266594=gcc=rev
Log:
PR c/88065 - ICE in -Wsizeof-pointer-memaccess on an invalid strncpy
PR c/87297 - ICE on strncpy with an undeclared argument

gcc/c-family/ChangeLog:

PR c/88065
PR c/87297
* c-warn.c (sizeof_pointer_memaccess_warning): Bail if source
or destination is an error.

gcc/testsuite/ChangeLog:

PR c/88065
PR c/87297
* c-c++-common/Wsizeof-pointer-memaccess4.c: New test.



Added:
trunk/gcc/testsuite/c-c++-common/Wsizeof-pointer-memaccess4.c
Modified:
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-warn.c
trunk/gcc/testsuite/ChangeLog

Re: [PATCH] Fix PR68212, unrolled loop no longer aligned

2018-11-28 Thread Jan Hubicka
> 2018-11-28  Pat Haugen  
> 
> PR rtl-optimization/68212
> * cfgloopmanip.c (duplicate_loop_to_header_edge): Adjust scale factor.
> 
> testsuite/ChangeLog:
> 2018-11-28  Pat Haugen  
> 
> PR rtl-optimization/68212
> * gcc.dg/pr68212.c: New test.
> 
> 
> 
> Index: gcc/cfgloopmanip.c
> ===
> --- gcc/cfgloopmanip.c(revision 266522)
> +++ gcc/cfgloopmanip.c(working copy)
> @@ -1242,16 +1242,25 @@ duplicate_loop_to_header_edge (struct lo
> profile_probability prob_pass_main = bitmap_bit_p (wont_exit, 0)
>   ? prob_pass_wont_exit
>   : prob_pass_thru;
> -   profile_probability p = prob_pass_main;
> -   profile_count scale_main_den = count_in;
> -   for (i = 0; i < ndupl; i++)
> +
> +   /* If not using real profile data then don't scale the loop by ndupl.
> +  This can lead to the loop no longer looking hot wrt surrounding
> +  code.  */
> +   if (profile_status_for_fn (cfun) == PROFILE_GUESSED)
> + scale_main = profile_probability::likely ();

profile_status_for_fn is not very informative when you inline FDO to
non-FDO code (via LTO or when profile was lost for COMDAT).
You want to test count_in.reliable_p () which will return true for FDO
profiles but not for guessed profiles and Auto-FDO.

If we know number of iterations at compile time (which is relatively
frequent for unrolled loops) we actually put in correct profile with a
cap given by --param max-predicted-iterations (which may make sense to
get higher).  In that case we probably do not want to drop this
information from profile.

Can't we get to scale only if the profile looks wrong - count_in
gets unrealistically small compared to cfgloop expected number of
iterations info says (or when it is absent)?

Honza


Re: [PATCH v3] Make function clone name numbering independent.

2018-11-28 Thread Segher Boessenkool
Hi!

On Wed, Nov 28, 2018 at 04:09:14PM -0500, Michael Ploujnikov wrote:
> I've also included a small change to rs6000 which I'm pretty sure is
> safe, but I have no way of testing.

Do you have an account on the GCC Compile Farm?
https://cfarm.tetaneutral.net/
There are some nice Power machines in there!

Does this patch dependent on the rest of the series?

If it tests okay, it is okay for trunk of course.  Thanks!

One comment about your changelog:

> 2018-11-28  Michael Ploujnikov  
> 
>   * config/rs6000/rs6000.c (make_resolver_func): Generate
> resolver symbol with clone_function_name instead of
> clone_function_name_numbered.

Those last two lines should not start with the spaces.  It should be:

2018-11-28  Michael Ploujnikov  

* config/rs6000/rs6000.c (make_resolver_func): Generate
resolver symbol with clone_function_name instead of
clone_function_name_numbered.


Segher


Re: [PATCH, V2, d] Fix IdentityExp comparison for complex floats

2018-11-28 Thread Iain Buclaw
On Wed, 28 Nov 2018 at 22:32, Johannes Pfau  wrote:
>
> Next version, addresses the review comments.
>
> Tested at https://github.com/D-Programming-GDC/GDC/pull/768
> ---
> gcc/d/ChangeLog:
>
> 2018-11-28  Johannes Pfau  
>
> * expr.cc (ExprVisitor::visit(IdentityExp)): Add support for complex 
> types.
> (build_float_identity): New function.
>
> gcc/testsuite/ChangeLog:
>
> 2018-11-28  Johannes Pfau  
>
> * gdc.dg/runnable.d: Test IdentityExp for complex types.
>
>
>  gcc/d/expr.cc   | 40 -
>  gcc/testsuite/gdc.dg/runnable.d | 22 ++
>  2 files changed, 51 insertions(+), 11 deletions(-)
>

As I've said before, looks reasonable to me.  Thanks.

-- 
Iain


Re: [testsuite] Require ucn support in gdc.test/compilable/ddoc12.d (PR d/88039)

2018-11-28 Thread Iain Buclaw
On Tue, 27 Nov 2018 at 20:32, Rainer Orth  wrote:
>
> Hi Mike,
>
> > On Nov 27, 2018, at 2:18 AM, Rainer Orth  
> > wrote:
> >>
> >> Some assemblers, including the Solaris one, don't support UTF-8
> >> identifiers, which breaks the gdc.test/compilable/ddoc12.d testcase as
> >> reported in the PR.
> >
> > So, another style of fix, would be to change the binding from the language
> > front-end to encode unsupported symbols using a fixed, documented, abi
> > defined technique.
>
> there's been some discussion on this in the PR.  Joseph's suggestion was
> to follow the system compilers if this were done, and indeed they do
> encode UTF-8 identifiers in some way which could either be
> reverse-engineered or a spec obtained.  However, given Iain's stance
> towards UTF-8 identifiers in D, I very much doubt this is worth the
> effort.  Ultimately, it's his call, of course.
>

Not only my stance, but as a whole just how those maintaining the core
language generally agree on. Encoding UTF8 characters in symbols is
not part of the D ABI, so that is something that needs convincing
upstream.

There is a third way however, all compilable/ddoc* tests don't
actually require us to compile the module all the way down to object
code, the only thing that really needs to be tested is the Ddoc
generator itself.  Which would be setting 'dg-do-what compile' and
build with the compiler option -fdoc, then dg-final checks for the
presence of the file ddoc12.html is the minimum that needs to be done
for these tests to be considered passed.

I'll have a look into doing it that way tomorrow.

--
Iain


Re: Gimpel lowering question.

2018-11-28 Thread Andrew Pinski
On Wed, Nov 28, 2018 at 9:47 AM Michael Eager  wrote:
>
> On 11/28/18 09:10, Jeff Law wrote:
> > On 11/28/18 10:00 AM, Michael Eager wrote:
> >> I have a small test case which generates poor quality code on my target.
> >> Here is the original:
> >>
> >>if (cond1 == 2048 || cond2 == 8)
> >>  {
> >>x = x + y;
> >>  }
> >>return x;
> >>
> >> This ends up generating a series of instructions to compute a flag with
> >> the result of the condition followed by a single compare with zero and
> >> a jump.  Better code would be two compares and two jumps.
> >>
> >> The gimple is
> >>
> >>_1 = cond1 == 2048;
> >>_2 = cond2 == 8;
> >>_3 = _1 | _2;
> >>if (_3 != 0) goto ; else goto ;
> >> ...
> >>
> >> so this poor code sequence is essentially a correct translation.
> >>
> >> On MIPS, for the same test case, the gimple is different:
> >>
> >>if (cond1 == 2048) goto ; else goto ;
> >>:
> >>if (cond2 == 8) goto ; else goto ;
> >>:
> >> ...
> >>
> >> which generates the better code sequence.
> >>
> >> Can someone give me a pointer where to find where the lowering
> >> pass decides to break up a condition into multiple tests?  Is
> >> there a target configuration option which I have overlooked or
> >> maybe set incorrectly?
> > BRANCH_COST, which comes into play during generation of the initial
> > trees as well in passes which try to optimize branchy code into
> > straighter code.
>
> Thanks.  I did look at BRANCH_COST, and played with the values, but it
> didn't seem to have any affect.  I'll take a closer look.

Look at LOGICAL_OP_NON_SHORT_CIRCUIT .  By defualt, it is defined as:
  (BRANCH_COST (optimize_function_for_speed_p (cfun), \
false) >= 2)
But MIPS defines it as 0.

Changing  LOGICAL_OP_NON_SHORT_CIRCUIT to 0 will disable this optimization.

LOGICAL_OP_NON_SHORT_CIRCUIT controls both places where (cond1 == 2048
|| cond2 == 8) would remove the branch.

NOTE I think MIPS defines LOGICAL_OP_NON_SHORT_CIRCUIT incorrectly but
that is a different story.

Thanks,
Andrew Pinski


>
> --
> Michael Eagerea...@eagerm.com
> 1960 Park Blvd., Palo Alto, CA 94306


Re: [RS6000] PR11848 rs6000_emit_move long double split

2018-11-28 Thread Segher Boessenkool
On Sun, Nov 25, 2018 at 10:52:08PM +1030, Alan Modra wrote:
> This split is disabled for power7 and up, so we don't often see its
> bad effects.  However, on a powerpc-linux compiler (which defaults
> to PPC750 judging from rs6000/sysv4.h) we see
> 
>   long double ld1 (void) { return 1.0L; }
> 
> compiled with -msoft-float -O2 -S resulting in
> 
>   ld1:
>   li 3,0
>   li 4,0
>   mr 6,4
>   mr 5,3
>   li 4,0
>   lis 3,0x3ff0
>   blr
> 
> Things go awry in init-regs, with the TFmode reg being initialized to
> zero on seeing the subreg from the split.  (And that initialization
> itself is split by rs6000_emit_move!)  Later passes apparently don't
> clean up the rubbish.
> 
> Since the split was added for Darwin (as the comment says), let's get
> rid of it on other targets.
> 
> Bootstrapped etc. powerpc64le-linux and powerpc64-linux.
> 
>   * config/rs6000/rs6000.c (rs6000_emit_move): Disable long
>   double split for targets other than Darwin.

Okay for trunk, thanks!


Segher


Re: C++ PATCH to implement C++20 P0634R3, Down with typename!

2018-11-28 Thread Jason Merrill

On 11/12/18 10:27 AM, Marek Polacek wrote:

This patch implements C++20 P0634R3, Down with typename!

which makes 'typename' optional in several contexts specified in [temp.res].

The gist of the patch is in cp_parser_simple_type_specifier, where, if the
context makes typename optional and the id is qualified, we pretend we've
seen the typename keyword.


Sounds good.


There's quite a lot of churn because we need to be careful where we want
to make typename optional, and e.g. a flag in cp_parser would be too global.


Did you consider adding a cp_parser_flags parameter rather than a 
specific bool?  I don't feel strongly about it, but it would simplify 
things the next time we want to do something like this.



The resolve_typename_type hunk was to make typename9.C work with -fconcepts.


Makes sense.


+   const bool typename_optional_p = (cxx_dialect >= cxx2a);


I'm uncomfortable with repeating this in lots of places in the parser; 
I'd prefer to have one place where we control whether the feature is 
enabled or not.  This seems like the right place:



+ bool typename_p = (flags & CP_PARSER_FLAGS_TYPENAME_OPTIONAL);
+ type = cp_parser_type_name (parser, (qualified_p && typename_p));



I'm not sure about some of the bits in typename5.C, not quite sure if the
code is valid, but I didn't have time to investigate deeply and it seems
pretty obscure anyway.  There are preexisting cases when g++ and clang++
disagree.



+  // ??? Not sure if these are (in)valid.


I think these are all valid, since the qualified-ids all appear as a 
decl-specifier of a member-declaration.



+  typedef typename A::template N a1;
+  typedef typename A::template N a2;
+  typename A::template N a3;
+  typename A::template N a4;
+  typedef A::N a5; // { dg-error "not a type|invalid" }
+  typedef A::N a6; // { dg-error "not a type|invalid" }


The ::template shouldn't be required here anymore: [temp.names] "In a 
qualified-id used as the name in a typename-specifier (12.7), 
elaborated-type-specifier (9.1.7.3), using-declaration (9.8), or 
class-or-decltype (10.6), an optional keyword template appearing at the 
top level is ignored. In these contexts, a < token is always assumed to 
introduce a template-argument-list."


This is C++17 DR 1710.  You don't need to implement it in this patch, 
but let's not test for the wrong behavior.  :)


Jason


[Bug fortran/87937] [8 Regression] LHS reallocation broken inside "select type" and "associate"

2018-11-28 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87937

Dominique d'Humieres  changed:

   What|Removed |Added

  Known to work||8.2.0, 9.0
Summary|[8/9 Regression] LHS|[8 Regression] LHS
   |reallocation broken inside  |reallocation broken inside
   |"select type" and   |"select type" and
   |"associate" |"associate"
  Known to fail|8.2.0   |8.2.1

--- Comment #14 from Dominique d'Humieres  ---
> Backporting r264725 on top of the current gcc-8 branch fixes the issue for me.

I changed the summary that the problem is fixed on trunk.

[Bug d/87816] D runtime fails to build on aarch64

2018-11-28 Thread ibuclaw at gdcproject dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87816

Iain Buclaw  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Iain Buclaw  ---
Last library fixes were merged in r266572.

However it is still required to enable always building libphobos when
--enable-languages=d on aarch64-linux.

Re: [RS6000] movdi_internal64 insn lengths

2018-11-28 Thread Segher Boessenkool
On Sun, Nov 25, 2018 at 10:47:53PM +1030, Alan Modra wrote:
> Fixes alternatives wi<-Oj (xxlxor), wi<-wM (xxlorc) and wv<-wS.
> Bootstrapped etc. powerpc64le-linux.

Okay for trunk, thanks!  (For the record, wv<-wS is xxsltib+extend).


Segher


>   * config/rs6000/rs6000.md (movdi_internal64): Correct lengths.


[PATCH, libphobos] Committed merge common version blocks using arch_any conditions

2018-11-28 Thread Iain Buclaw
Hi,

This patch is backported from druntime master.  It sets a precedence
in upstream for merging architecture agnostic C bindings into one
block, rather than separate duplicated blocks.

A nice side-effect is it almost completes the C bindings for
s390-linux-gnu and sparc-linux-gnu, and fixes a bug on MIPS32 where
O_SYNC had the wrong value.

Boostrapped and ran D2 testsuite on x86_64-linux-gnu and aarch64-linux-gnu.

Committed to trunk as r266593.

-- 
Iain
---
diff --git a/libphobos/libdruntime/core/stdc/errno.d b/libphobos/libdruntime/core/stdc/errno.d
index 8f9b64a2dce..eaf4867a942 100644
--- a/libphobos/libdruntime/core/stdc/errno.d
+++ b/libphobos/libdruntime/core/stdc/errno.d
@@ -23,6 +23,21 @@ else version (TVOS)
 else version (WatchOS)
 version = Darwin;
 
+version (ARM) version = ARM_Any;
+version (AArch64) version = ARM_Any;
+version (MIPS32)  version = MIPS_Any;
+version (MIPS64)  version = MIPS_Any;
+version (PPC) version = PPC_Any;
+version (PPC64)   version = PPC_Any;
+version (RISCV32) version = RISCV_Any;
+version (RISCV64) version = RISCV_Any;
+version (S390)version = IBMZ_Any;
+version (SPARC)   version = SPARC_Any;
+version (SPARC64) version = SPARC_Any;
+version (SystemZ) version = IBMZ_Any;
+version (X86) version = X86_Any;
+version (X86_64)  version = X86_Any;
+
 @trusted: // Only manipulates errno.
 nothrow:
 @nogc:
@@ -200,213 +215,7 @@ else version (linux)
 enum EDOM   = 33; ///
 enum ERANGE = 34; ///
 
-version (X86)
-{
-enum EDEADLK= 35; ///
-enum ENAMETOOLONG   = 36; ///
-enum ENOLCK = 37; ///
-enum ENOSYS = 38; ///
-enum ENOTEMPTY  = 39; ///
-enum ELOOP  = 40; ///
-enum EWOULDBLOCK= EAGAIN; ///
-enum ENOMSG = 42; ///
-enum EIDRM  = 43; ///
-enum ECHRNG = 44; ///
-enum EL2NSYNC   = 45; ///
-enum EL3HLT = 46; ///
-enum EL3RST = 47; ///
-enum ELNRNG = 48; ///
-enum EUNATCH= 49; ///
-enum ENOCSI = 50; ///
-enum EL2HLT = 51; ///
-enum EBADE  = 52; ///
-enum EBADR  = 53; ///
-enum EXFULL = 54; ///
-enum ENOANO = 55; ///
-enum EBADRQC= 56; ///
-enum EBADSLT= 57; ///
-enum EDEADLOCK  = EDEADLK;///
-enum EBFONT = 59; ///
-enum ENOSTR = 60; ///
-enum ENODATA= 61; ///
-enum ETIME  = 62; ///
-enum ENOSR  = 63; ///
-enum ENONET = 64; ///
-enum ENOPKG = 65; ///
-enum EREMOTE= 66; ///
-enum ENOLINK= 67; ///
-enum EADV   = 68; ///
-enum ESRMNT = 69; ///
-enum ECOMM  = 70; ///
-enum EPROTO = 71; ///
-enum EMULTIHOP  = 72; ///
-enum EDOTDOT= 73; ///
-enum EBADMSG= 74; ///
-enum EOVERFLOW  = 75; ///
-enum ENOTUNIQ   = 76; ///
-enum EBADFD = 77; ///
-enum EREMCHG= 78; ///
-enum ELIBACC= 79; ///
-enum ELIBBAD= 80; ///
-enum ELIBSCN= 81; ///
-enum ELIBMAX= 82; ///
-enum ELIBEXEC   = 83; ///
-enum EILSEQ = 84; ///
-enum ERESTART   = 85; ///
-enum ESTRPIPE   = 86; ///
-enum EUSERS = 87; ///
-enum ENOTSOCK   = 88; ///
-enum EDESTADDRREQ   = 89; ///
-enum EMSGSIZE   = 90; ///
-enum EPROTOTYPE = 91; ///
-enum ENOPROTOOPT= 92; ///
-enum EPROTONOSUPPORT= 93; ///
-enum ESOCKTNOSUPPORT= 94; ///
-enum EOPNOTSUPP = 95; ///
-enum ENOTSUP= EOPNOTSUPP; ///
-enum EPFNOSUPPORT   = 96; ///
-enum EAFNOSUPPORT   = 97; ///
-enum EADDRINUSE = 98; ///
-enum EADDRNOTAVAIL  = 99; ///
-enum ENETDOWN   = 100;///
-enum ENETUNREACH= 101;

RFA: a x86 test modification

2018-11-28 Thread Vladimir Makarov

  The patch I committed today recently for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88207

  creates a new regression for pr34256.c.  2 moves is expected but gcc 
with the patch generates 3 moves.  I think now RA generates the right code.


We have the following code before RA

(insn 7 6 13 2 (set (reg:V2SI 88)
    (plus:V2SI (reg:V2SI 89 [ x ])
    (mem/c:V2SI (symbol_ref:DI ("y") [flags 0x2]  0x7faad08b5b40 y>) [1 y+0 S8 A64])))"mmintrin.h":306:18 1115 {*mmx_addv2si3}

 (expr_list:REG_DEAD (reg:V2SI 89 [ x ])
    (nil)))
(insn 13 7 14 2 (set (reg/i:DI 0 ax)
    (subreg:DI (reg:V2SI 88) 0)) "pr34256.c":11:1 66 {*movdi_internal}
 (expr_list:REG_DEAD (reg:V2SI 88)

The test is expected to assign mmx reg to pseudo 88 but gcc with the 
patch assigns memory to it.  The cost of mmx to general reg move is 13, 
while overall cost of mmx to mem and mem to general moves is 10.  So IRA 
now chooses memory for pseudo 88 according to the minimal cost.


Now, if we want still assign mmx reg to the pseudo 88, we should change 
the costs in machine-dependent x86 code.  But I think it might create 
other unexpected code generation.  As mmx is basically not used nowadays 
the test is not important, I just propose the following patch.


Is it ok for the trunk?



Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 266582)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2018-11-28  Vladimir Makarov  
+
+	* gcc.target/i386/pr34256.c: Adjust the number of expected moves.
+
 2018-11-28  Marek Polacek  
 
 	PR c++/88222 - ICE with bit-field with invalid type.
Index: testsuite/gcc.target/i386/pr34256.c
===
--- testsuite/gcc.target/i386/pr34256.c	(revision 266155)
+++ testsuite/gcc.target/i386/pr34256.c	(working copy)
@@ -10,5 +10,5 @@ unsigned long long  foo(__m64 m) {
   return _mm_cvtm64_si64(_mm_add_pi32(x, y));
 }
 
-/* { dg-final { scan-assembler-times "mov" 2 { target { nonpic || pie_enabled } } } } */
-/* { dg-final { scan-assembler-times "mov" 4 { target { { ! nonpic } && { ! pie_enabled } } } } } */
+/* { dg-final { scan-assembler-times "mov" 3 { target { nonpic || pie_enabled } } } } */
+/* { dg-final { scan-assembler-times "mov" 5 { target { { ! nonpic } && { ! pie_enabled } } } } } */


[PATCH 1/2, d] Fix hashing of complex reals

2018-11-28 Thread Johannes Pfau
Hashing of complex types where the floating point type used
for the real and imaginary parts has padding (such as X86 80 bit reals)
has padding, is currently broken in druntime.

Fixed by backporting https://github.com/dlang/druntime/pull/2356
from druntime commit 29ce0543cb62229f005b2bc8540416dbccd1130e

Tested at https://github.com/D-Programming-GDC/GDC/pull/768

--
Johannes

---
libphobos/ChangeLog:

2018-11-28  Johannes Pfau  

* libdruntime/core/internal/convert.d: Backport from latest druntime.
* libdruntime/core/internal/hash.d: Likewise.
* libdruntime/core/internal/traits.d: Likewise.
* libdruntime/rt/util/typeinfo.d: Likewise.

 libphobos/libdruntime/core/internal/convert.d |  136 ++-
 libphobos/libdruntime/core/internal/hash.d| 1044 +++--
 libphobos/libdruntime/core/internal/traits.d  |   19 +
 libphobos/libdruntime/rt/util/typeinfo.d  |   33 +-
 4 files changed, 815 insertions(+), 417 deletions(-)

diff --git a/libphobos/libdruntime/core/internal/convert.d 
b/libphobos/libdruntime/core/internal/convert.d
index c4745b6f5a7..76d6f013b9d 100644
--- a/libphobos/libdruntime/core/internal/convert.d
+++ b/libphobos/libdruntime/core/internal/convert.d
@@ -3,7 +3,7 @@
  * This module provides functions to converting different values to 
const(ubyte)[]
  *
  * Copyright: Copyright Igor Stepanov 2013-2013.
- * License:   $(WEB www.boost.org/LICENSE_1_0.txt, Boost License 1.0).
+ * License:   $(HTTP www.boost.org/LICENSE_1_0.txt, Boost License 1.0).
  * Authors:   Igor Stepanov
  * Source: $(DRUNTIMESRC core/internal/_convert.d)
  */
@@ -33,7 +33,7 @@ private ubyte[] ctfe_alloc()(size_t n)
 }
 }
 
-@trusted pure nothrow
+@trusted pure nothrow @nogc
 const(ubyte)[] toUbyte(T)(const ref T val) if (is(Unqual!T == float) || 
is(Unqual!T == double) || is(Unqual!T == real) ||
 is(Unqual!T == ifloat) || is(Unqual!T 
== idouble) || is(Unqual!T == ireal))
 {
@@ -72,7 +72,7 @@ const(ubyte)[] toUbyte(T)(const ref T val) if (is(Unqual!T == 
float) || is(Unqua
 ulong mantissa2 = parsed.mantissa2;
 off_bytes--; // go back one, since mantissa only stored data in 56
  // bits, ie 7 bytes
-for(; off_bytes < FloatTraits!T.MANTISSA/8; ++off_bytes)
+for (; off_bytes < FloatTraits!T.MANTISSA/8; ++off_bytes)
 {
 buff[off_bytes] = cast(ubyte)mantissa2;
 mantissa2 >>= 8;
@@ -114,13 +114,13 @@ const(ubyte)[] toUbyte(T)(const ref T val) if 
(is(Unqual!T == float) || is(Unqua
 }
 }
 
-@safe pure nothrow
+@safe pure nothrow @nogc
 private Float parse(bool is_denormalized = false, T)(T x) if (is(Unqual!T == 
ifloat) || is(Unqual!T == idouble) || is(Unqual!T == ireal))
 {
 return parse(x.im);
 }
 
-@safe pure nothrow
+@safe pure nothrow @nogc
 private Float parse(bool is_denormalized = false, T:real)(T x_) if 
(floatFormat!T != FloatFormat.Real80)
 {
 Unqual!T x = x_;
@@ -178,7 +178,7 @@ private Float parse(bool is_denormalized = false, T:real)(T 
x_) if (floatFormat!
 }
 }
 
-@safe pure nothrow
+@safe pure nothrow @nogc
 private Float parse(bool _ = false, T:real)(T x_) if (floatFormat!T == 
FloatFormat.Real80)
 {
 Unqual!T x = x_;
@@ -232,6 +232,7 @@ private struct Float
 
 private template FloatTraits(T) if (floatFormat!T == FloatFormat.Float)
 {
+enum DATASIZE = 4;
 enum EXPONENT = 8;
 enum MANTISSA = 23;
 enum ZERO = Float(0, 0, 0);
@@ -244,6 +245,7 @@ private template FloatTraits(T) if (floatFormat!T == 
FloatFormat.Float)
 
 private template FloatTraits(T) if (floatFormat!T == FloatFormat.Double)
 {
+enum DATASIZE = 8;
 enum EXPONENT = 11;
 enum MANTISSA = 52;
 enum ZERO = Float(0, 0, 0);
@@ -256,6 +258,7 @@ private template FloatTraits(T) if (floatFormat!T == 
FloatFormat.Double)
 
 private template FloatTraits(T) if (floatFormat!T == FloatFormat.Real80)
 {
+enum DATASIZE = 10;
 enum EXPONENT = 15;
 enum MANTISSA = 64;
 enum ZERO = Float(0, 0, 0);
@@ -268,6 +271,7 @@ private template FloatTraits(T) if (floatFormat!T == 
FloatFormat.Real80)
 
 private template FloatTraits(T) if (floatFormat!T == FloatFormat.DoubleDouble) 
//Unsupported in CTFE
 {
+enum DATASIZE = 16;
 enum EXPONENT = 11;
 enum MANTISSA = 106;
 enum ZERO = Float(0, 0, 0);
@@ -280,6 +284,7 @@ private template FloatTraits(T) if (floatFormat!T == 
FloatFormat.DoubleDouble) /
 
 private template FloatTraits(T) if (floatFormat!T == FloatFormat.Quadruple)
 {
+enum DATASIZE = 16;
 enum EXPONENT = 15;
 enum MANTISSA = 112;
 enum ZERO = Float(0, 0, 0);
@@ -291,10 +296,10 @@ private template FloatTraits(T) if (floatFormat!T == 
FloatFormat.Quadruple)
 }
 
 
-@safe pure nothrow
+@safe pure nothrow @nogc
 private real binPow2(int pow)
 {
-static real binPosPow2(int pow) @safe pure nothrow
+static real binPosPow2(int pow) @safe pure nothrow 

[PATCH 2/2, d] Enable tests for rt.util.typeinfo and core.internal.convert

2018-11-28 Thread Johannes Pfau
With these backports, these tests now pass for GDC and we don't
need the special cases in the Makefiles anymore.

--
Johannes

---
libphobos/ChangeLog:

2018-11-28  Johannes Pfau  

* libdruntime/Makefile.am: Test rt.util.typeinfo and 
core.internal.convert.
* libdruntime/Makefile.in: Rebuild.


 libphobos/libdruntime/Makefile.am | 12 ++--
 libphobos/libdruntime/Makefile.in | 12 ++--
 2 files changed, 4 insertions(+), 20 deletions(-)

diff --git a/libphobos/libdruntime/Makefile.am 
b/libphobos/libdruntime/Makefile.am
index e14cc031eea..e8a4691ebd2 100644
--- a/libphobos/libdruntime/Makefile.am
+++ b/libphobos/libdruntime/Makefile.am
@@ -79,16 +79,8 @@ ALL_DRUNTIME_COMPILE_DSOURCES += 
$(DRUNTIME_DSOURCES_GENERATED)
 
 ALL_DRUNTIME_SOURCES = $(ALL_DRUNTIME_COMPILE_DSOURCES) $(DRUNTIME_CSOURCES) \
$(DRUNTIME_SSOURCES)
-REAL_DRUNTIME_TEST_LOBJECTS = $(ALL_DRUNTIME_COMPILE_DSOURCES:.d=.t.lo)
-REAL_DRUNTIME_TEST_OBJECTS  = $(ALL_DRUNTIME_COMPILE_DSOURCES:.d=.t.o)
-# Workaround issue #
-DRUNTIME_TEST_OBJECTS  = $(filter-out rt/util/typeinfo.t.o \
-   core/internal/convert.t.o, $(REAL_DRUNTIME_TEST_OBJECTS)) \
-   rt/util/typeinfo.o core/internal/convert.o
-
-DRUNTIME_TEST_LOBJECTS  = $(filter-out rt/util/typeinfo.t.lo \
-   core/internal/convert.t.lo, $(REAL_DRUNTIME_TEST_LOBJECTS)) \
-   rt/util/typeinfo.lo core/internal/convert.lo
+DRUNTIME_TEST_LOBJECTS = $(ALL_DRUNTIME_COMPILE_DSOURCES:.d=.t.lo)
+DRUNTIME_TEST_OBJECTS  = $(ALL_DRUNTIME_COMPILE_DSOURCES:.d=.t.o)
 
 
 # Main library build definitions
diff --git a/libphobos/libdruntime/Makefile.in 
b/libphobos/libdruntime/Makefile.in
index d07a8f98099..c0ca4385a21 100644
--- a/libphobos/libdruntime/Makefile.in
+++ b/libphobos/libdruntime/Makefile.in
@@ -744,16 +744,8 @@ ALL_DRUNTIME_COMPILE_DSOURCES = $(DRUNTIME_DSOURCES) 
$(am__append_1) \
 ALL_DRUNTIME_SOURCES = $(ALL_DRUNTIME_COMPILE_DSOURCES) $(DRUNTIME_CSOURCES) \
$(DRUNTIME_SSOURCES)
 
-REAL_DRUNTIME_TEST_LOBJECTS = $(ALL_DRUNTIME_COMPILE_DSOURCES:.d=.t.lo)
-REAL_DRUNTIME_TEST_OBJECTS = $(ALL_DRUNTIME_COMPILE_DSOURCES:.d=.t.o)
-# Workaround issue #
-DRUNTIME_TEST_OBJECTS = $(filter-out rt/util/typeinfo.t.o \
-   core/internal/convert.t.o, $(REAL_DRUNTIME_TEST_OBJECTS)) \
-   rt/util/typeinfo.o core/internal/convert.o
-
-DRUNTIME_TEST_LOBJECTS = $(filter-out rt/util/typeinfo.t.lo \
-   core/internal/convert.t.lo, $(REAL_DRUNTIME_TEST_LOBJECTS)) \
-   rt/util/typeinfo.lo core/internal/convert.lo
+DRUNTIME_TEST_LOBJECTS = $(ALL_DRUNTIME_COMPILE_DSOURCES:.d=.t.lo)
+DRUNTIME_TEST_OBJECTS = $(ALL_DRUNTIME_COMPILE_DSOURCES:.d=.t.o)
 
 @ENABLE_SHARED_TRUE@check_LTLIBRARIES = libgdruntime_t.la
 toolexeclib_LTLIBRARIES = libgdruntime.la
-- 
2.19.1



[PATCH, V2, d] Fix IdentityExp comparison for complex floats

2018-11-28 Thread Johannes Pfau
Next version, addresses the review comments.

Tested at https://github.com/D-Programming-GDC/GDC/pull/768
---
gcc/d/ChangeLog:

2018-11-28  Johannes Pfau  

* expr.cc (ExprVisitor::visit(IdentityExp)): Add support for complex 
types.
(build_float_identity): New function. 

gcc/testsuite/ChangeLog:

2018-11-28  Johannes Pfau  

* gdc.dg/runnable.d: Test IdentityExp for complex types.


 gcc/d/expr.cc   | 40 -
 gcc/testsuite/gdc.dg/runnable.d | 22 ++
 2 files changed, 51 insertions(+), 11 deletions(-)

diff --git a/gcc/d/expr.cc b/gcc/d/expr.cc
index 9a1aad42ddc..91cb02f1e9a 100644
--- a/gcc/d/expr.cc
+++ b/gcc/d/expr.cc
@@ -42,6 +42,20 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "d-tree.h"
 
+/* Helper function for floating point identity comparison. Compare
+   only well-defined bits, ignore padding (e.g. for X86 80bit real).  */
+
+static tree build_float_identity (tree_code code, tree t1, tree t2)
+{
+  /* For floating-point values, identity is defined as the bits in the
+ operands being identical.  */
+  tree tmemcmp = builtin_decl_explicit (BUILT_IN_MEMCMP);
+  tree size = size_int (TYPE_PRECISION (TREE_TYPE (t1)) / BITS_PER_UNIT);
+
+  tree result = build_call_expr (tmemcmp, 3, build_address (t1),
+build_address (t2), size);
+  return build_boolop (code, result, integer_zero_node);
+}
 
 /* Implements the visitor interface to build the GCC trees of all Expression
AST classes emitted from the D Front-end.
@@ -275,19 +289,23 @@ public:
this->result_ = d_convert (build_ctype (e->type),
   build_boolop (code, t1, t2));
   }
-else if (tb1->isfloating () && tb1->ty != Tvector)
+else if (tb1->iscomplex () && tb1->ty != Tvector)
   {
-   /* For floating-point values, identity is defined as the bits in the
-  operands being identical.  */
-   tree t1 = d_save_expr (build_expr (e->e1));
-   tree t2 = d_save_expr (build_expr (e->e2));
-
-   tree tmemcmp = builtin_decl_explicit (BUILT_IN_MEMCMP);
-   tree size = size_int (TYPE_PRECISION (TREE_TYPE (t1)) / BITS_PER_UNIT);
+   tree e1 = d_save_expr (build_expr (e->e1));
+   tree e2 = d_save_expr (build_expr (e->e2));
+   tree req = build_float_identity (code, real_part (e1), real_part (e2));
+   tree ieq = build_float_identity (code, imaginary_part (e1), 
imaginary_part (e2));
 
-   tree result = build_call_expr (tmemcmp, 3, build_address (t1),
-  build_address (t2), size);
-   this->result_ = build_boolop (code, result, integer_zero_node);
+   if (code == EQ_EXPR)
+ this->result_ = build_boolop (TRUTH_ANDIF_EXPR, req, ieq);
+   else
+ this->result_ = build_boolop (TRUTH_ORIF_EXPR, req, ieq);
+  }
+else if (tb1->isfloating () && tb1->ty != Tvector)
+  {
+   tree e1 = d_save_expr (build_expr (e->e1));
+   tree e2 = d_save_expr (build_expr (e->e2));
+   this->result_ = build_float_identity (code, e1, e2);
   }
 else if (tb1->ty == Tstruct)
   {
diff --git a/gcc/testsuite/gdc.dg/runnable.d b/gcc/testsuite/gdc.dg/runnable.d
index ec172fae810..65c71e86292 100644
--- a/gcc/testsuite/gdc.dg/runnable.d
+++ b/gcc/testsuite/gdc.dg/runnable.d
@@ -1534,6 +1534,27 @@ void test286()
 assert(0);
 }
 
+/**/
+// https://bugzilla.gdcproject.org/show_bug.cgi?id=309
+
+void test309()
+{
+creal f1 = +0.0 + 0.0i;
+creal f2 = +0.0 - 0.0i;
+creal f3 = -0.0 + 0.0i;
+creal f4 = +0.0 + 0.0i;
+
+assert(f1 !is f2);
+assert(f1 !is f3);
+assert(f2 !is f3);
+assert(f1 is f4);
+
+assert(!(f1 is f2));
+assert(!(f1 is f3));
+assert(!(f2 is f3));
+assert(!(f1 !is f4));
+}
+
 /**/
 
 void main()
@@ -1571,6 +1592,7 @@ void main()
 test273();
 test285();
 test286();
+test309();
 
 printf("Success!\n");
 }
-- 
2.19.1



[PATCH] Fix PR68212, unrolled loop no longer aligned

2018-11-28 Thread Pat Haugen
The following patch fixes the case where unrolling in the absence of profile 
information can cause a loop to no longer look hot and therefor not get 
aligned. In this case, instead of dividing by the unroll factor we now just 
scale by profile_probability::likely (). The diff looks worse than what really 
changed, just the addition of an if test and putting the existing 'scalemain' 
setting code into the else leg. Bootstrap/regtest on powerpc64le with no new 
regressions. I also ran a CPU2006 comparison, there were 4 benchmarks that 
improved 2-3% with the others in the noise range.

Ok for trunk?

-Pat



2018-11-28  Pat Haugen  

PR rtl-optimization/68212
* cfgloopmanip.c (duplicate_loop_to_header_edge): Adjust scale factor.

testsuite/ChangeLog:
2018-11-28  Pat Haugen  

PR rtl-optimization/68212
* gcc.dg/pr68212.c: New test.



Index: gcc/cfgloopmanip.c
===
--- gcc/cfgloopmanip.c  (revision 266522)
+++ gcc/cfgloopmanip.c  (working copy)
@@ -1242,16 +1242,25 @@ duplicate_loop_to_header_edge (struct lo
  profile_probability prob_pass_main = bitmap_bit_p (wont_exit, 0)
? prob_pass_wont_exit
: prob_pass_thru;
- profile_probability p = prob_pass_main;
- profile_count scale_main_den = count_in;
- for (i = 0; i < ndupl; i++)
+
+ /* If not using real profile data then don't scale the loop by ndupl.
+This can lead to the loop no longer looking hot wrt surrounding
+code.  */
+ if (profile_status_for_fn (cfun) == PROFILE_GUESSED)
+   scale_main = profile_probability::likely ();
+ else
{
- scale_main_den += count_in.apply_probability (p);
- p = p * scale_step[i];
+ profile_probability p = prob_pass_main;
+ profile_count scale_main_den = count_in;
+ for (i = 0; i < ndupl; i++)
+   {
+ scale_main_den += count_in.apply_probability (p);
+ p = p * scale_step[i];
+   }
+ /* If original loop is executed COUNT_IN times, the unrolled
+loop will account SCALE_MAIN_DEN times.  */
+ scale_main = count_in.probability_in (scale_main_den);
}
- /* If original loop is executed COUNT_IN times, the unrolled
-loop will account SCALE_MAIN_DEN times.  */
- scale_main = count_in.probability_in (scale_main_den);
  scale_act = scale_main * prob_pass_main;
}
   else
Index: gcc/testsuite/gcc.dg/pr68212.c
===
--- gcc/testsuite/gcc.dg/pr68212.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/pr68212.c  (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vectorize -funroll-loops --param 
max-unroll-times=4 -fdump-rtl-alignments" } */
+
+void foo(long int *a, long int *b, long int n)
+{
+  long int i;
+
+  for (i = 0; i < n; i++)
+a[i] = *b;
+}
+
+/* { dg-final { scan-rtl-dump-times "internal loop alignment added" 1 
"alignments"} } */



[PATCH, rs6000] Fix PR87496: ICE in aggregate_value_p at gcc/function.c:2046

2018-11-28 Thread Peter Bergner
PR87496 shows a bug where we ICE if we attempt to use -mabi=ieeelongdouble
and -mno-popcntd.  The IEEE128 support requires full ISA 2.06 (aka POWER7)
support, so we really should throw an error when using those options
together.  Ditto for -mabi=ieeelongdouble and -mno-vsx.  The patch below
does that.

Ok for mainline once bootstrap and regtesting are complete and clean?

Peter


gcc/
PR target/87496
* config/rs6000/rs6000.c (rs6000_option_override_internal): Disallow
-mabi=ieeelongdouble without both -mpopcntd and -mvsx.

gcc/testsuite/
PR target/87496
* gcc.target/powerpc/pr87496.c: New test.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 266566)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -4291,16 +4291,22 @@ rs6000_option_override_internal (bool gl
   if (!global_options_set.x_rs6000_ieeequad)
 rs6000_ieeequad = TARGET_IEEEQUAD_DEFAULT;
 
-  else if (rs6000_ieeequad != TARGET_IEEEQUAD_DEFAULT && 
TARGET_LONG_DOUBLE_128)
+  else
 {
-  static bool warned_change_long_double;
-  if (!warned_change_long_double)
+  if (!TARGET_POPCNTD || !TARGET_VSX)
+   error ("%qs requires full ISA 2.06 support", "-mabi=ieeelongdouble");
+
+  if (rs6000_ieeequad != TARGET_IEEEQUAD_DEFAULT && TARGET_LONG_DOUBLE_128)
{
- warned_change_long_double = true;
- if (TARGET_IEEEQUAD)
-   warning (OPT_Wpsabi, "Using IEEE extended precision long double");
- else
-   warning (OPT_Wpsabi, "Using IBM extended precision long double");
+ static bool warned_change_long_double;
+ if (!warned_change_long_double)
+   {
+ warned_change_long_double = true;
+ if (TARGET_IEEEQUAD)
+   warning (OPT_Wpsabi, "Using IEEE extended precision long 
double");
+ else
+   warning (OPT_Wpsabi, "Using IBM extended precision long 
double");
+   }
}
 }
 
Index: gcc/testsuite/gcc.target/powerpc/pr87496.c
===
--- gcc/testsuite/gcc.target/powerpc/pr87496.c  (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr87496.c  (working copy)
@@ -0,0 +1,10 @@
+/* PR target/87496.c */
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power7" } } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2 -mcpu=power7 -mabi=ieeelongdouble -mno-popcntd 
-Wno-psabi" } */
+
+int i;
+
+/* { dg-error "'-mabi=ieeelongdouble' requires full ISA 2.06 support" 
"PR87496" { target *-*-* } 0 } */



[Bug c++/87531] [8/9 Regression] assignment operator does nothing if performed as a call via operator=

2018-11-28 Thread nathan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87531

--- Comment #4 from Nathan Sidwell  ---
Author: nathan
Date: Wed Nov 28 21:25:06 2018
New Revision: 266590

URL: https://gcc.gnu.org/viewcvs?rev=266590=gcc=rev
Log:
[PR c++/87531] operator= lookup in templates

https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02301.html
PR c++/87531
* class.c (finish_struct): In a template, add artificial using
decl for operator=.

* g++.dg/lookup/pr87531.C: New.

Added:
trunk/gcc/testsuite/g++.dg/lookup/pr87531.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/class.c
trunk/gcc/testsuite/ChangeLog

[Bug libstdc++/87846] std::filesystem::create_directories with a path with a trailing slash does not create any directory

2018-11-28 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87846

--- Comment #1 from Jonathan Wakely  ---
The implementation of create_directories worked for the Filesystem TS but I
didn't update it to cope with the different C++17 semantics for a trailing
slash. The empty filename at the end of the path confused the function into
thinking it didn't need to create anything.

[PATCH 1/2] [og8] Further OpenACC reference-counting improvements

2018-11-28 Thread Julian Brown

This is the main set of improvements to reference-counting behaviour
(see parent email for further details).

ChangeLog

libgomp/
* libgomp.h (splay_tree_key_s): Substitute dynamic_refcount field for
virtual_refcount.
(acc_dispatch_t): Remove data_environ field.
(gomp_acc_insert_pointer, gomp_acc_data_env_remove_tgt): Remove
prototypes.
(gomp_acc_remove_pointer): Update prototype.
* oacc-async.c (goacc_remove_var_async): New function.
* oacc-host.c (host_dispatch): Don't initialise removed data_environ
field.
* oacc-init.c (acc_shutdown_1): Use gomp_remove_var instead of
gomp_unmap_vars to remove mappings by splay tree key instead of target
memory descriptor.
* oacc-int.h (splay_tree_key_s): Add forward declaration.
(goacc_remove_car_async): Add prototype.
* oacc-mem.c (gomp_acc_data_env_remove, gomp_acc_data_env_remove_tgt):
Remove functions.
(present_create_copy): Use virtual_refcount instead of dynamic_refcount,
and don't modify after calling gomp_map_vars_async.  Don't create dummy
target_mem_desc.  Fix target pointer return value.
(delete_copyout): Update for virtual_refcount semantics.  Use
goacc_remove_var_async for asynchronous delete/copyouts.
(gomp_acc_insert_pointer): Remove function.
(gomp_acc_remove_pointer): Use virtual_refcount semantics.
* oacc-parallel.c (find_pointer): Add missing GOMP_MAP_FORCE_DETACH
case.
(GOACC_enter_exit_data): Fix struct mapping/unmapping for
virtual_refcount semantics.  Fix attach/detach behaviour.  Don't call
gomp_acc_insert_pointer.
* target.c (gomp_map_vars_existing): Fix initialisation of do_detach
field.
(gomp_map_vars_async): Handle GOMP_MAP_VARS_OPENACC_ENTER_DATA.  Update
for virtual_refcount semantics.  Add some missing initialisations in
dynamic array code paths.
(gomp_unmap_tgt): Don't call gomp_acc_data_env_remove_tgt.
(gomp_remove_var): Fix use-after-free.
(gomp_unmap_vars_async): Update for virtual_refcount semantics.
(gomp_load_image_to_device): Don't use tgt's variable list to store
static function and variable mappings. Initialise virtual refcount.
(gomp_target_init): Don't initialise removed data_environ field.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c: Update test for
fixed refcount behaviour.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: Likewise.
---
 libgomp/libgomp.h  |   22 +--
 libgomp/oacc-async.c   |   18 ++
 libgomp/oacc-host.c|2 -
 libgomp/oacc-init.c|6 +-
 libgomp/oacc-int.h |5 +
 libgomp/oacc-mem.c |  206 +---
 libgomp/oacc-parallel.c|  127 ++---
 libgomp/target.c   |   63 ---
 .../libgomp.oacc-c-c++-common/deep-copy-7.c|   11 +-
 .../libgomp.oacc-c-c++-common/deep-copy-8.c|1 +
 10 files changed, 189 insertions(+), 272 deletions(-)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 568e260..ea44afc 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -860,8 +860,11 @@ struct splay_tree_key_s {
   uintptr_t tgt_offset;
   /* Reference count.  */
   uintptr_t refcount;
-  /* Dynamic reference count.  */
-  uintptr_t dynamic_refcount;
+  /* Reference counts beyond those that represent genuine references in the
+ linked splay tree key/target memory structures, e.g. for multiple OpenACC
+ "present increment" operations (via "acc enter data") refering to the same
+ host-memory block.  */
+  uintptr_t virtual_refcount;
   /* For a block with attached pointers, the attachment counters for each.  */
   unsigned short *attach_count;
   /* Pointer to the original mapping of "omp declare target link" object.  */
@@ -887,13 +890,6 @@ splay_compare (splay_tree_key x, splay_tree_key y)
 
 typedef struct acc_dispatch_t
 {
-  /* This is a linked list of data mapped using the
- acc_map_data/acc_unmap_data or "acc enter data"/"acc exit data" pragmas.
- Unlike mapped_data in the goacc_thread struct, unmapping can
- happen out-of-order with respect to mapping.  */
-  /* This is guarded by the lock in the "outer" struct gomp_device_descr.  */
-  struct target_mem_desc *data_environ;
-
   /* Execute.  */
   __typeof (GOMP_OFFLOAD_openacc_exec) *exec_func;
   __typeof (GOMP_OFFLOAD_openacc_exec_params) *exec_params_func;
@@ -1010,9 +1006,9 @@ enum gomp_map_vars_kind
 
 struct gomp_coalesce_buf;
 
-extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *, int);
-extern void gomp_acc_remove_pointer (void **, size_t *, unsigned short *,
- int, 

[PATCH 0/2] [og8] Further OpenACC/libgomp refcounting fixes

2018-11-28 Thread Julian Brown

As mentioned in:

https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01773.html

I had a few more changes planned for the reference-counting implementation
for OpenACC in libgomp.  These are embodied in this patch series.
The highlights are:

 - reference counts in the linked memory-mapping splay tree structure can be
   self-checked for consistency using optional (i.e. development-only)
   code.  This survives a libgomp test run (with offloading to nvptx),
   so I'm reasonably confident it's good.

 - the "data_environ" field in the device descriptor -- a linear linked
   list containing a target memory descriptor for each "acc enter data"
   mapping -- has been removed.  This brings OpenACC closer to the OpenMP
   implementation for non-lexically-scoped data mapping
   (GOMP_target_enter_exit_data), and is potentially a performance win
   if lots of data is mapped in this way.

 - the semantics of the "dynamic_refcount" field in the splay_tree_key
   structure have shifted slightly, so I've renamed the field.  It now
   represents references that are excess to those represented by actual
   pointers in the linked splay tree/target-memory descriptor structure.
   That might have been the intention before in fact, but the
   implementation was inconsistent.

I will apply to the og8 branch shortly.

Julian Brown (2):
  [og8] Further OpenACC reference-counting improvements
  [og8] OpenACC reference count consistency checking

 libgomp/libgomp.h  |   55 +++--
 libgomp/oacc-async.c   |   64 +
 libgomp/oacc-host.c|2 -
 libgomp/oacc-init.c|6 +-
 libgomp/oacc-int.h |5 +
 libgomp/oacc-mem.c |  206 
 libgomp/oacc-parallel.c|  160 +++-
 libgomp/target.c   |  262 ++--
 .../libgomp.oacc-c-c++-common/deep-copy-7.c|   11 +-
 .../libgomp.oacc-c-c++-common/deep-copy-8.c|1 +
 10 files changed, 499 insertions(+), 273 deletions(-)



[PATCH 2/2] [og8] OpenACC reference count consistency checking

2018-11-28 Thread Julian Brown

This is the reference count consistency-checking code.  The model used
for checking is as follows.

 1. Each splay tree key that references a target memory descriptor
increases that descriptor's refcount by 1.

 2. Each variable listed in a target memory descriptor that links back to a
splay tree key increases that key's refcount by 1. Each target memory
descriptor's variable list is counted only once, even if multiple
splay tree keys point to it (via their "tgt" field).

 3. Additional ("real") target memory descriptors may be present
representing data mapped through "acc data" or "acc parallel/kernels"
blocks.  These descriptors have their refcount bumped, and the
variables linked through such blocks have their refcounts bumped also
(again, with "once only" semantics).

 4. Asynchronous operations "artificially" bump the reference counts for
referenced target memory descriptors (but *not* for linked
variables/splay tree keys), in order to delay freeing mapped device
memory until the asynchronous operation has completed.  We model this,
for checking purposes only, using an off-side linked list.

 5. "Virtual" reference counts ("virtual_refcount") cannot be checked
purely statically, so we add the incoming value to each key's
statically-determined reference count ("refcount_chk"), and make
sure that the total matches the incoming reference count ("refcount").

With the previous patch, as noted in the parent email, this allows a
libgomp test run to complete successfully (with checking enabled).

Julian

ChangeLog

libgomp/
* libgomp.h (RC_CHECKING): New macro, disabled by default, guarding all
hunks in this patch.
(target_mem_desc): Add forward declaration.
(async_tgt_use): New struct.
(target_mem_desc): Add refcount_chk, mark fields.
(acc_dispatch_t): Add tgt_uses, au_lock fields.
(dump_tgt, gomp_rc_check): Add prototypes.
* oacc-async (goacc_async_unmap_tgt): Add refcount self-check code.
(goacc_async_copyout_unmap_vars): Likewise.
(goacc_remove_var_async): Likewise.
* oacc-parallel.c (GOACC_parallel_keyed_internal): Add refcount
self-check code.
(GOACC_data_start, GOACC_data_end, GOACC_enter_exit_data): Likewise.
* target.c (stdio.h): Include.
(dump_tgt, rc_check_clear, rc_check_count, rc_check_verify)
(gomp_rc_check): New functions to consistency-check reference counts.
(gomp_target_init): Initialise self-check-related device fields.
---
 libgomp/libgomp.h   |   33 -
 libgomp/oacc-async.c|   46 +++
 libgomp/oacc-parallel.c |   33 
 libgomp/target.c|  199 +++
 4 files changed, 310 insertions(+), 1 deletions(-)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index ea44afc..77cc923 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -814,9 +814,26 @@ struct target_var_desc {
   uintptr_t length;
 };
 
+/* Uncomment to enable reference-count consistency checking (for development
+   use only).  */
+/*#define RC_CHECKING 1*/
+
+#ifdef RC_CHECKING
+struct target_mem_desc;
+
+struct async_tgt_use {
+  struct target_mem_desc *tgt;
+  struct async_tgt_use *next;
+};
+#endif
+
 struct target_mem_desc {
   /* Reference count.  */
   uintptr_t refcount;
+#ifdef RC_CHECKING
+  uintptr_t refcount_chk;
+  bool mark;
+#endif
   /* All the splay nodes allocated together.  */
   splay_tree_node array;
   /* Start of the target region.  */
@@ -865,6 +882,10 @@ struct splay_tree_key_s {
  "present increment" operations (via "acc enter data") refering to the same
  host-memory block.  */
   uintptr_t virtual_refcount;
+#ifdef RC_CHECKING
+  /* The recalculated reference count, for verification.  */
+  uintptr_t refcount_chk;
+#endif
   /* For a block with attached pointers, the attachment counters for each.  */
   unsigned short *attach_count;
   /* Pointer to the original mapping of "omp declare target link" object.  */
@@ -899,7 +920,11 @@ typedef struct acc_dispatch_t
 int nasyncqueue;
 struct goacc_asyncqueue **asyncqueue;
 struct goacc_asyncqueue_list *active;
-
+#ifdef RC_CHECKING
+struct async_tgt_use *tgt_uses;
+gomp_mutex_t au_lock;
+#endif
+
 __typeof (GOMP_OFFLOAD_openacc_async_construct) *construct_func;
 __typeof (GOMP_OFFLOAD_openacc_async_destruct) *destruct_func;
 __typeof (GOMP_OFFLOAD_openacc_async_test) *test_func;
@@ -1028,6 +1053,12 @@ extern void gomp_detach_pointer (struct gomp_device_descr *,
  struct goacc_asyncqueue *, splay_tree_key,
  uintptr_t, bool, struct gomp_coalesce_buf *);
 
+#ifdef RC_CHECKING
+extern void dump_tgt (const char *, struct target_mem_desc *);
+extern void gomp_rc_check (struct gomp_device_descr *,
+			   struct target_mem_desc *);
+#endif
+
 extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *,
 	  

[PATCH v3] Make function clone name numbering independent.

2018-11-28 Thread Michael Ploujnikov
Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html

It took longer than expected, but I've finally rebased this on top of
the new clone_function_name* API and included the requested
optimizations.

There are a few remaining spots where we could probably apply similar
optimizations:

- gcc/multiple_target.c:create_target_clone
- gcc/multiple_target.c:create_dispatcher_calls
- gcc/omp-expand.c:grid_expand_target_grid_body

But I've yet to figure out how these work in detail and with the new
API these shouldn't block the main change from being merged.

I've also included a small change to rs6000 which I'm pretty sure is
safe, but I have no way of testing.

I'm not sure what's the consensus on what's more appropriate, but I
thought that it would be a good idea to keep these changes as separate
patches since only the first one really changes behaviour, while the
rest are optimizations. It's conceivable that someone who is
backporting this to an older release might want to just backport the
core bits of the change. I can re-submit it as one patch if that's
more appropriate.

Everything in these patches was bootstrapped and regression tested on
x86_64.

Ok for trunk?

- Michael
From 40cb5c888522d69bc42791f0c884dcb9e29eff37 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 1 Nov 2018 12:57:30 -0400
Subject: [PATCH 1/4] Make function assembly more independent.

This is achieved by having clone_function_name assign unique clone
numbers for each function independently.

gcc:

2018-11-28  Michael Ploujnikov  

	* cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids;
	  hash map.
	(clone_function_name_numbered): Use clone_fn_ids.

gcc/testsuite:

2018-11-28  Michael Ploujnikov  

	* gcc.dg/independent-cloneids-1.c: New test.
---
 gcc/cgraphclones.c| 10 -
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index e17959c0ca..fdd84b60d3 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map *clone_fn_ids;
 
 /* Return a new assembler name for a clone of decl named NAME.  Apart
from the string SUFFIX, the new name will end with a unique (for
@@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num;
 tree
 clone_function_name_numbered (const char *name, const char *suffix)
 {
-  return clone_function_name (name, suffix, clone_fn_id_num++);
+  /* Initialize the function->counter mapping the first time it's
+ needed.  */
+  if (!clone_fn_ids)
+clone_fn_ids = hash_map::create_ggc (64);
+  unsigned int _counter = clone_fn_ids->get_or_insert (
+   IDENTIFIER_POINTER (get_identifier (name)));
+  return clone_function_name (name, suffix, suffix_counter++);
 }
 
 /* Return a new assembler name for a clone of DECL.  Apart from string
diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c
new file mode 100644
index 00..3203895492
--- /dev/null
+++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone"  } */
+
+extern int printf (const char *, ...);
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+int
+baz (int arg)
+{
+  printf("%d\n", bar (3));
+  printf("%d\n", bar (4));
+  printf("%d\n", foo (5));
+  printf("%d\n", foo (6));
+  /* adding or removing the following call should not affect foo
+ function's clone numbering */
+  printf("%d\n", bar (7));
+  return foo (8);
+}
+
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]3:} } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]4:} } } */
-- 
2.19.1

From 80b07ddf059415c896cecb9862517899db3993e9 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Mon, 17 Sep 2018 16:02:27 -0400
Subject: [PATCH 2/4] Minimize clone counter memory usage in
 create_virtual_clone.

Based on Martin Jambour's suggestion:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00111.html

gcc:

2018-11-28  Michael Ploujnikov  

	* cgraph.h (cgraph_node::create_virtual_clone): Add a 

Re: [PATCH] PR880088 Enable -Wtrampolines for no-exec-stack targets with -Wall.

2018-11-28 Thread Segher Boessenkool
On Tue, Nov 27, 2018 at 08:54:07PM +0100, Mark Wielaard wrote:
> On Tue, 2018-11-27 at 12:37 -0600, Segher Boessenkool wrote:
> > > Define a new target macro TARGET_HAS_DEFAULT_NOEXEC_STACK for those
> > > targets
> > > that have a non-executable default stack based on when they call
> > > file_end_indicate_exec_stack.
> > 
> > As Paul says, that name isn't so good.
> > 
> > TARGET_NEEDS_MAKING_THE_STACK_EXECUTABLE_FOR_TRAMPOLINES, or similar?
> 
> Would the slightly shorter
> TARGET_NEEDS_EXEC_STACK_MARKER_FOR_TRAMPOLINES be descriptive enough?

"MARKER", is that some official name for it?  If no, just "FLAG"?
Fine with me, sure.

> > > diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h
> > > index 0c67634..9330acf 100644
> > > --- a/gcc/config/rs6000/sysv4.h
> > > +++ b/gcc/config/rs6000/sysv4.h
> > > @@ -972,6 +972,11 @@ ncrtn.o%s"
> > >  /* Generate entries in .fixup for relocatable addresses.  */
> > >  #define RELOCATABLE_NEEDS_FIXUP 1
> > >  
> > > +#if defined (POWERPC_LINUX) || defined (POWERPC_FREEBSD)
> > > +  #define TARGET_HAS_DEFAULT_NOEXEC_STACK (TARGET_32BIT \
> > > +|| DEFAULT_ABI == ABI_ELFv2)
> > > +#endif
> > 
> > I don't think this belongs in sysv4.h .
> 
> I might have gotten lost in the tree of defines/macros.
> 
> There are two sysv4.h files gcc/config/rs6000/sysv4.h and
> gcc/config/powerpcspe/sysv4.h

Forget about powerpcspe please, I am talking about rs6000 only.

You want linux.h and freebsd.h, maybe the "64" versions of those separately.
Or put this in rs6000.h.  sysv4.h is a random header for this, it doesn't
belong there.


Segher


Re: [PATCH v4] Repeat jump threading after combine

2018-11-28 Thread Segher Boessenkool
Hi!

On Tue, Nov 27, 2018 at 05:07:11PM +0100, Ilya Leoshkevich wrote:
> perf diff -c wdiff:1,1 shows, that there is just one function
> (htab_traverse) that is significantly slower now:
> 
>  2.98% 11768891764  exe[.] htab_traverse
>  1.91%   563949986  exe[.] 
> compute_dominance_frontiers_1
> 
> The additional cycles consumed by this function matches the overall
> number of additionaly consumed cycles, and the contribution of the
> runner up (compute_dominance_frontiers_1) is 20 times smaller, so I
> think it's really just this one function.
> 
> However, the generated assembly is completely identical in both cases!

Ugh.  We have seen this before :-(

Thanks for investigating  I don't consider the Power degradation as really
caused by your patch, then.

> I saw similar situations in the past, so I tried adding a nop to
> htab_traverse:
> 
> --- hashtab.c
> +++ hashtab.c
> @@ -529,6 +529,8 @@ htab_traverse (htab, callback, info)
>   htab_trav callback;
>   PTR info;
>  {
> +  __asm__ volatile("nop\n");
> +
>PTR *slot = htab->entries;
>PTR *limit = slot + htab->size;
> 
> and made a 5x re-run.  The new measurements are 227.01s and 227.44s
> (+0.19%).  With two nops I get 227.25s and 227.29s (+0.02%), which also
> looks like noise.
> 
> Can this be explained by some microarchitectural quirk after all?

Two frequent branch targets that get thrown into the same bin for prediction.
Results change based on random compiler changes, ASLR settings, phase of the
moon, how many people in your neighbourhood have had porridge for breakfast
this morning, etc.


Segher


Re: [PING^3] Re: [PATCH 1/3] Support instrumenting returns of instrumented functions

2018-11-28 Thread Andi Kleen
On Wed, Nov 28, 2018 at 05:24:37PM +0100, Jan Hubicka wrote:
> > On Wed, Nov 28, 2018 at 04:59:25PM +0100, Jan Hubicka wrote:
> > > Hi,
> > > I wonder how __return_loc sections work.  In general it would be nice to
> > 
> > You mean how it's used? It's for patching in/out return instrumentation
> > at runtime, and to find them we use the sections.
> > 
> > > avoid direct output of assembler code in favour of adding instructions
> > > to rtl stream which should be possible to insert the call and return.
> > 
> > That would be a full rewrite how all these hooks work. I was 
> > hoping extending the existing frame work would be good enough for now.
> 
> Yep, it is ugly and it hits us ocassionaly as with the CET
> instrumentation, but I suppose we should care about this in stage1.
> > 
> > > __return_loc seems harder, also I wonder if it makes unwind information
> > > possible?
> > 
> > Do we need unwind information for calls? I thought it was only
> > for other stack manipulation.
> 
> I just wonder if backtraces are right if something triggers in the extra
> code.  If you make control flow to jump into separate section, I suppose
> backtraces won't be right.  Not sure if that is necessarily issue?

I don't think it is because the instrumentation only adds calls, and
calls don't get annotated in DWARF. The only issue I could think of
if is something patches in push instructions through the nops, 
but there is really nothing the compiler could do about that.

I tested gdb and it can backtrace through the return instrumentation.

Breakpoint 1, 0x00401182 in __return__ ()
(gdb) bt
#0  0x00401182 in __return__ ()
#1  0x004011a3 in f2 ()
#2  0x004011b7 in main ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

-Andi


Fix minnor issue with INTEGER_CST merging in lto.c

2018-11-28 Thread Jan Hubicka
Hi,
this patch fixes handling of INTEGER_CST for lto_read_decls. The
condition here ought to match conditional when tree-streamer-out and
tree.c expect integer constants to be shared.

Bootstrapped/regtested x86_64-linux, comitted.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 266450)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2018-11-28  Jan Hubicka  
+
+   * lto.c (lto_read_decls): Fix handling of INTEGER_CST.
+
 2018-11-22  Jan Hubicka  
 
* lto.c (cmp_type_location): Remove.
Index: lto.c
===
--- lto.c   (revision 266450)
+++ lto.c   (working copy)
@@ -1764,7 +1764,8 @@ lto_read_decls (struct lto_file_decl_dat
 from);
  if (len == 1
  && (TREE_CODE (first) == IDENTIFIER_NODE
- || TREE_CODE (first) == INTEGER_CST))
+ || (TREE_CODE (first) == INTEGER_CST
+ && !TREE_OVERFLOW (first
continue;
 
  /* Try to unify the SCC with already existing ones.  */


Re: *ping*^2 – PR C++/88114 - PATCH for destructor not generated for "virtual ~destructor() = default"

2018-11-28 Thread Tobias Burnus

On the 25th November 2018, schrieb Tobias Burnus wrote:


On 21 November 2018, Tobias Burnus wrote:

Hello all,

if a class contains any 'virtual ... = 0', it's an abstract class and 
for an

abstract class, the destructor not added to the vtable.

For a normal
   virtual ~class() { }
that's not a problem as the class::~class() destructor will be 
generated during

the parsing of the function.

But for
   virtual ~class() = default;
the destructor will be generated via mark_used via the vtable.


If one now declares a derived class and uses it, the class::~class() 
is generated
in that translation unit.  Unless, #pragma interface/implementation 
is used.


In that case, the 'default' destructor will never be generated.


The following code seems to work both for the big code and for the 
example;
without '#pragma implementation', the destructor is not generated for 
the example,

only with.

The patch survived boostrapping GCC with default languages on 
x86-64-gnu-linux

and "make check-g++".*

[One probably could get rid of some of the conditions for generating 
the code,
e.g. TREE_USED and DECL_DEFAULTED_FN are probably not both needed; 
one might

want to set some additional DECL to the fn decl.]

Does the patch and the test case make sense? Or is something else/in 
addition

needed?

Tobias


*I do get the following failures on this CentOS6 system:

FAIL: g++.dg/pr83239.C  -std=gnu++98 (test for excess errors)
Excess errors:
cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned 
int)' specified size 18446744073709551608 exceeds maximum object size 
9223372036854775807 [-Wstringop-overflow=]
cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned 
int)' specified size 18446744073709551600 exceeds maximum object size 
9223372036854775807 [-Wstringop-overflow=]


FAIL: g++.dg/tls/thread_local-order2.C  -std=c++14 execution test
FAIL: g++.dg/tls/thread_local-order2.C  -std=c++17 execution test

plus each 32 times:
FAIL: guality/guality.h: 0 PASS, 1 FAIL, 0 UNRESOLVED
FAIL: guality/guality.h: varl is -1, not 6


handle profile_probability::always better in tree-ssa-ifcombine

2018-11-28 Thread Jan Hubicka
Hi,
this patch fixes similar issue in update_profile_after_ifcombine
(in Jakub's testcase the conditional is first combined and later split
again and in both cases we got update unnecesarily imprecise).

Bootstrapped/regtested x86_64-linux, comitted.

* tree-ssa-ifcombine.c (update_profile_after_ifcombine): Handle
profile_probability::always better.

Index: tree-ssa-ifcombine.c
===
--- tree-ssa-ifcombine.c(revision 266450)
+++ tree-ssa-ifcombine.c(working copy)
@@ -360,8 +360,15 @@ update_profile_after_ifcombine (basic_bl
 
   inner_cond_bb->count = outer_cond_bb->count;
 
-  inner_taken->probability = outer2->probability + outer_to_inner->probability
-* inner_taken->probability;
+  /* Handle special case where inner_taken probability is always. In this case
+ we know that the overall outcome will be always as well, but combining
+ probabilities will be conservative because it does not know that
+ outer2->probability is inverse of outer_to_inner->probability.  */
+  if (inner_taken->probability == profile_probability::always ())
+;
+  else
+inner_taken->probability = outer2->probability + 
outer_to_inner->probability
+  * inner_taken->probability;
   inner_not_taken->probability = profile_probability::always ()
 - inner_taken->probability;
 


handle profile_probability::always in profile_count::split better

2018-11-28 Thread Jan Hubicka
Hi,
this fixes (minor) profile updating issue which was uncovered by
somewhat devilish testcase Jakub accidentally created for branch
prediction code.

When expanding if (cond) goto xxx; gimple stmt with probability
profile_count::one into mutliple conditionals, we may end up predicting
whole chain with "guessed" quality which is unnecesary downgrade of
information we already have in CFG.

Bootstrapped/regtested x86_64-linux, comitted.

Honza

* profile-count.h (profile_count::split): Give better result when
splitting profile_probability::always.
Index: profile-count.h
===
--- profile-count.h (revision 266450)
+++ profile-count.h (working copy)
@@ -447,8 +447,12 @@ public:
 {
   profile_probability ret = *this * cprob;
   /* The following is equivalent to:
- *this = cprob.invert () * *this / ret.invert ();  */
-  *this = (*this - ret) / ret.invert ();
+ *this = cprob.invert () * *this / ret.invert ();
+Avoid scaling when overall outcome is supposed to be always.
+Without knowing that one is inverse of toher, the result would be
+conservative.  */
+  if (!(*this == profile_probability::always ()))
+*this = (*this - ret) / ret.invert ();
   return ret;
 }
 


Re: [PATCH][Version 3]Come up with -flive-patching master option.

2018-11-28 Thread Qing Zhao


> On Nov 28, 2018, at 9:52 AM, Jan Hubicka  wrote:
> 
>> 
>> 2018-11-20  qing zhao  
>> 
>>  * cif-code.def (EXTERN_LIVE_ONLY_STATIC): New CIF code.
>>  * common.opt: Add -flive-patching flag.
>>  * doc/invoke.texi: Document -flive-patching.
>>  * flag-types.h (enum live_patching_level): New enum.
>>  * ipa-inline.c (can_inline_edge_p): Disable external functions from
>>  inlining when flag_live_patching is LIVE_PATCHING_INLINE_ONLY_STATIC.
>>  * opts.c (control_options_for_live_patching): New function.
>>  (finish_options): Make flag_live_patching incompatible with flag_lto.
>>  Control IPA optimizations based on different levels of 
>>  flag_live_patching.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2018-11-20  qing zhao  
>> 
>>  * gcc.dg/live-patching-1.c: New test.
>>  * gcc.dg/live-patching-2.c: New test.
>>  * gcc.dg/live-patching-3.c: New test.
>>  * gcc.dg/tree-ssa/writeonly-3.c: New test.
>>  * gcc.target/i386/ipa-stack-alignment-2.c: New test.
>> 
> 
> I am still somewhat worried about possible use with C++ programs where
> we will kill all inlining of comdats, but I guess we could discuss that
> when it becomes an issue.

Okay. If this will be a problem later when we use live-patching in more C++ 
applications, let’s 
revisit it at that time. 

> +
> +  /* FIXME: disable unreachable code removal.  */
> 
> Disabling unreachable code removal will really introduce a lot of extra
> dead code, can't live patches just provide what they need if the code
> was earlier removed.
> +
> +  /* discovery of functions/variables with no address taken.  */
> +  if (opts_set->x_flag_ipa_reference_addressable
> +   && opts->x_flag_ipa_reference_addressable)
> + error_at (loc,
> +   "%<-fipa-reference-addressable%> is incompatible with "
> +   "%<-flive-patching=inline-only-static|inline-clone%>");
> +  else
> + opts->x_flag_ipa_reference_addressable = 0;
> +
> +  /* ipa stack alignment propagation.  */
> +  if (opts_set->x_flag_ipa_stack_alignment
> +   && opts->x_flag_ipa_stack_alignment)
> + error_at (loc,
> +   "%<-fipa-stack-alignment%> is incompatible with "
> +   "%<-flive-patching=inline-only-static|inline-clone%>");
> +  else
> + opts->x_flag_ipa_stack_alignment = 0;
> 
> Shall we also disable nothrow or we will worry about C++ only ter?

This is also mainly for C++ applications, so currently should not be a problem.
But I can add a separate simple patch to add another flag to control nothrow 
propagation and disable it when -flive-patching is ON.

> 
> Patch is OK,

thanks for the review.

I will commit the patch very soon.

Qing
> thanks!
> Honza



Re: V2 [PATCH] i386: Add pass_remove_partial_avx_dependency

2018-11-28 Thread Jan Hubicka
> On 11/28/18 12:48 PM, H.J. Lu wrote:
> > On Mon, Nov 5, 2018 at 7:29 AM Jan Hubicka  wrote:
> >>
> >>> On 11/5/18 7:21 AM, Jan Hubicka wrote:
> >
> > Did you mean "the nearest common dominator"?
> 
>  If the nearest common dominator appears in the loop while all uses are
>  out of loops, this will result in suboptimal xor placement.
>  In this case you want to split edges out of the loop.
> 
>  In general this is what the LCM framework will do for you if the problem
>  is modelled siimlar way as in mode_swtiching.  At entry function mode is
>  "no zero register needed" and all conversions need mode "zero register
>  needed".  Mode switching should then do the correct placement decisions
>  (reaching minimal number of executions of xor).
> 
>  Jeff, whan is your optinion on the approach taken by the patch?
>  It seems like a special case of more general issue, but I do not see
>  very elegant way to solve it at least in the GCC 9 horisont, so if
>  the placement is correct we can probalby go either with new pass or
>  making this part of mode swithcing (which is anyway run by x86 backend)
> >>> So I haven't followed this discussion at all, but did touch on this
> >>> issue with some patch a month or two ago with a target patch that was
> >>> trying to avoid the partial stalls.
> >>>
> >>> My assumption is that we're trying to find one or more places to
> >>> initialize the upper half of an avx register so as to avoid partial
> >>> register stall at existing sites that set the upper half.
> >>>
> >>> This sounds like a classic PRE/LCM style problem (of which mode
> >>> switching is just another variant).   A common-dominator approach is
> >>> closer to a classic GCSE and is going to result is more initializations
> >>> at sub-optimal points than a PRE/LCM style.
> >>
> >> yes, it is usual code placement problem. It is special case because the
> >> zero register is not modified by the conversion (just we need to have
> >> zero somewhere).  So basically we do not have kills to the zero except
> >> for entry block.
> >>
> > 
> > Do you have  testcase to show thatf the nearest common dominator
> > in the loop, while all uses areout of loops, leads to suboptimal xor
> > placement?
> I don't have a testcase, but it's all but certain nearest common
> dominator is going to be a suboptimal placement.  That's going to create
> paths where you're going to emit the xor when it's not used.
> 
> The whole point of the LCM algorithms is they are optimal in terms of
> expression evaluations.

i think testcase should be something like

test()
{
  while (true)
  {
 if (cond1)
   {
 do_one_conversion;
 return;
   }
 if (cond2)
   {
 do_other_conversion;
 return;
   }
  }
}

Honza
> 
> jeff
> > 
> 


Re: V2 [PATCH] i386: Add pass_remove_partial_avx_dependency

2018-11-28 Thread Jeff Law
On 11/28/18 12:48 PM, H.J. Lu wrote:
> On Mon, Nov 5, 2018 at 7:29 AM Jan Hubicka  wrote:
>>
>>> On 11/5/18 7:21 AM, Jan Hubicka wrote:
>
> Did you mean "the nearest common dominator"?

 If the nearest common dominator appears in the loop while all uses are
 out of loops, this will result in suboptimal xor placement.
 In this case you want to split edges out of the loop.

 In general this is what the LCM framework will do for you if the problem
 is modelled siimlar way as in mode_swtiching.  At entry function mode is
 "no zero register needed" and all conversions need mode "zero register
 needed".  Mode switching should then do the correct placement decisions
 (reaching minimal number of executions of xor).

 Jeff, whan is your optinion on the approach taken by the patch?
 It seems like a special case of more general issue, but I do not see
 very elegant way to solve it at least in the GCC 9 horisont, so if
 the placement is correct we can probalby go either with new pass or
 making this part of mode swithcing (which is anyway run by x86 backend)
>>> So I haven't followed this discussion at all, but did touch on this
>>> issue with some patch a month or two ago with a target patch that was
>>> trying to avoid the partial stalls.
>>>
>>> My assumption is that we're trying to find one or more places to
>>> initialize the upper half of an avx register so as to avoid partial
>>> register stall at existing sites that set the upper half.
>>>
>>> This sounds like a classic PRE/LCM style problem (of which mode
>>> switching is just another variant).   A common-dominator approach is
>>> closer to a classic GCSE and is going to result is more initializations
>>> at sub-optimal points than a PRE/LCM style.
>>
>> yes, it is usual code placement problem. It is special case because the
>> zero register is not modified by the conversion (just we need to have
>> zero somewhere).  So basically we do not have kills to the zero except
>> for entry block.
>>
> 
> Do you have  testcase to show thatf the nearest common dominator
> in the loop, while all uses areout of loops, leads to suboptimal xor
> placement?
I don't have a testcase, but it's all but certain nearest common
dominator is going to be a suboptimal placement.  That's going to create
paths where you're going to emit the xor when it's not used.

The whole point of the LCM algorithms is they are optimal in terms of
expression evaluations.

jeff
> 



[Bug fortran/87937] [8/9 Regression] LHS reallocation broken inside "select type" and "associate"

2018-11-28 Thread trnka at scm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87937

--- Comment #13 from Tomáš Trnka  ---
(In reply to Dominique d'Humieres from comment #12)
> I finally got it: the problem has been introduced in trunk by revision
> r264358 and fixed by r264725.

Good catch! (How could I have missed that?)

Backporting r264725 on top of the current gcc-8 branch fixes the issue for me.
Thanks!

[Bug middle-end/88251] -Wformat-truncation=2 false alarms when compiling gzip, Emacs

2018-11-28 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88251

--- Comment #2 from Martin Sebor  ---
The warning can also be avoided by using the snprintf return value and taking
some action based on it (just returning it from a function isn't enough).

patch to fix PR88207

2018-11-28 Thread Vladimir Makarov

  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88207

  The patch was tested and bootstrapped on x86/x86-64.

  Committed as rev. 266582.

  The patch creates one new regression on pr34256.c.  But I think gcc 
with the patch generates the right code.  I'll send a patch modifying 
the test for discussion and/or approval later today.




Index: ChangeLog
===
--- ChangeLog	(revision 266581)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2018-11-28  Vladimir Makarov  
+
+	PR target/88207
+	* ira-costs.c (scan_one_insn): Process subregs when updating costs
+	for pseudos and allocnos from insn.
+
 2018-11-28  David Edelsohn  
 
 	* config/rs6000/aix72.h: Update to match aix71.h changes.
Index: ira-costs.c
===
--- ira-costs.c	(revision 266435)
+++ ira-costs.c	(working copy)
@@ -1535,36 +1535,40 @@ scan_one_insn (rtx_insn *insn)
   /* Now add the cost for each operand to the total costs for its
  allocno.  */
   for (i = 0; i < recog_data.n_operands; i++)
-if (REG_P (recog_data.operand[i])
-	&& REGNO (recog_data.operand[i]) >= FIRST_PSEUDO_REGISTER)
-  {
-	int regno = REGNO (recog_data.operand[i]);
-	struct costs *p = COSTS (costs, COST_INDEX (regno));
-	struct costs *q = op_costs[i];
-	int *p_costs = p->cost, *q_costs = q->cost;
-	cost_classes_t cost_classes_ptr = regno_cost_classes[regno];
-	int add_cost;
-
-	/* If the already accounted for the memory "cost" above, don't
-	   do so again.  */
-	if (!counted_mem)
-	  {
-	add_cost = q->mem_cost;
-	if (add_cost > 0 && INT_MAX - add_cost < p->mem_cost)
-	  p->mem_cost = INT_MAX;
-	else
-	  p->mem_cost += add_cost;
-	  }
-	for (k = cost_classes_ptr->num - 1; k >= 0; k--)
-	  {
-	add_cost = q_costs[k];
-	if (add_cost > 0 && INT_MAX - add_cost < p_costs[k])
-	  p_costs[k] = INT_MAX;
-	else
-	  p_costs[k] += add_cost;
-	  }
-  }
-
+{
+  rtx op = recog_data.operand[i];
+  
+  if (GET_CODE (op) == SUBREG)
+	op = SUBREG_REG (op);
+  if (REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER)
+	{
+	  int regno = REGNO (op);
+	  struct costs *p = COSTS (costs, COST_INDEX (regno));
+	  struct costs *q = op_costs[i];
+	  int *p_costs = p->cost, *q_costs = q->cost;
+	  cost_classes_t cost_classes_ptr = regno_cost_classes[regno];
+	  int add_cost;
+	  
+	  /* If the already accounted for the memory "cost" above, don't
+	 do so again.  */
+	  if (!counted_mem)
+	{
+	  add_cost = q->mem_cost;
+	  if (add_cost > 0 && INT_MAX - add_cost < p->mem_cost)
+		p->mem_cost = INT_MAX;
+	  else
+		p->mem_cost += add_cost;
+	}
+	  for (k = cost_classes_ptr->num - 1; k >= 0; k--)
+	{
+	  add_cost = q_costs[k];
+	  if (add_cost > 0 && INT_MAX - add_cost < p_costs[k])
+		p_costs[k] = INT_MAX;
+	  else
+		p_costs[k] += add_cost;
+	}
+	}
+}
   return insn;
 }
 


[Bug target/88207] [9 regression] gcc.target/i386/pr22076.c etc. FAIL

2018-11-28 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88207

--- Comment #4 from Vladimir Makarov  ---
Author: vmakarov
Date: Wed Nov 28 20:08:03 2018
New Revision: 266582

URL: https://gcc.gnu.org/viewcvs?rev=266582=gcc=rev
Log:
2018-11-28  Vladimir Makarov  

PR target/88207
* ira-costs.c (scan_one_insn): Process subregs when updating costs
for pseudos and allocnos from insn.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/ira-costs.c

[Bug middle-end/88251] -Wformat-truncation=2 false alarms when compiling gzip, Emacs

2018-11-28 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88251

Martin Sebor  changed:

   What|Removed |Added

   Keywords||diagnostic
 Status|UNCONFIRMED |RESOLVED
 CC||msebor at gcc dot gnu.org
 Resolution|--- |INVALID

--- Comment #1 from Martin Sebor  ---
The warning in this case is by design.  The test makes it clear that the buffer
can be as small as 2 bytes but the output is at least 15.  With
-Wformat-truncation=2 the buffer needs to be big enough for the longest output,
or 25 + 1 bytes.  This suppresses the warning:

  int
  rpl_strerror_r (int errnum, char *buf, size_t buflen)
  {
if (buflen <= 25)
  return 34;
return snprintf (buf, buflen, "Unknown error %d", errnum);
  }

[Bug c++/88222] [9 Regression] ubsan error at cp/decl.c for broken code

2018-11-28 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88222

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Marek Polacek  ---
Fixed.

[Bug c++/88222] [9 Regression] ubsan error at cp/decl.c for broken code

2018-11-28 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88222

--- Comment #3 from Marek Polacek  ---
Author: mpolacek
Date: Wed Nov 28 20:03:06 2018
New Revision: 266581

URL: https://gcc.gnu.org/viewcvs?rev=266581=gcc=rev
Log:
PR c++/88222 - ICE with bit-field with invalid type.
* decl.c (grokdeclarator): Check if declarator is null.

* g++.dg/ext/flexary31.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/ext/flexary31.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/decl.c
trunk/gcc/testsuite/ChangeLog

[Bug fortran/87937] [8/9 Regression] LHS reallocation broken inside "select type" and "associate"

2018-11-28 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87937

--- Comment #12 from Dominique d'Humieres  ---
I finally got it: the problem has been introduced in trunk by revision r264358
and fixed by r264725.

On the GCC8 branch the problem has been introduced by r264404 and AFAICT the
fix has not been back ported. If it helps I can try to apply the path in trunk
to my gcc8 build.

Note that the problem is not present in 8.2.0.

Re: [PATCH][rs6000] better use of unaligned vsx in memset() expansion

2018-11-28 Thread Segher Boessenkool
On Wed, Nov 28, 2018 at 01:24:01PM -0600, Aaron Sawdey wrote:
> The first version of this had a big bug and cleared past the requested bytes.
> This version passes regstrap on ppc64le(power7/8/9), ppc64be(power6/7/8),
> and ppc32(power8).
> 
> OK for trunk (and 8 backport after a week)?

> @@ -91,8 +93,7 @@
>rtx dest;
> 
>if (TARGET_ALTIVEC
> -   && ((bytes >= 16 && align >= 128)
> -   || (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX)))
> +   && (bytes >= 16 && ( align >= 128 || unaligned_vsx_ok)))

Please remove the stray space?  Okay for trunk and later for 8, thanks!


Segher


[Bug c++/88252] New: Deduction guide assume the constructor parameter is a forwarding reference if constructor defined outside class

2018-11-28 Thread okannen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88252

Bug ID: 88252
   Summary: Deduction guide assume the constructor parameter is a
forwarding reference if constructor defined outside
class
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: okannen at gmail dot com
  Target Milestone: ---

Constructor parameter that have the forms of a forwarding reference (T&&)
inside a constructor are not forwarding reference if T is a parameter of the
class template. But gcc assumes it is a forwarding reference if the constructor
is defined outside the class:


template
struct B1 {
B1(T&& x){}
};

template
struct B2 {
B2(T&& x) ;
};
template
B2::B2(T&& x){}

int i;
B1 a{i};//Expect: compilation error, class template argument deduction failed
B2 b{i};//Unexpected: no compilation error.

Re: C++ PATCH to implement P1094R2, Nested inline namespaces

2018-11-28 Thread Jason Merrill
On Wed, Nov 28, 2018 at 11:57 AM Marek Polacek  wrote:
>
> On Tue, Nov 27, 2018 at 04:01:46PM -0500, Jason Merrill wrote:
> > > > >  if (cp_lexer_next_token_is (parser->lexer, CPP_NAME))
> > > > > {
> > > > >   identifier = cp_parser_identifier (parser);
> > > > > @@ -18904,7 +18916,12 @@ cp_parser_namespace_definition (cp_parser* 
> > > > > parser)
> > > > > }
> > > > >  if (cp_lexer_next_token_is_not (parser->lexer, CPP_SCOPE))
> > > > > -   break;
> > > > > +   {
> > > > > + /* Don't forget that the innermost namespace might have been
> > > > > +marked as inline.  */
> > > > > + is_inline |= nested_inline_p;
> > > >
> > > > This looks wrong: an inline namespace does not make its nested 
> > > > namespaces
> > > > inline as well.
> >
> > > A nested namespace definition cannot be inline.  This is supposed to 
> > > handle
> > > cases such as
> > > namespace A::B::inline C { ... }
> > > because after 'C' we don't see :: so it breaks and we call push_namespace
> > > outside the for loop.
> >
> > Ah, yes, I see.  I was confused by the use of '|='; what is that for? Why
> > not use '='?  For that matter, why not modify is_inline in the loop instead
> > of a new nested_inline_p variable?
>
> |= is there to handle correctly this case:
> inline namespace N { ... }
> where we have to make sure that the call to push_namespace outside the for 
> (;;)
> is with is_inline=true.  Using '=' would overwrite is_inline to false, because
> there are no nested inlines.  For the same reason I needed a second bool: to
> not lose the information about the topmost inline.  But since the second
> push_namespace call also needs to handle the innermost namespace, it can't use
> topmost_inline_p.

Aha.  Please mention that in the comment.  OK with that change.

Jason


Re: V2 [PATCH] i386: Add pass_remove_partial_avx_dependency

2018-11-28 Thread H.J. Lu
On Mon, Nov 5, 2018 at 7:29 AM Jan Hubicka  wrote:
>
> > On 11/5/18 7:21 AM, Jan Hubicka wrote:
> > >>
> > >> Did you mean "the nearest common dominator"?
> > >
> > > If the nearest common dominator appears in the loop while all uses are
> > > out of loops, this will result in suboptimal xor placement.
> > > In this case you want to split edges out of the loop.
> > >
> > > In general this is what the LCM framework will do for you if the problem
> > > is modelled siimlar way as in mode_swtiching.  At entry function mode is
> > > "no zero register needed" and all conversions need mode "zero register
> > > needed".  Mode switching should then do the correct placement decisions
> > > (reaching minimal number of executions of xor).
> > >
> > > Jeff, whan is your optinion on the approach taken by the patch?
> > > It seems like a special case of more general issue, but I do not see
> > > very elegant way to solve it at least in the GCC 9 horisont, so if
> > > the placement is correct we can probalby go either with new pass or
> > > making this part of mode swithcing (which is anyway run by x86 backend)
> > So I haven't followed this discussion at all, but did touch on this
> > issue with some patch a month or two ago with a target patch that was
> > trying to avoid the partial stalls.
> >
> > My assumption is that we're trying to find one or more places to
> > initialize the upper half of an avx register so as to avoid partial
> > register stall at existing sites that set the upper half.
> >
> > This sounds like a classic PRE/LCM style problem (of which mode
> > switching is just another variant).   A common-dominator approach is
> > closer to a classic GCSE and is going to result is more initializations
> > at sub-optimal points than a PRE/LCM style.
>
> yes, it is usual code placement problem. It is special case because the
> zero register is not modified by the conversion (just we need to have
> zero somewhere).  So basically we do not have kills to the zero except
> for entry block.
>

Do you have  testcase to show thatf the nearest common dominator
in the loop, while all uses areout of loops, leads to suboptimal xor
placement?

-- 
H.J.


  1   2   3   4   >