[PATCH][DOCS][pushed] Improve JSON examples.

2021-06-08 Thread Martin Liška

The patch improves the JSON examples so that they are valid JSON.
That will help us with syntax highlighting in Sphinx-generated
documentation.

Pushed to master.
Martin

gcc/ChangeLog:

* doc/gcov.texi: Create proper JSON files.
* doc/invoke.texi: Remove dots in order to make it a valid
JSON object.
---
 gcc/doc/gcov.texi   | 50 ++---
 gcc/doc/invoke.texi |  3 +--
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index 32b51f984bc..6a5760e5ebe 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -191,11 +191,11 @@ Structure of the JSON is following:
 
 @smallexample

 @{
-  "current_working_directory": @var{current_working_directory},
-  "data_file": @var{data_file},
-  "format_version": @var{format_version},
-  "gcc_version": @var{gcc_version}
-  "files": [@var{file}]
+  "current_working_directory": "foo/bar",
+  "data_file": "a.out",
+  "format_version": "1",
+  "gcc_version": "11.1.1 20210510"
+  "files": ["$file"]
 @}
 @end smallexample
 
@@ -220,9 +220,9 @@ Each @var{file} has the following form:
 
 @smallexample

 @{
-  "file": @var{file_name},
-  "functions": [@var{function}],
-  "lines": [@var{line}]
+  "file": "a.c",
+  "functions": ["$function"],
+  "lines": ["$line"]
 @}
 @end smallexample
 
@@ -237,15 +237,15 @@ Each @var{function} has the following form:
 
 @smallexample

 @{
-  "blocks": @var{blocks},
-  "blocks_executed": @var{blocks_executed},
-  "demangled_name": "@var{demangled_name},
-  "end_column": @var{end_column},
-  "end_line": @var{end_line},
-  "execution_count": @var{execution_count},
-  "name": @var{name},
-  "start_column": @var{start_column}
-  "start_line": @var{start_line}
+  "blocks": 2,
+  "blocks_executed": 2,
+  "demangled_name": "foo",
+  "end_column": 1,
+  "end_line": 4,
+  "execution_count": 1,
+  "name": "foo",
+  "start_column": 5,
+  "start_line": 1
 @}
 @end smallexample
 
@@ -289,11 +289,11 @@ Each @var{line} has the following form:
 
 @smallexample

 @{
-  "branches": [@var{branch}],
-  "count": @var{count},
-  "line_number": @var{line_number},
-  "unexecuted_block": @var{unexecuted_block}
-  "function_name": @var{function_name},
+  "branches": ["$branch"],
+  "count": 2,
+  "line_number": 15,
+  "unexecuted_block": false,
+  "function_name": "foo",
 @}
 @end smallexample
 
@@ -320,9 +320,9 @@ Each @var{branch} has the following form:
 
 @smallexample

 @{
-  "count": @var{count},
-  "fallthrough": @var{fallthrough},
-  "throw": @var{throw}
+  "count": 11,
+  "fallthrough": true,
+  "throw": false
 @}
 @end smallexample
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi

index 6063e466c13..24dc0491901 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -5149,8 +5149,7 @@ might be printed in JSON form (after formatting) like 
this:
 @}
 ]
"column-origin": 1,
-@},
-@dots{}
+@}
 ]
 @end smallexample
 
--

2.31.1



Re: GCC Mission Statement

2021-06-08 Thread Siddhesh Poyarekar

On 6/9/21 10:39 AM, Valentino Giudice wrote:

> I was aware of that announcement, but it doesn't mention the mission
> statement at all.
> It appears that the decision in question was, at the time, in contrast
> with the mission statement (rather than guided by it).
>
> If the Steering Committee updates the mission statement, it may appear
> that the mission statement follows the decisions of the steering
> committee (in place of the contrary). In that case, what would be the
> purpose of a mission statement?
>
> The mission statement was also updated beyond simply making it
> consistent with the change: in "Supporting the goals of the GNU
> project, as defined by the FSF" the reference to the FSF was removed.


Quite a few projects under the GNU project[1] have dissociated 
themselves from the FSF, so "as defined by the FSF" perhaps doesn't 
apply as consistently as it did before.  That is my understanding 
anyway; maybe there's more context that others may be able to add.


Siddhesh

[1] https://gnu.tools/en/software/


[Bug fortran/100961] Intrinsic function as value to a class(*) assumed rank argument fails

2021-06-08 Thread mscfd at gmx dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100961

--- Comment #3 from martin  ---
Created attachment 50968
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50968&action=edit
second test case which segfaults

Playing around with some variants of select_rank_expression2.f90, I can see
that I sometimes get correct results; sometimes the rank is correctly
recognised but not the type; and (as is the case for the attached
select_rank_expression2.f90) it can even segfault with an invalid memory
access. I get all these behaviours by selecting different sets of the four
"call p(..)" lines and varying the order in which they are executed.

[RFC/PATCH] ira: Consider matching constraints with param [PR100328]

2021-06-08 Thread Kewen.Lin via Gcc-patches
Hi,

PR100328 has the details of this issue; I'll try to summarize
it here.  In the hottest function LBM_performStreamCollideTRT
of SPEC2017 bmk 519.lbm_r, there are many FMA style expressions
(27 FMA, 19 FMS, 11 FNMA).  On rs6000, this kind of FMA style
insn has two flavors: FLOAT_REG and VSX_REG.  The VSX_REG reg
class has 64 registers, the first 32 of which make up the
whole of FLOAT_REG.  There are some differences between these two
flavors, taking "*fma<mode>4_fpr" as an example:

(define_insn "*fma<mode>4_fpr"
  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa,wa")
        (fma:SFDF
          (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,wa,wa")
          (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,wa,0")
          (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,wa")))]

// wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
// <Ff> (f/d) => A floating point register, aka. FLOAT_REG.

So for VSX_REG, we only have the destructive form: when a VSX_REG
alternative is used, operand 2 or operand 3 is required
to be the same as operand 0.  reload has to take care of this
constraint and create some non-free register copies if required.
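
As a rough illustration of the destructive form described above, here is
a minimal C++ sketch (a toy model, not GCC code).  The names FmaRegs and
vsx_fma_needs_copy are hypothetical, hard registers are modeled as plain
integers, and the commutativity of operands 1 and 2 (the "%" modifier)
is deliberately ignored:

```cpp
// Toy model only: hard registers are plain ints, not GCC rtx objects.
// FmaRegs and vsx_fma_needs_copy are hypothetical names for illustration.
struct FmaRegs {
  int op0;  // destination
  int op1;  // first multiplicand
  int op2;  // second multiplicand
  int op3;  // addend
};

// In the VSX alternatives ("wa,0" and "0,wa") either operand 2 or
// operand 3 must be tied to operand 0.  If neither of them already
// shares the destination register, an extra register copy is needed
// to satisfy the matching constraint.
constexpr bool vsx_fma_needs_copy(const FmaRegs& r) {
  return r.op0 != r.op2 && r.op0 != r.op3;
}
```

With op0 and op3 in the same register no copy is needed; with four
distinct registers one copy has to be inserted, which is the cost the
allocator would ideally avoid.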

Assuming one fma insn looks like:
  op0 = FMA (op1, op2, op3)

The best regclass for all of them is VSX_REG.  When op1, op2 and op3 are
all dead, IRA simply creates three shuffle copies for them (here the operand
order matters, since with the same freq, the one with the smaller number
takes preference), but IMO both op2 and op3 should take higher priority
in the copy queue due to the matching constraint.

I noticed that there is one function, ira_get_dup_out_num, which is meant
to create this kind of constraint copy, but the code below seems to
refuse to create one if there is an alternative which has a valid regclass
that is not likely to be spilled.

            default:
              {
                enum constraint_num cn = lookup_constraint (str);
                enum reg_class cl = reg_class_for_constraint (cn);
                if (cl != NO_REGS
                    && !targetm.class_likely_spilled_p (cl))
                  goto fail;

                ...

I cooked up the attached patch to make ira respect this kind of matching
constraint, guarded by one parameter.  As I stated in the PR, I am
not sure this is on the right track.  The RFC patch checks the
matching constraint in all alternatives: if there is one alternative
with a matching constraint that matches the current preferred regclass
(or best of allocno?), it records the output operand number and
further creates one constraint copy for it.  Normally this copy gets
priority over shuffle copies, so the matching constraint is more likely
to be satisfied and reload doesn't have to create extra copies
to meet the matching constraint or the desirable register class.

For FMA A,B,C,D, I think ideally the copies A/B, A/C, A/D can first stay
as shuffle copies.  Later, any of A,B,C,D may get assigned a
hardware register which is a VSX register (VSX_REG) but not a FP
register (FLOAT_REG), which means it has to pay costs once we can NOT
go with VSX alternatives, so at that time it's important to respect
the matching constraint; then we can increase the freq for the remaining
copies related to this (A/B, A/C, A/D).  This idea requires some side
tables to record some information and seems a bit complicated in the
current framework, so the proposed patch aggressively emphasizes the
matching constraint at the time of creating copies.

Any comments are highly appreciated!

BR,
Kewen
---
 gcc/config/rs6000/rs6000.c |  3 ++
 gcc/ira.c  | 69 ++
 gcc/params.opt |  4 +++
 3 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5ae40d6f4ce..eb9c4284f91 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4852,6 +4852,9 @@ rs6000_option_override_internal (bool global_init_p)
 ap = __builtin_next_arg (0).  */
   if (DEFAULT_ABI != ABI_V4)
targetm.expand_builtin_va_start = NULL;
+
+  SET_OPTION_IF_UNSET (&global_options, &global_options_set,
+  param_ira_consider_dup_in_all_alts, 1);
 }
 
   rs6000_override_options_after_change ();
diff --git a/gcc/ira.c b/gcc/ira.c
index b93588d8a9f..beebee7499b 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -1937,10 +1939,16 @@ ira_get_dup_out_num (int op_num, alternative_mask alts)
 return -1;
   str = recog_data.constraints[op_num];
   use_commut_op_p = false;
+
+  rtx op = recog_data.operand[op_num];
+  int op_no = reg_or_subregno (op);
+  enum reg_class op_pref_cl = reg_preferred_class (op_no);
+  machine_mode op_mode = GET_MODE (op);
+
   for (;;)
 {
-  rtx op = recog_data.operand[op_num];
-  
+  bool saw_reg_cstr = false;
+
   for (curr_alt = 0, ignore_p = !TEST_BIT (alts, curr_alt),
   original = -1;;)
{
@@ -1963,9 +1971,25 @@ ira_get_dup_out_num (int op_num, alternative_mask alts)
{
  enum constraint_num cn = 

[Bug fortran/100961] Intrinsic function as value to a class(*) assumed rank argument fails

2021-06-08 Thread mscfd at gmx dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100961

--- Comment #2 from martin  ---
It is releases/gcc-11.1.0:
Using built-in specs.
COLLECT_GCC=gfortran-11
COLLECT_LTO_WRAPPER=.../gcc/lib/gcc/x86_64-linux-gnu/11.1.0/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../gcc-repo/configure --program-suffix=-11
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
--with-arch=westmere --prefix=.../gcc --enable-languages=c,c++,fortran
--disable-multilib --disable-bootstrap --enable-checking=release
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.1.0 (GCC)

The code is compiled with "-g select_rank_expression.f90 -o
select_rank_expression.x".

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2021-06-08 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085

--- Comment #10 from luoxhu at gcc dot gnu.org ---
float128 to vector __int128 is fixed by:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f700e4b0ee3ef53b48975cf89be26b9177e3a3f3

Re: GCC Mission Statement

2021-06-08 Thread Valentino Giudice via Gcc
Thank you.

> Well there was an announcement; the changes in the mission statement reflect 
> the new reality introduced by that announcement:
>
> https://gcc.gnu.org/pipermail/gcc/2021-June/236182.html
>
> Siddhesh

I was aware of that announcement, but it doesn't mention the mission
statement at all.
It appears that the decision in question was, at the time, in contrast
with the mission statement (rather than guided by it).

If the Steering Committee updates the mission statement, it may appear
that the mission statement follows the decisions of the steering
committee (in place of the contrary). In that case, what would be the
purpose of a mission statement?

The mission statement was also updated beyond simply making it
consistent with the change: in "Supporting the goals of the GNU
project, as defined by the FSF" the reference to the FSF was removed.

Was there any announcement about the update of the mission statement itself?
On what basis does the Steering Committee change the mission statement?


Re: GCC Mission Statement

2021-06-08 Thread Siddhesh Poyarekar

On 6/9/21 10:13 AM, Valentino Giudice via Gcc wrote:

> Hi.
>
> The Mission Statement of the GCC project recently changed without any
> announcement.


Well there was an announcement; the changes in the mission statement 
reflect the new reality introduced by that announcement:


https://gcc.gnu.org/pipermail/gcc/2021-June/236182.html

Siddhesh


[Bug c++/100983] New: Deduction guide for member template class rejected at class scope

2021-06-08 Thread brycelelbach at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100983

Bug ID: 100983
   Summary: Deduction guide for member template class rejected at
class scope
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: brycelelbach at gmail dot com
  Target Milestone: ---

```
struct X {
  template 
  struct Y {
template 
Y(Ts...) {}
  };

  template 
  Y(Ts...) -> Y;
};
```

I'm fairly confident this is legal code, but GCC rejects it, stating that a
deduction guide is only allowed at namespace scope.

http://eel.is/c++draft/temp.deduct.guide#3.sentence-4 says:

"A deduction-guide shall inhabit the scope to which the corresponding class
template belongs and, for a member class template, have the same access."

... which suggests to me that it is allowed.

https://godbolt.org/z/cWa69scjW

GCC Mission Statement

2021-06-08 Thread Valentino Giudice via Gcc
Hi.

The Mission Statement of the GCC project recently changed without any
announcement.

I am not a contributor to GCC, merely a user.
However, I'd like to understand more, especially about the
transparency of the project.

The GCC Steering Committee is supposed to follow the mission statement
as a guide for its decision.

Who changes the mission statement, and for what reason?
How can a modification of the statement be guided by the mission statement?
How were users and contributors informed of this?

Thank you in advance for your response.
Best regards.

For reference:
- The GCC homepage states the SC is "guided by the mission statement":
https://gcc.gnu.org/
- The mission statement before the update:
https://web.archive.org/web/20210331192925/https://gcc.gnu.org/gccmission.html


[Bug libstdc++/100982] New: wrong constraint in std::optional::operator=

2021-06-08 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100982

Bug ID: 100982
   Summary: wrong constraint in std::optional::operator=
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

There is a typo in optional#L818:

  template<typename _Up>
    enable_if_t<__and_v<__not_<is_same<_Tp, _Up>>,
                        is_constructible<_Tp, const _Up&>,
                        is_assignable<_Tp&, _Up>,
                        __not_<__converts_from_optional<_Tp, _Up>>,
                        __not_<__assigns_from_optional<_Tp, _Up>>>,
                optional&>
    operator=(const optional<_Up>& __u)

It should be is_assignable<_Tp&, const _Up&>.

https://godbolt.org/z/x7Gb9a5v9

#include <optional>

struct U {};

struct T {
  explicit T(const U&);
  T& operator=(const U&);
  T& operator=(U&&) = delete;
};

int main() {
  std::optional<U> opt1;
  std::optional<T> opt2;
  opt2 = opt1;
}
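
To see why the distinction matters, the two traits can be compared
directly on the T and U from the test case.  This is only a minimal
sketch: the variable names rvalue_ok and const_lvalue_ok are made up for
illustration, and the types are repeated so the block compiles on its
own (C++17):

```cpp
#include <type_traits>

struct U {};

struct T {
  explicit T(const U&);
  T& operator=(const U&);
  T& operator=(U&&) = delete;
};

// is_assignable<T&, U> asks whether "t = <rvalue of type U>" is valid.
// Overload resolution picks the deleted operator=(U&&), so it is not.
inline constexpr bool rvalue_ok = std::is_assignable_v<T&, U>;

// is_assignable<T&, const U&> asks about assigning from a const lvalue,
// which is what operator=(const optional<U>&) actually does.
inline constexpr bool const_lvalue_ok = std::is_assignable_v<T&, const U&>;

static_assert(!rvalue_ok, "assignment from an rvalue U is deleted");
static_assert(const_lvalue_ok, "assignment from const U& is allowed");
```

Since the copy-assigning overload only ever assigns from a const lvalue
of the contained value, checking the rvalue form rejects a type that
would work fine.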

[Bug target/100981] New: ICE in info_for_reduction, at tree-vect-loop.c:4897

2021-06-08 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100981

Bug ID: 100981
   Summary: ICE in info_for_reduction, at tree-vect-loop.c:4897
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---
Target: aarch64-linux-gnu

gfortran-12.0.0-alpha20210606 snapshot
(g:fed94fc9e704b0de228499495b7ca4d4c79ef76b) ICEs when compiling the following
testcase w/ -march=armv8.3-a -O3 -ftree-parallelize-loops=2 -fno-signed-zeros
-fno-trapping-math:

complex function cdcdot(n, cx)
  implicit none

  integer :: n, i, kx
  complex :: cx(*)
  double precision :: dsdotr, dsdoti, dt1, dt3

  kx = 1
  do i = 1, n
 dt1 = real(cx(kx))
 dt3 = aimag(cx(kx))
 dsdotr = dsdotr + dt1 * 2 - dt3 * 2
 dsdoti = dsdoti + dt1 * 2 + dt3 * 2
 kx = kx + 1
  end do
  cdcdot = cmplx(real(dsdotr), real(dsdoti))
  return
end function cdcdot

% aarch64-linux-gnu-gfortran-12.0.0 -march=armv8.3-a -O3
-ftree-parallelize-loops=2 -fno-signed-zeros -fno-trapping-math -c xrvsc8ow.f90
during GIMPLE pass: vect
xrvsc8ow.f90:9:8:

9 |   do i = 1, n
  |^
internal compiler error: in info_for_reduction, at tree-vect-loop.c:4897
0x7c8b0d info_for_reduction(vec_info*, _stmt_vec_info*)
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-loop.c:4897
0x122d008 vectorizable_live_operation(vec_info*, _stmt_vec_info*,
gimple_stmt_iterator*, _slp_tree*, _slp_instance*, int, bool,
vec*)
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-loop.c:8547
0x11ed1d7 can_vectorize_live_stmts
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-stmts.c:10619
0x1216858 vect_transform_stmt(vec_info*, _stmt_vec_info*,
gimple_stmt_iterator*, _slp_tree*, _slp_instance*)
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-stmts.c:11003
0x124b296 vect_schedule_slp_node
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-slp.c:6302
0x12596cc vect_schedule_scc
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-slp.c:6516
0x125a71f vect_schedule_slp(vec_info*, vec<_slp_instance*, va_heap, vl_ptr>)
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-slp.c:6580
0x1236e7c vect_transform_loop(_loop_vec_info*, gimple*)
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-loop.c:9538
0x1265f0f try_vectorize_loop_1
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vectorizer.c:1104
0x1266ca0 vectorize_loops()
   
/var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vectorizer.c:1243

Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-08 Thread Xionghu Luo via Gcc-patches



On 2021/6/9 04:11, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Jun 04, 2021 at 09:40:58AM +0800, Xionghu Luo wrote:
>>>> Combine still fails to merge the two instructions:
>>>>
>>>> Trying 6 -> 7:
>>>>   6: r120:KF#0=r125:KF#0<-<0x40
>>>> REG_DEAD r125:KF
>>>>   7: [sfp:DI+r123:DI]=r120:KF#0<-<0x40
>>>> REG_DEAD r120:KF
>>>> Successfully matched this instruction:
>>>> (set (mem/c:V1TI (plus:DI (reg/f:DI 110 sfp)
>>>>   (reg:DI 123)) [1  S16 A128])
>>>>   (subreg:V1TI (reg:KF 125) 0))
>>>> rejecting combination of insns 6 and 7
>>>> original costs 4 + 4 = 8
>>>> replacement cost 12
>>>
>>> So what instructions were these?  Why did the store cost 4 but the new
>>> one costs 12?
> 
> The *vsx_le_perm_store_<mode> instruction has the *preferred*
> alternative with cost 12, while the other alternative has cost 8.  Why
> is that?  That looks like a bug.
> (set_attr "length" "12,8")

The 12 was introduced by Mike's commit c477a6674364 (r6-2577), and all 5
of the vsx_le_perm_store_<mode> patterns are set to 12, for modes
VSX_D/VSX_W/V8HI/V16QI/VSX_LE_128.  I guess it is split into two
rs6000_emit_le_vsx_permute calls before reload, but three
rs6000_emit_le_vsx_permute calls after reload, so the length is 12; so it
also does not seem reasonable to change it from 12 to 8?  And I am
not sure when alternative 1 will be chosen?

vsx.md:
;; The post-reload split requires that we re-permute the source
;; register in case it is still live.
(define_split
  [(set (match_operand:VSX_LE_128 0 "memory_operand")
        (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
  "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
   && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
  [(const_int 0)]
{
  rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
  rs6000_emit_le_vsx_permute (operands[0], operands[1], <MODE>mode);
  rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
  DONE;
})

> 
>>>> By hacking the vsx_le_perm_store_v1ti INSN_COST from 12 to 8,
>>>
>>> It should be the same cost as the other store!
>>
>> vsx_le_permute_v1ti's cost is defined to 4 in vsx.md:
> 
> Yes.  Why is alternative 0 of *vsx_le_perm_store_<mode> said to have a
> length of 3 insns?
> 
> 
> Segher
> 

-- 
Thanks,
Xionghu


[Bug gcov-profile/100980] New: [GCOV]The assignment statement in the “for” structure caused the wrong coverage

2021-06-08 Thread njuwy at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100980

Bug ID: 100980
   Summary: [GCOV]The assignment statement in the “for” structure
caused the wrong coverage
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: njuwy at smail dot nju.edu.cn
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure -enable-checking=release -enable-languages=c,c++
-disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (GCC) 

$ cat test.c
#include
extern void abort(void);
extern void exit(int);
int main(void) {
  struct foo{
 int i0;
};
  int b,c,d=1;
  for ((b = sizeof(struct foo {
  int i0;
  int i1;
}));
   d; d--)
c = sizeof(struct foo);
}

$ gcc -O0 --coverage test.c;./a.out;gcov test;cat test.c.gcov
File 'test.c'
Lines executed:100.00% of 5
Creating 'test.c.gcov'

-:0:Source:test.c
-:0:Graph:test.gcno
-:0:Data:test.gcda
-:0:Runs:1
-:1:#include
-:2:extern void abort(void);
-:3:extern void exit(int);
1:4:int main(void) {
-:5:  struct foo{
-:6: int i0;
-:7:};
1:8:  int b,c,d=1;
2:9:  for ((b = sizeof(struct foo {
-:   10:  int i0;
-:   11:  int i1;
-:   12:}));
1:   13:   d; d--)
1:   14:c = sizeof(struct foo);
-:   15:}


Line 9 is wrongly reported as executed 2 times.

Re: [PATCH v2] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-08 Thread Xionghu Luo via Gcc-patches



On 2021/6/9 05:07, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Jun 08, 2021 at 09:11:33AM +0800, Xionghu Luo wrote:
>> On P8LE, extra rot64+rot64 load or store instructions are generated
>> in float128 to vector __int128 conversion.
>>
>> This patch teaches pass swaps to also handle such patterns to remove
>> extra swap instructions.
> 
>> +/* Return 1 iff PAT is a rotate 64 bit expression; else return 0.  */
>> +
>> +static bool
>> +pattern_is_rotate64_p (rtx pat)
> 
> You already have a verb in the name, don't use _p please (and preferably
> just don't use it at all, "pattern_is_rotate64" is much better than
> "pattern_rotate64_p").
> 
>> +{
>> +  rtx rot = SET_SRC (pat);
> 
> So this is assuming PAT is a SINGLE_SET.  Please say that in the
> function comment.
> 
> /* Return 1 iff PAT (a SINGLE_SET) is a rotate 64 bit expression; else
> return 0.  */
> 
> You can do an assert for that as well, but I wouldn't bother.
> 
>> @@ -266,6 +280,9 @@ insn_is_load_p (rtx insn)
> 
> (I do realise you just copied existing naming, don't worry :-) )
> 
>> @@ -392,7 +411,8 @@ quad_aligned_load_p (swap_web_entry *insn_entry, 
>> rtx_insn *insn)
>>false.  */
>> rtx body = PATTERN (def_insn);
>> if (GET_CODE (body) != SET
>> -  || GET_CODE (SET_SRC (body)) != VEC_SELECT
>> +  || !(GET_CODE (SET_SRC (body)) == VEC_SELECT
>> +  || pattern_is_rotate64_p (body))
> 
> Broken indentation: the || should align with "pattern...".
> 
>> @@ -2223,9 +2246,9 @@ static void
>>   recombine_stvx_pattern (rtx_insn *insn, del_info *to_delete)
>>   {
>> rtx body = PATTERN (insn);
>> -  gcc_assert (GET_CODE (body) == SET
>> -  && MEM_P (SET_DEST (body))
>> -  && GET_CODE (SET_SRC (body)) == VEC_SELECT);
>> +  gcc_assert (GET_CODE (body) == SET && MEM_P (SET_DEST (body))
>> +  && (GET_CODE (SET_SRC (body)) == VEC_SELECT
>> +  || pattern_is_rotate64_p (body)));
> 
> Please start a new line for every "&&" here.  The way it was was more
> readable.
> 
> It often is nice to keep things on one line, if it fits on one line.
> If it does not, make a new line for every phrase.  This is more readable
> because you can then just scan down the line of "&&" and see the start
> of every phrase without actually having to read it all.
> 
>> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-call.c 
>> b/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> index 5895416e985..a1f09df8a57 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> @@ -21,5 +21,5 @@
>>   TYPE one (void) { return ONE; }
>>   void store (TYPE a, TYPE *p) { *p = a; }
>>   
>> -/* { dg-final { scan-assembler "lxvd2x 34"  } } */
>> -/* { dg-final { scan-assembler "stxvd2x 34" } } */
>> +/* { dg-final { scan-assembler "lvx 2"  } } */
>> +/* { dg-final { scan-assembler "stvx 2" } } */
> 
> Huh.  Is that correct?  Where did the other 32 loads and stores go?  Are
> there now other insns generated that you should scan for?

This is an expected change: lxvd2x+xxpermdi is replaced by lvx, so there
is no need to scan for other instructions.  Similarly for stvx.  34 and 2
are *vector register names*, not counts.

diff float128-call.trunk.s float128-call.patched.s
18,19c18
<   lxvd2x 34,0,9
<   xxpermdi 34,34,34,2
---
>   lvx 2,0,9
33,34c32
<   xxpermdi 34,34,34,2
<   stxvd2x 34,0,5
---
>   stvx 2,0,5
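
The relationship between the two register numbers can be sketched as
follows.  This is a small illustration rather than GCC code; the
function name vsr_to_altivec is hypothetical.  It relies on VSX
registers vs32..vs63 being the same physical registers as Altivec
v0..v31:

```cpp
#include <optional>

// Maps a VSX register number (vs0..vs63) to the Altivec register that
// shares its storage, if any.  vs0..vs31 overlap the floating point
// registers instead, so they have no Altivec name.
constexpr std::optional<int> vsr_to_altivec(int vsr) {
  if (vsr < 32 || vsr > 63)
    return std::nullopt;
  return vsr - 32;
}
```

So the "34" operand of lxvd2x/stxvd2x and the "2" operand of lvx/stvx
name the same register, vs34 and v2 respectively.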

Thanks for all the other comments, updated and committed with r12-1316.


BR,
Xionghu


PING^3 [PATCH/RFC] combine: Tweak the condition of last_set invalidation

2021-06-08 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html

BR,
Kewen

on 2021/5/26 11:04 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> Gentle ping this:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html
> 
> BR,
> Kewen
> 
> on 2021/5/7 10:45 AM, Kewen.Lin via Gcc-patches wrote:
>> Hi Segher,
>>

 I think this should be postponed to stage 1 though?  Or is there
 anything very urgent in it?

>>>
>>> Yeah, I agree that this belongs to stage1, and there isn't anything
>>> urgent about it.  Thanks for all further comments above!
>>>
>>
>> Gentle ping this:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html
>>
>> BR,
>> Kewen
>>


PING^2 [PATCH] rs6000: Support more short/char to float conversion

2021-06-08 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping this:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569792.html

BR,
Kewen

on 2021/5/26 11:02 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> Gentle ping this:
> 
>   https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569792.html
> 
> 
> BR,
> Kewen
> 
> on 2021/5/7 10:30 AM, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> In some cases, when we load unsigned char/short values from
>> the appropriate unsigned char/short memories and convert them to
>> double/single precision floating point values, there are
>> implicit conversions to int first.  This keeps GCC from leveraging
>> the P9 instructions lxsibzx/lxsihzx.  This patch adds the related
>> define_insn_and_split to support this kind of scenario.
>>
>> Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
>> powerpc64-linux-gnu P8.
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> --
>> gcc/ChangeLog:
>>
>>  * config/rs6000/rs6000.md
>>  (floatsi2_lfiwax__mem_zext): New
>>  define_insn_and_split.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/powerpc/p9-fpcvt-3.c: New test.
>>
> 


PING^1 [PATCH v2] rs6000: Add load density heuristic

2021-06-08 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571258.html

BR,
Kewen

on 2021/5/26 10:59 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> This is the updated version of patch to deal with the bwaves_r
> degradation due to vector construction fed by strided loads.
> 
> Following Richi's comments [1], this takes a similar approach and
> overprices the vector construction fed by VMAT_ELEMENTWISE or
> VMAT_STRIDED_SLP.  Instead of adding the extra cost on vector
> construction costing immediately, it first records how many
> loads and vectorized statements there are in the given loop.  Later,
> in rs6000_density_test (called by finish_cost), it computes the
> load density ratio against all vectorized stmts, checks it
> against the corresponding thresholds DENSITY_LOAD_NUM_THRESHOLD
> and DENSITY_LOAD_PCT_THRESHOLD, and does the actual extra pricing
> if both thresholds are exceeded.
> 
> Note that this new load density heuristic check is based on
> some fields in target cost which are updated as needed when
> scanning each add_stmt_cost entry, it's independent of the
> current function rs6000_density_test which requires to scan
> non_vect stmts.  Since it's checking the load stmts count
> vs. all vectorized stmts, it's kind of density, so I put
> it in function rs6000_density_test.  With the same reason to
> keep it independent, I didn't put it as an else arm of the
> current existing density threshold check hunk or before this
> hunk.
> 
> In the investigation of -1.04% degradation from 526.blender_r
> on Power8, I noticed that the extra penalized cost 320 on one
> single vector construction with type V16QI is much exaggerated,
> which makes the final body cost unreliable, so this patch adds
> one maximum bound for the extra penalized cost for each vector
> construction statement.
> 
> Bootstrapped/regtested on powerpc64le-linux-gnu P9.
> 
> Full SPEC2017 performance evaluation on Power8/Power9 with
> option combinations:
>   * -O2 -ftree-vectorize {,-fvect-cost-model=very-cheap} {,-ffast-math}
>   * {-O3, -Ofast} {,-funroll-loops}
> 
> bwaves_r degradations on P8/P9 have been fixed, nothing else
> remarkable was observed.
> 
> Is it ok for trunk?
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570076.html
> 
> BR,
> Kewen
> -
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.c (struct rs6000_cost_data): New members
>   nstmts, nloads and extra_ctor_cost.
>   (rs6000_density_test): Add load density related heuristics and the
>   checks, do extra costing on vector construction statements if needed.
>   (rs6000_init_cost): Init new members.
>   (rs6000_update_target_cost_per_stmt): New function.
>   (rs6000_add_stmt_cost): Factor vect_nonmem hunk out to function
>   rs6000_update_target_cost_per_stmt and call it.
> 



[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-06-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Kewen Lin  ---
Fixed on trunk, will continue to refactor the tree_predictive_commoning_loop
and its callees into class and member functions as suggested.

[Bug tree-optimization/100925] [12 Regression] tree check fail in make_range_step with -O1 in reassoc

2021-06-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100925

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572317.html
           Keywords|                            |patch

--- Comment #7 from Andrew Pinski  ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572317.html

[PATCH 2/2] Disallow pointer and offset types on some gimple

2021-06-08 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

While debugging PR 100925, I found that the gimple verifiers
don't reject NEGATE on pointer or offset types.
This patch adds the check to some unary and binary gimple codes which
should not operate on pointer/offset types.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

gcc/ChangeLog:

* tree-cfg.c (verify_gimple_assign_unary): Reject pointer and offset
types on NEGATE_EXPR, ABS_EXPR, BIT_NOT_EXPR, PAREN_EXPR and CONJ_EXPR.
(verify_gimple_assign_binary): Reject pointer and offset types on
MULT_EXPR, MULT_HIGHPART_EXPR, TRUNC_DIV_EXPR, CEIL_DIV_EXPR,
FLOOR_DIV_EXPR, ROUND_DIV_EXPR, TRUNC_MOD_EXPR, CEIL_MOD_EXPR,
FLOOR_MOD_EXPR, ROUND_MOD_EXPR, RDIV_EXPR, and EXACT_DIV_EXPR.
---
 gcc/tree-cfg.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 02256580c98..90fe4775405 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,15 @@ verify_gimple_assign_unary (gassign *stmt)
 case BIT_NOT_EXPR:
 case PAREN_EXPR:
 case CONJ_EXPR:
+  /* Disallow pointer and offset types for many of the unary gimple. */
+  if (POINTER_TYPE_P (lhs_type)
+ || TREE_CODE (lhs_type) == OFFSET_TYPE)
+   {
+ error ("invalid types for %qs", code_name);
+ debug_generic_expr (lhs_type);
+ debug_generic_expr (rhs1_type);
+ return true;
+   }
   break;
 
 case ABSU_EXPR:
@@ -4127,6 +4136,19 @@ verify_gimple_assign_binary (gassign *stmt)
 case ROUND_MOD_EXPR:
 case RDIV_EXPR:
 case EXACT_DIV_EXPR:
+  /* Disallow pointer and offset types for many of the binary gimple. */
+  if (POINTER_TYPE_P (lhs_type)
+ || TREE_CODE (lhs_type) == OFFSET_TYPE)
+   {
+ error ("invalid types for %qs", code_name);
+ debug_generic_expr (lhs_type);
+ debug_generic_expr (rhs1_type);
+ debug_generic_expr (rhs2_type);
+ return true;
+   }
+  /* Continue with generic binary expression handling.  */
+  break;
+
 case MIN_EXPR:
 case MAX_EXPR:
 case BIT_IOR_EXPR:
-- 
2.27.0



[PATCH 1/2] Fix PR 100925: Limit some a?CST1:CST2 optimizations to integral types only

2021-06-08 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

The problem here is that, for offset (and pointer) types, we produce
a negate expression when this optimization hits.
It is easier to disable this optimization for all non-integral types
than to find an integer type with the same precision as the
type and do the negate expression on it.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/100925
* match.pd (a ? CST1 : CST2): Limit transformations
that would produce a negative to integral types only.
Change !POINTER_TYPE_P to INTEGRAL_TYPE_P also.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr100925.C: New test.
---
 gcc/match.pd|  8 
 gcc/testsuite/g++.dg/torture/pr100925.C | 24 
 2 files changed, 28 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr100925.C

diff --git a/gcc/match.pd b/gcc/match.pd
index d06ff170684..bf22bc3a198 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3733,10 +3733,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (integer_onep (@1))
  (convert (convert:boolean_type_node @0)))
 /* a ? -1 : 0 -> -a. */
-(if (integer_all_onesp (@1))
+(if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
  (negate (convert (convert:boolean_type_node @0
 /* a ? powerof2cst : 0 -> a << (log2(powerof2cst)) */
-(if (!POINTER_TYPE_P (type) && integer_pow2p (@1))
+(if (INTEGRAL_TYPE_P (type) && integer_pow2p (@1))
  (with {
tree shift = build_int_cst (integer_type_node, tree_log2 (@1));
   }
@@ -3750,10 +3750,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (integer_onep (@2))
   (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))
  /* a ? -1 : 0 -> -(!a). */
- (if (integer_all_onesp (@2))
+ (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
   (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } 

  /* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
- (if (!POINTER_TYPE_P (type) && integer_pow2p (@2))
+ (if (INTEGRAL_TYPE_P (type) &&  integer_pow2p (@2))
   (with {
tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
}
diff --git a/gcc/testsuite/g++.dg/torture/pr100925.C 
b/gcc/testsuite/g++.dg/torture/pr100925.C
new file mode 100644
index 000..de13950dca0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr100925.C
@@ -0,0 +1,24 @@
+// { dg-do compile }
+
+struct QScopedPointerDeleter {
+  static void cleanup(int *);
+};
+class QScopedPointer {
+  typedef int *QScopedPointer::*RestrictedBool;
+
+public:
+  operator RestrictedBool() { return d ? nullptr : &QScopedPointer::d; }
+  void reset() {
+if (d)
+  QScopedPointerDeleter::cleanup(d);
+  }
+  int *d;
+};
+class DOpenGLPaintDevicePrivate {
+public:
+  QScopedPointer fbo;
+} DOpenGLPaintDeviceresize_d;
+void DOpenGLPaintDeviceresize() {
+  if (DOpenGLPaintDeviceresize_d.fbo)
+DOpenGLPaintDeviceresize_d.fbo.reset();
+}
-- 
2.27.0



[PATCH] Improvements to fur_source interface class, enhanced stmt folding options.

2021-06-08 Thread Andrew MacLeod via Gcc-patches
I recently introduced the fur_source class as an intermediary between
the Fold_Using_Ranges (FUR) class and wherever it picks up the ssa_names
it needs.  The initial idea was to abstract out a set of
frequently changing parameters so the client fold routines wouldn't have
to change every time we added a new way to do something with a
statement.  It's also used by gori_compute when unwinding to allow
access to non-range-ops stmts when processing.


That said, I hadn't really formalized it, so fold_using_ranges was
accessing its members frequently.  We have encountered an opportunity to
add something else which is useful, but where the internals should be
hidden.


This patch a) formalizes the API (hiding the internals), b) virtualizes
the functions so we can use inheritance instead of conditions, and c)
adds the ability to pick up operands from a vector or list of ranges.


There is no real visual difference to consumers since it's an interface
layer they don't normally see.  The net effect is that there are now multiple
versions of fold_stmt that all behave quite nicely:


bool fold_range (irange &r, gimple *s, range_query *q = NULL);
bool fold_range (irange &r, gimple *s, edge on_edge, range_query *q = NULL);
bool fold_range (irange &r, gimple *s, irange &r1);
bool fold_range (irange &r, gimple *s, irange &r1, irange &r2);
bool fold_range (irange &r, gimple *s, unsigned num_elements, irange *vector);


Now we can calculate ranges for a stmt, ask for its range to be
calculated as if it were on an edge, or we can supply one or more ranges
to it to have the fold performed.  This latter set is akin to the old
gimple_range_fold() routine we had, except it only worked on a range-ops
stmt, whereas this will work on any kind of stmt, including a PHI node.


The routines have all been enhanced so that if a range_query is not 
provided, it will invoke the default range_query.   It will also invoke 
the default query if a list of ranges is supplied and it requires 
additional ranges to resolve the stmt being queried.
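
As a rough illustration of the design described above (virtual operand sources instead of conditionals, plus a list source that falls back to a default query), here is a toy sketch. Every name in it (Range, OperandSource, DefaultSource, ListSource, fold_plus) is hypothetical and only stands in for the real irange / fur_source machinery; it is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Range { int lo; int hi; };   // toy stand-in for irange

// Base interface: fold code asks for operand ranges without knowing
// where they come from.
class OperandSource {
public:
  virtual ~OperandSource () = default;
  virtual Range get_operand (int name) = 0;
};

// Toy "default query": knows nothing, returns one wide range for any name.
class DefaultSource : public OperandSource {
public:
  Range get_operand (int) override { return Range{-1000, 1000}; }
};

// List source: hands out caller-supplied ranges in order, then defers to
// the fallback source once the list is exhausted.
class ListSource : public OperandSource {
public:
  ListSource (std::vector<Range> ranges, OperandSource &fallback)
    : m_ranges (std::move (ranges)), m_fallback (fallback) {}

  Range get_operand (int name) override
  {
    if (m_next < m_ranges.size ())
      return m_ranges[m_next++];
    return m_fallback.get_operand (name);
  }

private:
  std::vector<Range> m_ranges;
  std::size_t m_next = 0;
  OperandSource &m_fallback;
};

// Toy fold of "x + y": no branching on the kind of source in use.
Range fold_plus (OperandSource &src, int x, int y)
{
  Range a = src.get_operand (x);
  Range b = src.get_operand (y);
  return Range{a.lo + b.lo, a.hi + b.hi};
}
```

With a ListSource holding a single range, the first operand comes from the list and the second from the fallback, which mirrors the "invoke the default query if it requires additional ranges" behaviour in spirit only.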


There will probably be some additional tweaks going forward, especially 
since the list routines haven't really been tested. Aldy will be using 
them shortly, so that will be the test bed :-)


Performance is basically a wash since there is a slight overhead for the
virtual function calls, but it is offset by no longer having to do any
conditional checks in the get_operand() routine.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew



commit 87f9ac937d6cfd81cbbe0a43518ba10781888d7c
Author: Andrew MacLeod 
Date:   Tue Jun 8 15:43:03 2021 -0400

Virtualize fur_source and turn it into a proper API.

No more accessing the local info.  Also add fur_source/fold_stmt where ranges
are provided via being specified, or a vector to replace gimple_fold_range.

* gimple-range-gori.cc (gori_compute::outgoing_edge_range_p): Use a
fur_stmt source record.
* gimple-range.cc (fur_source::get_operand): Generic range query.
(fur_source::get_phi_operand): New.
(fur_source::register_dependency): New.
(fur_source::query): New.
(class fur_edge): New.  Edge source for operands.
(fur_edge::fur_edge): New.
(fur_edge::get_operand): New.
(fur_edge::get_phi_operand): New.
(fur_edge::query): New.
(fur_stmt::fur_stmt): New.
(fur_stmt::get_operand): New.
(fur_stmt::get_phi_operand): New.
(fur_stmt::query): New.
(class fur_depend): New.  Statement source and process dependencies.
(fur_depend::fur_depend): New.
(fur_depend::register_dependency): New.
(class fur_list): New.  List source for operands.
(fur_list::fur_list): New.
(fur_list::get_operand): New.
(fur_list::get_phi_operand): New.
(fold_range): New.  Instantiate appropriate fur_source class and fold.
(fold_using_range::range_of_range_op): Use new API.
(fold_using_range::range_of_address): Ditto.
(fold_using_range::range_of_phi): Ditto.
(gimple_ranger::fold_range_internal): Use fur_depend class.
(fold_using_range::range_of_ssa_name_with_loop_info): Use new API.
* gimple-range.h (class fur_source): Now a base class.
(class fur_stmt): New.
(fold_range): New prototypes.
(fur_source::fur_source): Delete.

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 2c5360690db..09dcd694319 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -1008,7 +1008,7 @@ gori_compute::outgoing_edge_range_p (irange &r, edge e, tree name,
   if (!stmt)
 return false;
 
-  fur_source src (&q, NULL, e, stmt);
+  fur_stmt src (stmt, &q);
 
   // If NAME can be calculated on the edge, use that.
   if (is_export_p (name, e->src))
diff --git 

[Bug c++/100796] [11 Regression] GCC does not honor #pragma diagnostic ignored when using the integrated preprocessor

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100796

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org

--- Comment #3 from Jason Merrill  ---
(In reply to Giuseppe D'Angelo from comment #0)
> Please advise as of what kind of test I could do
> / provide to help you track this one down. 

The testcase doesn't really need to be reduced, just separated from the Qt
build system: some source files and a compiler command line would be fine.

[PATCH] Use range based loops to iterate over vec<> in various places

2021-06-08 Thread Trevor Saunders
Hello,

This makes things a good bit shorter, and reduces complexity by removing
a bunch of index variables.
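
For readers unfamiliar with the pattern, the kind of change involved looks roughly like the sketch below. It uses std::vector as a stand-in for GCC's vec<>, and the function names are invented for illustration.

```cpp
#include <cassert>
#include <vector>

// Before: explicit index variable and bounds check.
int sum_indexed (const std::vector<int> &v)
{
  int total = 0;
  for (unsigned i = 0; i < v.size (); ++i)
    total += v[i];
  return total;
}

// After: range-based for, no index variable to declare or keep in sync.
int sum_ranged (const std::vector<int> &v)
{
  int total = 0;
  for (int elt : v)
    total += elt;
  return total;
}
```

Both functions compute the same result; the second simply has less bookkeeping.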

bootstrapped and regtested on x86_64-linux-gnu, ok?

Trev

gcc/analyzer/ChangeLog:

* call-string.cc (call_string::call_string): Iterate over vec<>
with range based for.
(call_string::operator=): Likewise.
(call_string::to_json): Likewise.
(call_string::hash): Likewise.
(call_string::calc_recursion_depth): Likewise.
* checker-path.cc (checker_path::fixup_locations): Likewise.
* constraint-manager.cc (equiv_class::equiv_class): Likewise.
(equiv_class::to_json): Likewise.
(equiv_class::hash): Likewise.
(constraint_manager::constraint_manager): Likewise.
(constraint_manager::operator=): Likewise.
(constraint_manager::hash): Likewise.
(constraint_manager::to_json): Likewise.
(constraint_manager::add_unknown_constraint): Likewise.
* engine.cc (impl_region_model_context::on_svalue_leak):
Likewise.
(on_liveness_change): Likewise.
(impl_region_model_context::on_unknown_change): Likewise.
* program-state.cc (extrinsic_state::to_json): Likewise.
(sm_state_map::set_state): Likewise.
* region-model.cc (make_test_compound_type): Likewise.
(test_canonicalization_4): Likewise.

gcc/ChangeLog:

* auto-profile.c (afdo_find_equiv_class): Iterate over vec<>
with range based for.
* cgraphclones.c (cgraph_node::create_clone): Likewise.
(cgraph_node::create_version_clone): Likewise.
* dwarf2out.c (output_call_frame_info): Likewise.
* gcc.c (do_specs_vec): Likewise.
(do_spec_1): Likewise.
(driver::set_up_specs): Likewise.
* gimple-loop-jam.c (any_access_function_variant_p): Likewise.
* ifcvt.c (cond_move_process_if_block): Likewise.
* ipa-modref.c (modref_lattice::add_escape_point): Likewise.
(analyze_parms): Likewise.
(modref_write_escape_summary): Likewise.
(update_escape_summary_1): Likewise.
* ipa-prop.h (ipa_copy_agg_values): Likewise.
(ipa_release_agg_values): Likewise.
* lower-subreg.c (decompose_multiword_subregs): Likewise.
* lto-streamer-out.c (DFS::DFS_write_tree_body): Likewise.
(hash_tree): Likewise.
(prune_offload_funcs): Likewise.
* sel-sched-dump.c (dump_insn_vector): Likewise.
* timevar.c (timer::named_items::print): Likewise.
* tree-cfgcleanup.c (cleanup_control_flow_pre): Likewise.
(cleanup_tree_cfg_noloop): Likewise.
* tree-data-ref.c (dump_data_references): Likewise.
(print_dir_vectors): Likewise.
(print_dist_vectors): Likewise.
(dump_data_dependence_relation): Likewise.
(dump_data_dependence_relations): Likewise.
(dump_dist_dir_vectors): Likewise.
(dump_ddrs): Likewise.
(prune_runtime_alias_test_list): Likewise.
(create_runtime_alias_checks): Likewise.
(free_subscripts): Likewise.
(save_dist_v): Likewise.
(save_dir_v): Likewise.
(invariant_access_functions): Likewise.
(same_access_functions): Likewise.
(access_functions_are_affine_or_constant_p): Likewise.
(compute_all_dependences): Likewise.
(find_data_references_in_stmt): Likewise.
(graphite_find_data_references_in_stmt): Likewise.
(free_dependence_relations): Likewise.
(free_data_refs): Likewise.
* tree-into-ssa.c (dump_currdefs): Likewise.
(rewrite_update_phi_arguments): Likewise.
* tree-ssa-phiopt.c (cond_if_else_store_replacement): Likewise.
* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
* tree-ssa-structalias.c (constraint_set_union): Likewise.
(merge_node_constraints): Likewise.
(move_complex_constraints): Likewise.
(do_deref): Likewise.
(get_constraint_for_address_of): Likewise.
(get_constraint_for_1): Likewise.
(process_all_all_constraints): Likewise.
(make_constraints_to): Likewise.
(handle_rhs_call): Likewise.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
Likewise.
(vect_slp_analyze_node_dependences): Likewise.
(vect_slp_analyze_instance_dependence): Likewise.
(vect_record_base_alignments): Likewise.
(vect_get_peeling_costs_all_drs): Likewise.
(vect_peeling_supportable): Likewise.
* tree-vectorizer.c (vec_info::~vec_info): Likewise.
(vec_info::free_stmt_vec_infos): Likewise.

gcc/c/ChangeLog:

* c-parser.c (c_parser_translation_unit): Iterate over vec<>
with range based for.
(c_parser_postfix_expression): Likewise.

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_call_expression): Iterate over vec<>
with range based for.
(cxx_eval_store_expression): Likewise.
 

[Bug c++/100838] [11 Regression] -fno-elide-constructors for C++14 missing required destructor call since r11-5872-g4eb28483004f8291

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100838

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12 Regression]          |[11 Regression]
                   |-fno-elide-constructors for |-fno-elide-constructors for
                   |C++14 missing required      |C++14 missing required
                   |destructor call since       |destructor call since
                   |r11-5872-g4eb28483004f8291  |r11-5872-g4eb28483004f8291

--- Comment #4 from Jason Merrill  ---
Fixed for 12 so far.

[Bug c++/89062] class template argument deduction failure with parentheses

2021-06-08 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89062

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |brycelelbach at gmail dot com

--- Comment #7 from Marek Polacek  ---
*** Bug 100979 has been marked as a duplicate of this bug. ***

[Bug c++/100979] Nested CTAD fails when the outer object is direct initialized and the inner object is list initialized

2021-06-08 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100979

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |mpolacek at gcc dot gnu.org

--- Comment #1 from Marek Polacek  ---
I think it's a dup.

*** This bug has been marked as a duplicate of bug 89062 ***

[Bug c++/100879] [10/11/12 Regression] gcc is complaining of a signed compare when comparing enums of different types (same underlying type)

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100879

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |12.0
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Jason Merrill  ---
Fixed for GCC 12, thanks.

That the warning used -Wsign-compare seems to be because it was associated with
that option before -Wenum-compare was added, and never updated perhaps because
it was dead code for a long time.

[Bug c++/100879] [10/11/12 Regression] gcc is complaining of a signed compare when comparing enums of different types (same underlying type)

2021-06-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100879

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:087253b9951766cbd93286b804ebb1ab59197aa8

commit r12-1314-g087253b9951766cbd93286b804ebb1ab59197aa8
Author: Jason Merrill 
Date:   Tue Jun 8 17:48:49 2021 -0400

c++: remove redundant warning [PR100879]

Before my r277864, build_new_op promoted enums to int before passing them on
to cp_build_binary_op; after that commit, it doesn't, so
warn_for_sign_compare sees the enum operands and gives a redundant warning.
This warning dates back to 1995, and seems to have been dead code for a long
time--likely since build_new_op was added in 1997--so let's just remove it.

PR c++/100879

gcc/c-family/ChangeLog:

* c-warn.c (warn_for_sign_compare): Remove C++ enum mismatch
warning.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/enum3.C: New test.

[pushed] c++: remove redundant warning [PR100879]

2021-06-08 Thread Jason Merrill via Gcc-patches
Before my r277864, build_new_op promoted enums to int before passing them on
to cp_build_binary_op; after that commit, it doesn't, so
warn_for_sign_compare sees the enum operands and gives a redundant warning.
This warning dates back to 1995, and seems to have been dead code for a long
time--likely since build_new_op was added in 1997--so let's just remove it.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/100879

gcc/c-family/ChangeLog:

* c-warn.c (warn_for_sign_compare): Remove C++ enum mismatch
warning.

gcc/testsuite/ChangeLog:

* g++.dg/diagnostic/enum3.C: New test.
---
 gcc/c-family/c-warn.c   | 12 
 gcc/testsuite/g++.dg/diagnostic/enum3.C |  9 +
 2 files changed, 9 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/enum3.C

diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c
index a587b993fde..cd3c99ef4df 100644
--- a/gcc/c-family/c-warn.c
+++ b/gcc/c-family/c-warn.c
@@ -2240,18 +2240,6 @@ warn_for_sign_compare (location_t location,
   int op1_signed = !TYPE_UNSIGNED (TREE_TYPE (orig_op1));
   int unsignedp0, unsignedp1;
 
-  /* In C++, check for comparison of different enum types.  */
-  if (c_dialect_cxx()
-  && TREE_CODE (TREE_TYPE (orig_op0)) == ENUMERAL_TYPE
-  && TREE_CODE (TREE_TYPE (orig_op1)) == ENUMERAL_TYPE
-  && TYPE_MAIN_VARIANT (TREE_TYPE (orig_op0))
-!= TYPE_MAIN_VARIANT (TREE_TYPE (orig_op1)))
-{
-  warning_at (location,
- OPT_Wsign_compare, "comparison between types %qT and %qT",
- TREE_TYPE (orig_op0), TREE_TYPE (orig_op1));
-}
-
   /* Do not warn if the comparison is being done in a signed type,
  since the signed type will only be chosen if it can represent
  all the values of the unsigned type.  */
diff --git a/gcc/testsuite/g++.dg/diagnostic/enum3.C 
b/gcc/testsuite/g++.dg/diagnostic/enum3.C
new file mode 100644
index 000..d51aa8a0f70
--- /dev/null
+++ b/gcc/testsuite/g++.dg/diagnostic/enum3.C
@@ -0,0 +1,9 @@
+// PR c++/100879
+// { dg-additional-options -Werror=sign-compare }
+
+enum e1 { e1val };
+enum e2 { e3val };
+
+int main( int, char * [] ) {
+   if ( e1val == e3val ) return 1; // { dg-warning -Wenum-compare }
+}

base-commit: 61fc01806f376a780978a6dea165ec3dadef085b
-- 
2.27.0



[PATCH] c++: Failure to delay noexcept parsing with ptr-operator [PR100752]

2021-06-08 Thread Marek Polacek via Gcc-patches
We weren't passing 'flags' to the recursive call to cp_parser_declarator
in the ptr-operator case and, as a result, delayed parsing of noexcept
didn't work as advertised.  The following change passes more than just
CP_PARSER_FLAGS_DELAY_NOEXCEPT but that doesn't seem to break anything.

I'm not passing member_p because I don't need it and because it breaks
a few tests.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/branches?

PR c++/100752

gcc/cp/ChangeLog:

* parser.c (cp_parser_declarator): Pass flags down to
cp_parser_declarator.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept69.C: New test.
---
 gcc/cp/parser.c |  3 +--
 gcc/testsuite/g++.dg/cpp0x/noexcept69.C | 12 
 2 files changed, 13 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept69.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index d59a829d0b9..5930990ec1c 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -22066,8 +22066,7 @@ cp_parser_declarator (cp_parser* parser,
cp_parser_parse_tentatively (parser);
 
   /* Parse the dependent declarator.  */
-  declarator = cp_parser_declarator (parser, dcl_kind,
-CP_PARSER_FLAGS_NONE,
+  declarator = cp_parser_declarator (parser, dcl_kind, flags,
 /*ctor_dtor_or_conv_p=*/NULL,
 /*parenthesized_p=*/NULL,
 /*member_p=*/false,
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept69.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept69.C
new file mode 100644
index 000..9b87ba0cafb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept69.C
@@ -0,0 +1,12 @@
+// PR c++/100752
+// { dg-do compile { target c++11 } }
+
+struct S {
+  void f() noexcept {}
+  S &g() noexcept(noexcept(f())) { f(); return *this; }
+};
+
+struct X {
+  int& f() noexcept(noexcept(i));
+  int i;
+};

base-commit: c4574d23cb07340918793a5a98ae7bb2988b3791
-- 
2.31.1



[PATCH 3/3] Add IEEE 128-bit fp conditional move on PowerPC.

2021-06-08 Thread Michael Meissner via Gcc-patches

This patch adds the support for power10 IEEE 128-bit floating point conditional
move and for automatically generating min/max.

In this patch, I simplified things compared to previous patches.  Instead of
allowing any of the four modes to be used for the conditional move comparison
while the move itself could use a different mode, I restricted the conditional
move to a single mode for both.  I.e. you can do:

_Float128 a, b, c, d, e, r;

r = (a == b) ? c : d;

But you can't do:

_Float128 c, d, r;
double a, b;

r = (a == b) ? c : d;

or:

_Float128 a, b;
double c, d, r;

r = (a == b) ? c : d;

This eliminates a lot of the complexity of the code, because you don't have to
worry about the sizes being different, and the IEEE 128-bit types being
restricted to Altivec registers, while the SF/DF modes can use any VSX
register.

I did not modify the existing support that allowed conditional moves where
SFmode operands are compared and DFmode operands are moved (and vice versa).
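
The pairing rule described above can be summarised by a small predicate. The sketch below is a toy model only: the Mode enum and cmove_modes_ok are hypothetical names, and it deliberately ignores the separate question of whether the hardware supports the mode at all.

```cpp
#include <cassert>

// Hypothetical stand-ins for the machine modes named in this message.
enum class Mode { SF, DF, KF, TF };

// True when a conditional move that compares in COMPARE_MODE and moves in
// RESULT_MODE is allowed: same mode always, SF/DF may be mixed, and the
// IEEE 128-bit modes may not be mixed with anything else.
constexpr bool cmove_modes_ok (Mode compare_mode, Mode result_mode)
{
  return compare_mode == result_mode
	 || (compare_mode == Mode::SF && result_mode == Mode::DF)
	 || (compare_mode == Mode::DF && result_mode == Mode::SF);
}

// Spot checks of the rule for the combinations discussed in the message.
void check_cmove_modes ()
{
  assert (cmove_modes_ok (Mode::KF, Mode::KF));    // _Float128 compare and move
  assert (cmove_modes_ok (Mode::SF, Mode::DF));    // existing SF/DF mixing
  assert (!cmove_modes_ok (Mode::DF, Mode::KF));   // double compare, _Float128 move
  assert (!cmove_modes_ok (Mode::KF, Mode::DF));   // _Float128 compare, double move
}
```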

Compared to the May 18th patches, this patch replaces the complicated test that
was complained about.

I tested it on 3 platforms:

*   Power9 little endian, --with-code=power9;
*   Power8 big endian, --with-code=power8, both 32/64-bit tests done;
*   Power10 little endian, --with-code=power10.

All systems bootstrapped and there were no new regressions.  I believe I have
addressed the issues with the last patch.

Can I check this into the master branch, and after a soak-in period, back port
it to the GCC 11 branch?

gcc/
2021-06-08 Michael Meissner  

* config/rs6000/rs6000.c (rs6000_maybe_emit_fp_cmove): Add IEEE
128-bit floating point conditional move support.
(have_compare_and_set_mask): Add IEEE 128-bit floating point
types.
* config/rs6000/rs6000.md (mov<mode>cc, IEEE128 iterator): New insn.
(mov<mode>cc_p10, IEEE128 iterator): New insn.
(mov<mode>cc_invert_p10, IEEE128 iterator): New insn.
(fpmask<mode>, IEEE128 iterator): New insn.
(xxsel<mode>, IEEE128 iterator): New insn.

gcc/testsuite/
2021-06-08  Michael Meissner  

* gcc.target/powerpc/float128-cmove.c: New test.
* gcc.target/powerpc/float128-minmax-3.c: New test.
---
 gcc/config/rs6000/rs6000.c|  38 ++-
 gcc/config/rs6000/rs6000.md   | 106 ++
 .../gcc.target/powerpc/float128-cmove.c   |  58 ++
 .../gcc.target/powerpc/float128-minmax-3.c|  15 +++
 4 files changed, 215 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 1651788df6a..411e7539019 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15698,8 +15698,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, 
rtx op_false,
   return 1;
 }
 
-/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
-   minimum with "C" semantics.
+/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
+   maximum or minimum with "C" semantics.
 
Unless you use -ffast-math, you can't use these instructions to replace
conditions that implicitly reverse the condition because the comparison
@@ -15775,6 +15775,7 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx 
true_cond, rtx false_cond)
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
   rtx op1 = XEXP (op, 1);
+  machine_mode compare_mode = GET_MODE (op0);
   machine_mode result_mode = GET_MODE (dest);
   rtx compare_rtx;
   rtx cmove_rtx;
@@ -15783,6 +15784,35 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx 
true_cond, rtx false_cond)
   if (!can_create_pseudo_p ())
 return 0;
 
+  /* We allow the comparison to be either SFmode/DFmode and the true/false
+ condition to be either SFmode/DFmode.  I.e. we allow:
+
+   float a, b;
+   double c, d, r;
+
+   r = (a == b) ? c : d;
+
+and:
+
+   double a, b;
+   float c, d, r;
+
+   r = (a == b) ? c : d;
+
+but we don't allow intermixing the IEEE 128-bit floating point types with
+the 32/64-bit scalar types.
+
+It gets too messy where SFmode/DFmode can use any register and 
TFmode/KFmode
+can only use Altivec registers.  In addtion, we would need to do a XXPERMDI
+if we compare SFmode/DFmode and move TFmode/KFmode.  */
+
+  if (compare_mode == result_mode
+  || (compare_mode == SFmode && result_mode == DFmode)
+  || (compare_mode == DFmode && result_mode == SFmode))
+;
+  else
+return false;
+
   switch (code)
 {
 case EQ:
@@ -15835,6 +15865,10 @@ have_compare_and_set_mask (machine_mode mode)
 case E_DFmode:
   return TARGET_P9_MINMAX;
 
+case E_KFmode:
+case E_TFmode:
+  return TARGET_POWER10 && 

[PATCH 2/3] Fix IEEE 128-bit min/max test.

2021-06-08 Thread Michael Meissner via Gcc-patches

This patch fixes the float128-minmax.c test so that it can accommodate the
generation of xsmincqp and xsmaxcqp instructions on power10.  I changed
the effective target from 'float128' to 'ppc_float128_hw', since this
needs the IEEE 128-bit float hardware support.

I tested it on 3 platforms:

*   Power9 little endian, --with-code=power9;
*   Power8 big endian, --with-code=power8, both 32/64-bit tests done;
*   Power10 little endian, --with-code=power10.

All systems bootstrapped and there were no new regressions.  I believe I have
addressed the issues with the last patch.

Can I check this into the master branch, and after a soak-in period, back port
it to the GCC 11 branch?

gcc/testsuite/
2021-06-08  Michael Meissner  

* gcc.target/powerpc/float128-minmax.c: Adjust expected code for
power10.
* lib/target-supports.exp (check_effective_target_has_arch_pwr10):
New target support.
---
 gcc/testsuite/gcc.target/powerpc/float128-minmax.c |  8 +---
 gcc/testsuite/lib/target-supports.exp  | 10 ++
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c 
b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
index fe397518f2f..a7d3a3a0b3e 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
@@ -1,6 +1,5 @@
-/* { dg-do compile { target lp64 } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-require-effective-target float128 } */
+/* { dg-require-effective-target ppc_float128_hw } */
 /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
 
 #ifndef TYPE
@@ -12,5 +11,8 @@
 TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
 TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
 
-/* { dg-final { scan-assembler {\mxscmpuqp\M} } } */
+/* Adjust code power10 which has native min/max instructions.  */
+/* { dg-final { scan-assembler {\mxscmpuqp\M} { target { ! has_arch_pwr10 
} } } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} { target {   has_arch_pwr10 
} } } } */
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} { target {   has_arch_pwr10 
} } } } */
 /* { dg-final { scan-assembler-not {\mbl\M}   } } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 7f78c5593ac..789723fb287 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6127,6 +6127,16 @@ proc check_effective_target_has_arch_pwr9 { } {
}]
 }
 
+proc check_effective_target_has_arch_pwr10 { } {
+   return [check_no_compiler_messages arch_pwr10 assembly {
+   #ifndef _ARCH_PWR10
+   #error does not have power10 support.
+   #else
+   /* "has power10 support" */
+   #endif
+   }]
+}
+
 # Return 1 if this is a PowerPC target supporting -mcpu=power10.
 # Limit this to 64-bit linux systems for now until other targets support
 # power10.
-- 
2.31.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH 1/3] Add IEEE 128-bit min/max support on PowerPC.

2021-06-08 Thread Michael Meissner via Gcc-patches

This patch adds the support for the IEEE 128-bit floating point C minimum and
maximum instructions.  The next patch will add the support for using the
compare and set mask instruction to implement conditional moves.

This patch does not try to re-use the code used for SF/DF min/max
support.  It defines a separate insn for the IEEE 128-bit support.  It
uses the fp_minmax code iterator to simplify adding both operations.

GCC will not convert ternary operations into using min/max instructions
provided in this patch unless the user uses -Ofast or similar switches due to
issues with NaNs.  The next patch that adds conditional move instructions will
enable the ternary conversion in many cases.
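
To illustrate the general kind of NaN sensitivity involved, the sketch below contrasts a ternary minimum with std::fmin on double. It is a standalone illustration under that assumption (double rather than _Float128, and a made-up helper name); it does not model the exact semantics of the power10 instructions.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Minimum written as a plain ternary.  Every ordered comparison with a NaN
// is false, so the result depends on which operand is the NaN.
double ternary_min (double a, double b) { return a < b ? a : b; }

// Runtime checks of the NaN behaviour described above.
void check_nan_behaviour ()
{
  const double nan = std::numeric_limits<double>::quiet_NaN ();
  assert (std::isnan (ternary_min (1.0, nan)));   // 1.0 < NaN is false: b (NaN) is returned
  assert (ternary_min (nan, 1.0) == 1.0);         // NaN < 1.0 is false: b (1.0) is returned
  assert (std::fmin (1.0, nan) == 1.0);           // fmin ignores a single NaN operand
  assert (std::fmin (nan, 1.0) == 1.0);
}
```

Because the ternary form is not symmetric in its operands once a NaN is present, swapping operands or reversing the comparison is only safe when NaNs can be ignored, which is what -ffast-math and -Ofast allow.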

Note the code for fixing float128-minmax.c has been moved to a separate
patch.

I tested it on 3 platforms:

*   Power9 little endian, --with-code=power9;
*   Power8 big endian, --with-code=power8, both 32/64-bit tests done;
*   Power10 little endian, --with-code=power10.

All systems bootstrapped and there were no new regressions.  I believe I have
addressed the issues with the last patch.

Can I check this into the master branch, and after a soak-in period, back port
it to the GCC 11 branch?

gcc/
2021-06-08  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp
instructions.
* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
New insns.

gcc/testsuite/
2021-06-08  Michael Meissner  

* gcc.target/powerpc/float128-minmax-2.c: New test.
---
 gcc/config/rs6000/rs6000.c|  3 ++-
 gcc/config/rs6000/rs6000.md   | 11 +++
 .../gcc.target/powerpc/float128-minmax-2.c| 15 +++
 3 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b01bb5c8191..1651788df6a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16103,7 +16103,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
   && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
- || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+ || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+ || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
 {
   emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
   return;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3f59b544f6a..064c3a2d9d6 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5214,6 +5214,17 @@ (define_insn "*s3_vsx"
 }
   [(set_attr "type" "fp")])
 
+;; Min/max for ISA 3.1 IEEE 128-bit floating point
+(define_insn "s3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+   (fp_minmax:IEEE128
+(match_operand:IEEE128 1 "altivec_register_operand" "v")
+(match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10"
+  "xscqp %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 ;; The conditional move instructions allow us to perform max and min operations
 ;; even when we don't have the appropriate max/min instruction using the FSEL
 ;; instruction.
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
new file mode 100644
index 000..c71ba08c9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
+TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.31.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH 0/3] Add Power10 IEEE 128-bit min, max, conditional move

2021-06-08 Thread Michael Meissner via Gcc-patches
This is a revision of the patches I sent on May 18th.

I tested it on 3 platforms:

*   Power9 little endian, --with-code=power9;
*   Power8 big endian, --with-code=power8, both 32/64-bit tests done;
*   Power10 little endian, --with-code=power10.

All systems bootstrapped and there were no new regressions.  I believe I have
addressed the issues with the last patch.

The first patch in this set contains the same GCC code and new test as in the
previous patch, since I don't believe there was a problem with those bits.

I moved the changes for the existing test 'float128-minmax.c' to patch number
two.  Rather than using '#pragma GCC target' to force power9 code generation on
power10, I used conditional scan-assembler statements to delineate the
power9 and power10 code generation.

The third patch of this set fixes the complicated test that was complained
about in the previous second patch.

Can I check these patches into the master branch?  Ideally, I think these
should go into GCC 11.2 after a soak-in period.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[Bug c++/100956] Unused variable warnings ignore "if constexpr" blocks where variables are conditionally used

2021-06-08 Thread mattreecebentley at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100956

--- Comment #2 from Matt Bentley  ---
Thank you - I'm aware GCC might optimize it out (and failed to test with
GCC10), at least in O2 mode, but other compilers might not, hence the code.

Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.

2021-06-08 Thread Michael Meissner via Gcc-patches
On Mon, Jun 07, 2021 at 05:31:50PM -0500, Segher Boessenkool wrote:
> On Tue, May 18, 2021 at 04:28:27PM -0400, Michael Meissner wrote:
> > In this patch, I simplified things compared to previous patches.  Instead of
> > allowing any four of the modes to be used for the conditional move
> > comparison and the move itself could use different modes, I restricted the
> > conditional move to just the same mode.  I.e. you can do:
> > 
> > _Float128 a, b, c, d, e, r;
> > 
> > r = (a == b) ? c : d;
> > 
> > But you can't do:
> > 
> > _Float128 c, d, r;
> > double a, b;
> > 
> > r = (a == b) ? c : d;
> > 
> > or:
> > 
> > _Float128 a, b;
> > double c, d, r;
> > 
> > r = (a == b) ? c : d;
> > 
> > This eliminates a lot of the complexity of the code, because you don't have
> > to worry about the sizes being different, and the IEEE 128-bit types being
> > restricted to Altivec registers, while the SF/DF modes can use any VSX
> > register.
> 
> You do not have to worry about that anyway.  You can just reuse the
> existing rs6000_maybe_emit_fp_cmove.  Or why not?  The IF_THEN_ELSE we
> generate there should work fine?
> 
> > +(define_expand "movcc"
> > +   [(set (match_operand:IEEE128 0 "gpc_reg_operand")
> > +(if_then_else:IEEE128 (match_operand 1 "comparison_operator")
> > +  (match_operand:IEEE128 2 "gpc_reg_operand")
> > +  (match_operand:IEEE128 3 "gpc_reg_operand")))]
> > +  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
> > +{
> > +  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
> > +DONE;
> > +  else
> > +FAIL;
> > +})
> 
> Why is this a special pattern anyway?  Why can you not do
>   d = cond ? x : y;
> with cond any comparison, not even including any floating point
> possibly?

Well in theory you can certainly do this, we just need to add the necessary
code to implement it.  It quickly becomes an exponential cascading pattern,
where you have one set of modes for the comparison and a different set of modes
for the movement.

I've certainly seen instances where the code has an integer comparison and then
a FP move.  We can do this via a SETBC type instruction, direct move, and then
XXSEL.  But that is beyond the scope of this patch.

If you remember, the original form of this patch allowed the comparison to be
SF, DF, KF, and possibly TF, along with the move.  It becomes complicated when
you have to consider that SF/DF comparisons only fill the upper 64 bits of the
vector register with the comparison, and the IEEE 128-bit types need to be in
Altivec registers.

So I scaled back the patch to just allow 128-bit conditional move.  I left in
the existing 64/32-bit mixture because at least one benchmark used it in the
past.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.

2021-06-08 Thread Michael Meissner via Gcc-patches
On Mon, Jun 07, 2021 at 03:25:06PM -0500, Segher Boessenkool wrote:
> On Tue, May 18, 2021 at 04:26:06PM -0400, Michael Meissner wrote:
> > This patch adds the support for the IEEE 128-bit floating point C minimum
> > and maximum instructions.
> 
> > gcc/
> > 2021-05-18  Michael Meissner  
> > 
> > * config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> > 3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> > instructions.
> 
> 3.1 fits on the previous line (it is better to not split numbers to a
> new line).  What is up with the weird multiple spaces?  We don't align
> the right border in changelogs :-)
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-require-effective-target ppc_float128_hw } */
> > +/* { dg-require-effective-target power10_ok } */
> 
> Is this needed?  And, why is ppc_float128_hw needed?  That combination
> does not seem to make sense.

Basically it is there to make sure that we are actually generating IEEE 128-bit
instructions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] rs6000: Remove unspecs for vec_mrghl[bhw]

2021-06-08 Thread Segher Boessenkool
On Mon, May 24, 2021 at 04:02:13AM -0500, Xionghu Luo wrote:
> vmrghb only accepts permute index {0, 16, 1, 17, 2, 18, 3, 19, 4, 20,
> 5, 21, 6, 22, 7, 23} no matter for BE or LE in ISA, similarly for vmrghlb.

(vmrglb)

> +  if (BYTES_BIG_ENDIAN)
> +emit_insn (
> +  gen_altivec_vmrghb_direct (operands[0], operands[1], operands[2]));
> +  else
> +emit_insn (
> +  gen_altivec_vmrglb_direct (operands[0], operands[2], operands[1]));

Please don't indent like that, it doesn't match what we do elsewhere.
For better or for worse (for worse imo), we use deep hanging indents.
If you have to, you can do something like

  rtx insn;
  if (BYTES_BIG_ENDIAN)
insn = gen_altivec_vmrghb_direct (operands[0], operands[1], operands[2]);
  else
insn = gen_altivec_vmrglb_direct (operands[0], operands[2], operands[1]);
  emit_insn (insn);

(this is better even, in that it has only one emit_insn), or even

  rtx (*fun) () = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct
   : gen_altivec_vmrglb_direct;
  if (!BYTES_BIG_ENDIAN)
std::swap (operands[1], operands[2]);
  emit_insn (fun (operands[0], operands[1], operands[2]));

Well, C++ does not allow that last example like that, sigh, so
  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct
: gen_altivec_vmrglb_direct;

This is shorter than the other two options ;-)
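
The function-pointer variant can be tried outside GCC with a small self-contained C sketch.  Everything here is a stand-in invented for illustration: rtx is just an int, gen_high and gen_low play the role of gen_altivec_vmrghb_direct and gen_altivec_vmrglb_direct, and they only record what they were called with.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's rtx type.  */
typedef int rtx;

/* Records the last "instruction" generated, for inspection.  */
struct emitted
{
  char which;           /* 'h' for the high merge, 'l' for the low merge.  */
  rtx dest, op1, op2;
};

struct emitted last_insn;

/* Hypothetical stand-ins for the generated gen_* functions.  */
rtx
gen_high (rtx dest, rtx a, rtx b)
{
  last_insn = (struct emitted) { 'h', dest, a, b };
  return dest;
}

rtx
gen_low (rtx dest, rtx a, rtx b)
{
  last_insn = (struct emitted) { 'l', dest, a, b };
  return dest;
}

/* Pick the generator once, swap the inputs for little endian, and
   make a single call, as in the last option above.  */
void
emit_merge (int big_endian, rtx operands[3])
{
  rtx (*fun) (rtx, rtx, rtx) = big_endian ? gen_high : gen_low;
  if (!big_endian)
    {
      rtx tmp = operands[1];
      operands[1] = operands[2];
      operands[2] = tmp;
    }
  fun (operands[0], operands[1], operands[2]);
}
```

Note that, like the std::swap version, this modifies the operands array in place, which may or may not be acceptable in a real expander.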

> +(define_insn "altivec_vmrghb_direct"
>[(set (match_operand:V16QI 0 "register_operand" "=v")
> +(vec_select:V16QI

This should be indented one space more.

>"TARGET_ALTIVEC"
>"@
> -   xxmrghw %x0,%x1,%x2
> -   vmrghw %0,%1,%2"
> +  xxmrghw %x0,%x1,%x2
> +  vmrghw %0,%1,%2"

The original indent was correct, please restore.

> -  emit_insn (gen_altivec_vmrghw_direct (operands[0], ve, vo));
> +  emit_insn (gen_altivec_vmrghw_direct_v4si (operands[0], ve, vo));

When you see a mode as part of a pattern name, chances are that it will
be a good candidate for using parameterized names with.  (But don't do
that now, just keep it in mind as a nice cleanup to do).

> @@ -23022,8 +23022,8 @@ altivec_expand_vec_perm_const (rtx target, rtx op0, 
> rtx op1,
> : CODE_FOR_altivec_vmrglh_direct),
>{  0,  1, 16, 17,  2,  3, 18, 19,  4,  5, 20, 21,  6,  7, 22, 23 } },
>  { OPTION_MASK_ALTIVEC,
> -  (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw_direct
> -   : CODE_FOR_altivec_vmrglw_direct),
> +  (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw_direct_v4si
> +   : CODE_FOR_altivec_vmrglw_direct_v4si),

The correct way is to align the ? and the : (or put everything on one
line of course, if that fits)

The parens around this are not needed btw, and are a distraction.

> --- a/gcc/testsuite/gcc.target/powerpc/builtins-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
> @@ -317,10 +317,10 @@ int main ()
>  /* { dg-final { scan-assembler-times "vctuxs" 2 } } */
>  
>  /* { dg-final { scan-assembler-times "vmrghb" 4 { target be } } } */
> -/* { dg-final { scan-assembler-times "vmrghb" 5 { target le } } } */
> +/* { dg-final { scan-assembler-times "vmrghb" 6 { target le } } } */
>  /* { dg-final { scan-assembler-times "vmrghh" 8 } } */
> -/* { dg-final { scan-assembler-times "xxmrghw" 8 } } */
> -/* { dg-final { scan-assembler-times "xxmrglw" 8 } } */
> +/* { dg-final { scan-assembler-times "xxmrghw" 4 } } */
> +/* { dg-final { scan-assembler-times "xxmrglw" 4 } } */
>  /* { dg-final { scan-assembler-times "vmrglh" 8 } } */
>  /* { dg-final { scan-assembler-times "xxlnor" 6 } } */
>  /* { dg-final { scan-assembler-times {\mvpkudus\M} 1 } } */
> @@ -347,7 +347,7 @@ int main ()
>  /* { dg-final { scan-assembler-times "vspltb" 6 } } */
>  /* { dg-final { scan-assembler-times "vspltw" 0 } } */
>  /* { dg-final { scan-assembler-times "vmrgow" 8 } } */
> -/* { dg-final { scan-assembler-times "vmrglb" 5 { target le } } } */
> +/* { dg-final { scan-assembler-times "vmrglb" 4 { target le } } } */
>  /* { dg-final { scan-assembler-times "vmrglb" 6 { target be } } } */
>  /* { dg-final { scan-assembler-times "vmrgew" 8 } } */
>  /* { dg-final { scan-assembler-times "vsplth" 8 } } */

Are those changes correct?  It looks like a vmrglb became a vmrghb, and
that 4 each of xxmrghw and xxmrglw disappeared?  Both seem wrong?


Segher


[PATCH] Always enable DT_INIT_ARRAY/DT_FINI_ARRAY on Linux

2021-06-08 Thread H.J. Lu via Gcc-patches
DT_INIT_ARRAY/DT_FINI_ARRAY support was added to glibc by

commit fcf70d4114db9ff7923f5dfeb3fea6e2d623e5c2
Author: Ulrich Drepper 
Date:   Sat Jul 24 19:45:13 1999 +

Update.

1999-07-24  Ulrich Drepper  

* elf/dl-fini.c: Handle DT_FINI_ARRAY.
* elf/link.h (struct link_map): Remove l_init_running.  Add l_runcount
and l_initcount.
* elf/dl-init.c: Handle DT_INIT_ARRAY.
...

PR target/100896
* config.gcc (gcc_cv_initfini_array): Set to yes for Linux and
GNU targets.
---
 gcc/config.gcc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 6833a6c13d9..4dc4fe0b65c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -848,6 +848,8 @@ case ${target} in
   tmake_file="${tmake_file} t-glibc"
   target_has_targetcm=yes
   target_has_targetdm=yes
+  # Linux targets always support .init_array.
+  gcc_cv_initfini_array=yes
   ;;
 *-*-netbsd*)
   tm_p_file="${tm_p_file} netbsd-protos.h"
-- 
2.31.1
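
For readers unfamiliar with the mechanism, a minimal C sketch of what relies on it follows.  The names init_ran and mark_initialized are hypothetical.  The constructor attribute is a documented GCC extension; on a toolchain configured with init_array support, the pointer to such a function is expected to end up in the .init_array section, which the dynamic loader walks before main runs.

```c
#include <assert.h>

/* Set by the constructor below; stays 0 if the constructor never runs.  */
int init_ran;

/* GCC extension: run this function before main.  With init_array
   support enabled, GCC is expected to place its address in .init_array
   rather than in the older .ctors section.  */
__attribute__ ((constructor))
void
mark_initialized (void)
{
  init_ran = 1;
}
```

Checking init_ran at the start of main, or looking for an .init_array section in the object file with a tool such as readelf, is one way to see which mechanism a given build used.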



[Bug libstdc++/100889] Wrong param type for std::atomic_ref<_Tp*>::wait

2021-06-08 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889

Thomas Rodgers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Thomas Rodgers  ---
Fixed in master, backported to releases/gcc-11

[Bug tree-optimization/25290] PHI-OPT could be rewritten so that is uses match

2021-06-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25290

--- Comment #22 from Andrew Pinski  ---
Without load/store handling, the following optimizations either can move to
match.pd already or need some extra work first:

* value_replacement: need to handle !single_non_singleton_phi_for_edges case
and more than one feeder statement (2 max according to the current definition)
* cond_removal_in_popcount_clz_ctz_pattern: need 2 feeder statements and
builtin call handling for feeder statements


* two_value_replacement: recorded as PR 100958, it can move already
* abs_replacement: needs PROP_gimple_lswitch so we don't change if statements
early enough
** I think the majority of the abs handling is already in match.pd.
* minmax_replacement: has some handling of comparisons which might not be in
the match.pd patterns already.  needs PROP_gimple_lswitch also.
** The handling of:
 if (a <= u)
   b = MAX (a, d);
 x = PHI <b, u>
   needs to be moved too.


For the ones which cannot move
* factor_out_conditional_conversion: will never move, though it needs
improvement and moved already (PR 56223 and PR 13563)
* spaceship_replacement: cannot move to match.pd depends on use afterwards
which is not hard to deal with in a match pattern.
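
As a rough illustration of the source-level shapes behind three of the names above, here is a small C sketch.  The function names are hypothetical, and whether a given GCC version actually rewrites each one depends on the pass ordering and options; the sketch only shows the control flow that produces the PHI nodes in question.

```c
#include <assert.h>

/* Shape matched by abs_replacement: a conditional negate.  */
int
abs_shape (int a)
{
  int r = a;
  if (a < 0)
    r = -a;
  return r;
}

/* Shape matched by minmax_replacement: a two-armed select that is
   really MIN (a, b).  */
int
min_shape (int a, int b)
{
  int r;
  if (a < b)
    r = a;
  else
    r = b;
  return r;
}

/* The special case quoted above: with d <= u this is
   MIN (MAX (a, d), u), i.e. a clamp of a to [d, u].  */
int
clamp_shape (int a, int d, int u)
{
  int x = u;
  if (a <= u)
    x = a > d ? a : d;   /* b = MAX (a, d) */
  return x;              /* x = PHI <b, u> */
}
```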

Re: [PATCH] [libstdc++] Remove unused hasher instance.

2021-06-08 Thread Thomas Rodgers via Gcc-patches
Tested x86_64-pc-linux-gnu, committed to master, backported to
releases/gcc-11.

On Fri, Jun 4, 2021 at 1:30 PM Jonathan Wakely  wrote:

>
>
> On Fri, 4 Jun 2021 at 20:54, Thomas Rodgers wrote:
>
>> This is a remnant of poorly executed refactoring.
>>
>
> OK for trunk and gcc-11, thanks.
>
>
>
>> libstdc++-v3/ChangeLog:
>>
>> * include/std/barrier (__tree_barrier::_M_arrive): Remove
>> unnecessary hasher instantiation.
>> ---
>>  libstdc++-v3/include/std/barrier | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/libstdc++-v3/include/std/barrier
>> b/libstdc++-v3/include/std/barrier
>> index fd61fb4f9da..4210e30d1ce 100644
>> --- a/libstdc++-v3/include/std/barrier
>> +++ b/libstdc++-v3/include/std/barrier
>> @@ -103,7 +103,6 @@ It looks different from literature pseudocode for two
>> main reasons:
>>static_cast<__barrier_phase_t>(__old_phase_val
>> + 2);
>>
>> size_t __current_expected = _M_expected;
>> -   std::hash __hasher;
>> __current %= ((_M_expected + 1) >> 1);
>>
>> for (int __round = 0; ; ++__round)
>> --
>> 2.26.2
>>
>>


Re: [PATCH] libstdc++: Fix Wrong param type in :atomic_ref<_Tp*>::wait [PR100889]

2021-06-08 Thread Thomas Rodgers via Gcc-patches
Tested x86_64-pc-linux-gnu, committed to master, backported to
releases/gcc-11.

On Tue, Jun 8, 2021 at 8:44 AM Jonathan Wakely  wrote:

> On Tue, 8 Jun 2021 at 01:29, Thomas Rodgers wrote:
>
>> This time without the repeated [PR] in the subject line.
>>
>> Fixes libstdc++/100889
>>
>
> This should be part of the ChangeLog entry instead, preceded by PR so it
> updates bugzilla, i.e.
>
>
>
>> libstdc++-v3/ChangeLog:
>>
>
> PR libstdc++/100889
>
>
>> * include/bits/atomic_base.h (atomic_ref<_Tp*>::wait):
>> Change parameter type from _Tp to _Tp*.
>> * testsuite/29_atomics/atomic_ref/wait_notify.cc: Extend
>> coverage of types tested.
>>
>
>
> OK for trunk and gcc-11 with that change, thanks.
>
>
>
>


[Bug libstdc++/100889] Wrong param type for std::atomic_ref<_Tp*>::wait

2021-06-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889

--- Comment #3 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Thomas Rodgers
:

https://gcc.gnu.org/g:d7462945387b33744f665d1aa33ba1cec79c03b0

commit r11-8528-gd7462945387b33744f665d1aa33ba1cec79c03b0
Author: Thomas Rodgers 
Date:   Tue Jun 8 15:51:53 2021 -0700

libstdc++: Fix Wrong param type in :atomic_ref<_Tp*>::wait [PR100889]

libstdc++-v3/ChangeLog:

PR libstdc++/100889
* include/bits/atomic_base.h (atomic_ref<_Tp*>::wait):
Change parameter type from _Tp to _Tp*.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Extend
coverage of types tested.

(cherry picked from commit 25e5ecdf82b49977e86bfaded236fb34af2705ed)

[Bug libstdc++/100889] Wrong param type for std::atomic_ref<_Tp*>::wait

2021-06-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Thomas Rodgers :

https://gcc.gnu.org/g:25e5ecdf82b49977e86bfaded236fb34af2705ed

commit r12-1312-g25e5ecdf82b49977e86bfaded236fb34af2705ed
Author: Thomas Rodgers 
Date:   Tue Jun 8 15:51:53 2021 -0700

libstdc++: Fix Wrong param type in :atomic_ref<_Tp*>::wait [PR100889]

libstdc++-v3/ChangeLog:

PR libstdc++/100889
* include/bits/atomic_base.h (atomic_ref<_Tp*>::wait):
Change parameter type from _Tp to _Tp*.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Extend
coverage of types tested.

[Bug tree-optimization/25290] PHI-OPT could be rewritten so that is uses match

2021-06-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25290

--- Comment #21 from Andrew Pinski  ---
Note this is not fully fixed, there is still some more work to do to deal with
the non single_non_singleton_phi_for_edges case which will allow to get rid of
value_replacement.

Note to get rid of early_p check and abs_replacement, we need to add
PROP_gimple_lswitch to say we have lowered switches already.

[Bug c++/100979] New: Nested CTAD fails when the outer object is direct initialized and the inner object is list initialized

2021-06-08 Thread brycelelbach at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100979

Bug ID: 100979
   Summary: Nested CTAD fails when the outer object is direct
initialized and the inner object is list initialized
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: brycelelbach at gmail dot com
  Target Milestone: ---

template <typename T> struct X { X(T t) {} };

int main() {
  auto t00 = X(1);
  auto t01 = X{1};
  X t02{1};
  X t03(1);

  auto t04 = X(X{1});
  auto t05 = X{X{1}};
  auto t06 = X(X(1));
  auto t07 = X{X(1)};
  X t08(X{1}); // GCC 11.x and up rejects this; MSVC and Clang accept it.
  X t09{X{1}};
  X t10(X(1));
  X t11{X(1)};
}

https://godbolt.org/z/Pbx6cjE7q

[Bug c++/100065] Conditional explicit doesn't work for deduction guide

2021-06-08 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100065

--- Comment #3 from Marek Polacek  ---
Fixed on trunk so far, will backport.

[Bug c++/100065] Conditional explicit doesn't work for deduction guide

2021-06-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100065

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:1afa4facb9348cac0349ff9c30066aa25a3608f7

commit r12-1310-g1afa4facb9348cac0349ff9c30066aa25a3608f7
Author: Marek Polacek 
Date:   Mon Jun 7 16:06:00 2021 -0400

c++: explicit() ignored on deduction guide [PR100065]

When we have explicit() with a value-dependent argument, we can't
evaluate it at parsing time, so cp_parser_function_specifier_opt stashes
the argument into the decl-specifiers and grokdeclarator then stores it
into explicit_specifier_map, which is then used when substituting the
function decl.  grokdeclarator stores it for constructors and conversion
functions, but we also need to do it for deduction guides, otherwise
we'll forget that we've seen an explicit-specifier as in the attached
test.

PR c++/100065

gcc/cp/ChangeLog:

* decl.c (grokdeclarator): Store a value-dependent
explicit-specifier even for deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/explicit18.C: New test.

[Bug tree-optimization/25290] PHI-OPT could be rewritten so that is uses match

2021-06-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25290

--- Comment #20 from CVS Commits  ---
The master branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:c4574d23cb07340918793a5a98ae7bb2988b3791

commit r12-1309-gc4574d23cb07340918793a5a98ae7bb2988b3791
Author: Andrew Pinski 
Date:   Tue Jun 1 06:48:05 2021 +

Improve match_simplify_replacement in phi-opt

This improves match_simplify_replacement in phi-opt to handle the
case where there is one cheap (non-call) preparation statement in the
middle basic block, similar to xor_replacement and others.
This makes it possible to remove xor_replacement, which this patch also does.

OK?  Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

Changes since v1:
v3 - Just minor changes to using gimple_assign_lhs
instead of gimple_lhs and fixing a comment.
v2 - change the check on the preparation statement to
allow only assignments and no calls and only assignments
that feed into the phi.

gcc/ChangeLog:

PR tree-optimization/25290
* tree-ssa-phiopt.c (xor_replacement): Delete.
(tree_ssa_phiopt_worker): Delete use of xor_replacement.
(match_simplify_replacement): Allow one cheap preparation
statement that can be moved to before the if.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr96928-1.c: Fix testcase for now that ~
happens on the outside of the bit_xor.
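
A minimal C sketch of the kind of code this concerns, under stated assumptions: the function names are hypothetical, and the second function is simply one branch-free equivalent, not a claim about the exact GIMPLE GCC produces.  In the first function the middle basic block contains exactly one cheap statement (the bitwise NOT), which is the situation the change above teaches match_simplify_replacement to handle.

```c
#include <assert.h>
#include <limits.h>

/* The middle block holds a single preparation statement, r = ~c,
   feeding the PHI that merges the two paths.  */
int
select_form (int a, int c)
{
  int r = c;
  if (a < 0)
    r = ~c;
  return r;
}

/* Branch-free equivalent.  Right-shifting a negative int is
   implementation-defined in C; GCC documents it as an arithmetic
   shift, so the shift yields -1 for negative a and 0 otherwise.  */
int
branchless_form (int a, int c)
{
  return (a >> (int) (sizeof (int) * CHAR_BIT - 1)) ^ c;
}
```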

[commited] Improve match_simplify_replacement in phi-opt

2021-06-08 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

This improves match_simplify_replacement in phi-opt to handle the
case where there is one cheap (non-call) preparation statement in the
middle basic block, similar to xor_replacement and others.
This makes it possible to remove xor_replacement, which this patch also does.

OK?  Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Committed as pre-approved.

Thanks,
Andrew Pinski

Changes since v1:
v3 - Just minor changes to using gimple_assign_lhs
instead of gimple_lhs and fixing a comment.
v2 - change the check on the preparation statement to
allow only assignments and no calls and only assignments
that feed into the phi.

gcc/ChangeLog:

PR tree-optimization/25290
* tree-ssa-phiopt.c (xor_replacement): Delete.
(tree_ssa_phiopt_worker): Delete use of xor_replacement.
(match_simplify_replacement): Allow one cheap preparation
statement that can be moved to before the if.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr96928-1.c: Fix testcase for now that ~
happens on the outside of the bit_xor.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c |   4 +-
 gcc/tree-ssa-phiopt.c | 164 +++---
 2 files changed, 54 insertions(+), 114 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
index a2770e5e896..2e86620da11 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
@@ -1,9 +1,9 @@
 /* PR tree-optimization/96928 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-phiopt2" } */
+/* { dg-options "-O2 -fdump-tree-phiopt2 -fdump-tree-optimized" } */
 /* { dg-final { scan-tree-dump-times " = a_\[0-9]*\\\(D\\\) >> " 5 "phiopt2" } } */
 /* { dg-final { scan-tree-dump-times " = ~c_\[0-9]*\\\(D\\\);" 1 "phiopt2" } } */
-/* { dg-final { scan-tree-dump-times " = ~" 1 "phiopt2" } } */
+/* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times " = \[abc_0-9\\\(\\\)D]* \\\^ " 5 "phiopt2" } } */
 /* { dg-final { scan-tree-dump-not "a < 0" "phiopt2" } } */
 
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index 969b868397e..76f4e7ec843 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfghooks.h"
 #include "tree-pass.h"
 #include "ssa.h"
+#include "tree-ssa.h"
 #include "optabs-tree.h"
 #include "insn-config.h"
 #include "gimple-pretty-print.h"
@@ -63,8 +64,6 @@ static bool minmax_replacement (basic_block, basic_block,
edge, edge, gphi *, tree, tree);
 static bool abs_replacement (basic_block, basic_block,
 edge, edge, gphi *, tree, tree);
-static bool xor_replacement (basic_block, basic_block,
-edge, edge, gphi *, tree, tree);
 static bool spaceship_replacement (basic_block, basic_block,
   edge, edge, gphi *, tree, tree);
 static bool cond_removal_in_popcount_clz_ctz_pattern (basic_block, basic_block,
@@ -352,9 +351,6 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool 
do_hoist_loads, bool early_p)
cfgchanged = true;
  else if (abs_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
cfgchanged = true;
- else if (!early_p
-  && xor_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
-   cfgchanged = true;
  else if (!early_p
   && cond_removal_in_popcount_clz_ctz_pattern (bb, bb1, e1,
e2, phi, arg0,
@@ -801,14 +797,51 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
   edge true_edge, false_edge;
   gimple_seq seq = NULL;
   tree result;
-
-  if (!empty_block_p (middle_bb))
-return false;
+  gimple *stmt_to_move = NULL;
 
   /* Special case A ? B : B as this will always simplify to B. */
   if (operand_equal_for_phi_arg_p (arg0, arg1))
 return false;
 
+  /* If the basic block only has a cheap preparation statement,
+ allow it and move it once the transformation is done. */
+  if (!empty_block_p (middle_bb))
+{
+  stmt_to_move = last_and_only_stmt (middle_bb);
+  if (!stmt_to_move)
+   return false;
+
+  if (gimple_vuse (stmt_to_move))
+   return false;
+
+  if (gimple_could_trap_p (stmt_to_move)
+ || gimple_has_side_effects (stmt_to_move))
+   return false;
+
+  if (gimple_uses_undefined_value_p (stmt_to_move))
+   return false;
+
+  /* Allow assignments and not no calls.
+As const calls don't match any of the above, yet they could
+still have some side-effects - they could contain
+gimple_could_trap_p statements, like floating point
+exceptions or integer division by zero.  See PR70586.
+FIXME: perhaps gimple_has_side_effects or gimple_could_trap_p
+should handle 

[Bug middle-end/54400] recognize vector reductions

2021-06-08 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54400

--- Comment #8 from Marc Glisse  ---
(In reply to Richard Biener from comment #7)
> (note avoiding hadd in the reduc pattern was intended).

Indeed. Except with -Os, or if a processor with a fast hadd appears,
vectorising this doesn't bring anything. It doesn't hurt either though.

[Bug rtl-optimization/80770] suboptimal code negating a 1-bit _Bool field

2021-06-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80770

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Severity|normal  |enhancement
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
Mine.

Though if we lower, we will still need to optimize the following on the gimple
level:
  _1 = BIT_FIELD_REF <_6, 1, 0>;
  _2 = ~_1;
  _8 = BIT_INSERT_EXPR <_6, _2, 0 (1 bits)>;

to _8 = _6 ^ 1;

Or in general:
BIT_INSERT_EXPR <_6, bit_not (BIT_FIELD_REF <_6, bits, shift>), shift (bits)>
to
_6 ^ shiftedmask(bits, shift);

And maybe add:
BIT_INSERT_EXPR <_6, bit_op (BIT_FIELD_REF <_6, bits, shift>, B), shift (bits)>
to
_6 bit_op (convert (convert:u B) << shift);
where u is the unsigned type if B is not an unsigned type.
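
The general identity can be checked with a small C model on 32-bit words.  All names below are hypothetical helpers that mimic BIT_FIELD_REF and BIT_INSERT_EXPR on plain integers; the sketch assumes shift < 32 and bits + shift <= 32.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with `bits` ones starting at bit `shift`.  */
uint32_t
shifted_mask (unsigned bits, unsigned shift)
{
  uint32_t ones = bits >= 32 ? UINT32_MAX : (UINT32_C (1) << bits) - 1;
  return ones << shift;
}

/* Integer model of BIT_FIELD_REF <x, bits, shift>.  */
uint32_t
bit_field_ref (uint32_t x, unsigned bits, unsigned shift)
{
  return (x & shifted_mask (bits, shift)) >> shift;
}

/* Integer model of BIT_INSERT_EXPR <x, v, shift (bits)>.  */
uint32_t
bit_insert (uint32_t x, uint32_t v, unsigned bits, unsigned shift)
{
  uint32_t m = shifted_mask (bits, shift);
  return (x & ~m) | ((v << shift) & m);
}

/* Extract, complement, re-insert: the long form.  */
uint32_t
flip_via_extract_insert (uint32_t x, unsigned bits, unsigned shift)
{
  return bit_insert (x, ~bit_field_ref (x, bits, shift), bits, shift);
}

/* The proposed simplification: a single XOR with the shifted mask.  */
uint32_t
flip_via_xor (uint32_t x, unsigned bits, unsigned shift)
{
  return x ^ shifted_mask (bits, shift);
}
```

For the 1-bit _Bool case from the bug title, bits is 1 and shift is 0, so the XOR mask is simply 1.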

[Bug c++/100879] [10/11/12 Regression] gcc is complaining of a signed compare when comparing enums of different types (same underlying type)

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100879

Jason Merrill  changed:

   What|Removed |Added

   Last reconfirmed||2021-06-08
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 CC||jason at gcc dot gnu.org
 Ever confirmed|0   |1

[PATCH 1/2] arm: Fix vcond_mask expander for MVE (PR target/100757)

2021-06-08 Thread Christophe Lyon via Gcc-patches
The problem in this PR is that we call VPSEL with a mask of vector
type instead of HImode. This happens because operand 3 in vcond_mask
is the pre-computed vector comparison and has vector type. The fix is
to transfer this value to VPR.P0 by comparing operand 3 with a vector
of constant 1 of the same type as operand 3.
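
As a rough scalar model of that fix (not the actual expansion), the C sketch below assumes four 32-bit lanes and a 16-bit predicate with one bit per byte, so each lane owns four predicate bits.  The function names are hypothetical, and the select is simplified to work per lane rather than per byte.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "compare operand 3 with a vector of 1": turn a vector
   mask into a 16-bit predicate, four bits per 32-bit lane.  */
uint16_t
predicate_from_mask (const int32_t mask[4])
{
  uint16_t p0 = 0;
  for (int lane = 0; lane < 4; lane++)
    if (mask[lane] == 1)
      p0 |= (uint16_t) (0xFu << (4 * lane));
  return p0;
}

/* Simplified model of a predicated select: take a[lane] where the
   lane's predicate bits are set, b[lane] otherwise.  */
void
select_lanes (int32_t dest[4], const int32_t a[4], const int32_t b[4],
              uint16_t p0)
{
  for (int lane = 0; lane < 4; lane++)
    dest[lane] = ((p0 >> (4 * lane)) & 1) ? a[lane] : b[lane];
}
```

The point of the model is only that the select consumes the 16-bit predicate, never the vector mask itself, which is why the extra compare is needed.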

The pr100757*.c testcases are derived from
gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
different types and return values different from 0 and 1 to avoid
commonalization with boolean masks.

Reducing the number of iterations in pr100757-3.c from 32 to 8, we
generate the code below:

float a[32];
float fn1(int d) {
  int c = 4;
  for (int b = 0; b < 8; b++)
    if (a[b] != 2.0f)
      c = 5;
  return c;
}

fn1:
ldr r3, .L4+80
vpush.64 {d8, d9}
vldrw.32 q3, [r3]       // q3=a[0..3]
vldr.64 d8, .L4         // q4=(2.0,2.0,2.0,2.0)
vldr.64 d9, .L4+8
adds r3, r3, #16
vcmp.f32 eq, q3, q4     // cmp a[0..3] == (2.0,2.0,2.0,2.0)
vldr.64 d2, .L4+16      // q1=(1,1,1,1)
vldr.64 d3, .L4+24
vldrw.32 q3, [r3]       // q3=a[4..7]
vldr.64 d4, .L4+32      // q2=(0,0,0,0)
vldr.64 d5, .L4+40
vpsel q0, q1, q2        // q0=select (a[0..3])
vcmp.f32 eq, q3, q4     // cmp a[4..7] == (2.0,2.0,2.0,2.0)
vldm sp!, {d8-d9}
vpsel q2, q1, q2        // q2=select (a[4..7])
vand q2, q0, q2         // q2=select (a[0..3]) && select (a[4..7])
vldr.64 d6, .L4+48      // q3=(4.0,4.0,4.0,4.0)
vldr.64 d7, .L4+56
vldr.64 d0, .L4+64      // q0=(5.0,5.0,5.0,5.0)
vldr.64 d1, .L4+72
vcmp.i32 eq, q2, q1     // cmp mask(a[0..7]) == (1,1,1,1)
vpsel q3, q3, q0        // q3= vcond_mask(4.0,5.0)
vmov.32 r3, q3[0]       // keep the scalar max
vmov.32 r1, q3[1]
vmov.32 r0, q3[3]
vmov.32 r2, q3[2]
vmov s14, r1
vmov s15, r3
vmaxnm.f32 s15, s15, s14
vmov s14, r2
vmaxnm.f32 s15, s15, s14
vmov s14, r0
vmaxnm.f32 s15, s15, s14
vmov r0, s15
bx lr
.L5:
.align  3
.L4:
.word   1073741824
.word   1073741824
.word   1073741824
.word   1073741824
.word   1
.word   1
.word   1
.word   1
.word   0
.word   0
.word   0
.word   0
.word   1082130432
.word   1082130432
.word   1082130432
.word   1082130432
.word   1084227584
.word   1084227584
.word   1084227584
.word   1084227584

2021-06-09  Christophe Lyon  

PR target/100757
gcc/
* config/arm/vec-common.md (vcond_mask_): Fix
expansion for MVE.

gcc/testsuite/
* gcc.target/arm/simd/pr100757.c: New test.
* gcc.target/arm/simd/pr100757-2.c: New test.
* gcc.target/arm/simd/pr100757-3.c: New test.
---
 gcc/config/arm/vec-common.md  | 24 +--
 .../gcc.target/arm/simd/pr100757-2.c  | 20 
 .../gcc.target/arm/simd/pr100757-3.c  | 20 
 gcc/testsuite/gcc.target/arm/simd/pr100757.c  | 19 +++
 4 files changed, 81 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c

diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index 0ffc7a9322c..ccdfaa8321f 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -478,8 +478,28 @@ (define_expand "vcond_mask_"
 }
   else if (TARGET_HAVE_MVE)
 {
-  emit_insn (gen_mve_vpselq (VPSELQ_S, mode, operands[0],
- operands[1], operands[2], operands[3]));
+  /* Convert pre-computed vector comparison into VPR.P0 by comparing
+ operand 3 with a vector of '1', then use VPSEL.  */
+  machine_mode cmp_mode = GET_MODE (operands[3]);
+  rtx vpr_p0 = gen_reg_rtx (HImode);
+  rtx one = gen_reg_rtx (cmp_mode);
+  emit_move_insn (one, CONST1_RTX (cmp_mode));
+  emit_insn (gen_mve_vcmpq (EQ, cmp_mode, vpr_p0, operands[3], one));
+
+  switch (GET_MODE_CLASS (mode))
+{
+  case MODE_VECTOR_INT:
+emit_insn (gen_mve_vpselq (VPSELQ_S, mode, operands[0], operands[1], operands[2], vpr_p0));
+break;
+  case MODE_VECTOR_FLOAT:
+   if (TARGET_HAVE_MVE_FLOAT)
+  emit_insn (gen_mve_vpselq_f (mode, operands[0], operands[1], operands[2], vpr_p0));
+   else
+ gcc_unreachable ();
+break;
+  default:
+ 

[PATCH 2/2] arm: Fix arm_expand_vcond for MVE

2021-06-08 Thread Christophe Lyon via Gcc-patches
This patch fixes a problem in arm_expand_vcond() where the result
would be a vector of 0 or 1 instead of operand 1 or 2.  The
mve-vcmp-f32-2.c testcase is an update from mve-vcmp-f32.c using a
conditional with 2.0f and 3.0f constants to help scan-assembler-times.

2021-06-09  Christophe Lyon  

gcc/
* config/arm/arm.c (arm_expand_vcond): Fix select operands.

gcc/testsuite/
* gcc.target/arm/simd/mve-vcmp-f32-2.c: New test.
---
 gcc/config/arm/arm.c  | 15 +
 .../gcc.target/arm/simd/mve-vcmp-f32-2.c  | 32 +++
 2 files changed, 40 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9377aaef342..35e22382650 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -31164,7 +31164,7 @@ arm_expand_vcond (rtx *operands, machine_mode cmp_result_mode)
 
   if (TARGET_HAVE_MVE)
 {
-  vcond_mve=true;
+  vcond_mve = true;
   mask = gen_reg_rtx (HImode);
 }
   else
@@ -31181,18 +31181,19 @@ arm_expand_vcond (rtx *operands, machine_mode cmp_result_mode)
 {
   machine_mode cmp_mode = GET_MODE (operands[4]);
   rtx vpr_p0 = mask;
-  rtx zero = gen_reg_rtx (cmp_mode);
-  rtx one = gen_reg_rtx (cmp_mode);
-  emit_move_insn (zero, CONST0_RTX (cmp_mode));
-  emit_move_insn (one, CONST1_RTX (cmp_mode));
+
   switch (GET_MODE_CLASS (cmp_mode))
{
case MODE_VECTOR_INT:
- emit_insn (gen_mve_vpselq (VPSELQ_S, cmp_result_mode, operands[0], one, zero, vpr_p0));
+ emit_insn (gen_mve_vpselq (VPSELQ_S, cmp_result_mode, operands[0],
+operands[1], operands[2], vpr_p0));
  break;
case MODE_VECTOR_FLOAT:
  if (TARGET_HAVE_MVE_FLOAT)
-   emit_insn (gen_mve_vpselq_f (cmp_mode, operands[0], one, zero, vpr_p0));
+   emit_insn (gen_mve_vpselq_f (cmp_mode, operands[0],
+operands[1], operands[2], vpr_p0));
+ else
+   gcc_unreachable ();
  break;
default:
  gcc_unreachable ();
diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
new file mode 100644
index 000..917a95bf141
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
@@ -0,0 +1,32 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+
+#include 
+
+#define NB 4
+
+#define FUNC(OP, NAME) \
+  void test_ ## NAME ##_f (float * __restrict__ dest, float *a, float *b) { \
+int i; \
+for (i=0; i, vcmpgt)
+FUNC(>=, vcmpge)
+
+/* { dg-final { scan-assembler-times {\tvcmp.f32\teq, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tne, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tlt, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tle, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tgt, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tvcmp.f32\tge, q[0-9]+, q[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 24 } } */ /* Constant 2.0f.  */
+/* { dg-final { scan-assembler-times {\t.word\t1077936128\n} 24 } } */ /* Constant 3.0f.  */
-- 
2.25.1



[Bug target/57890] gcc 4.8.1 regression: loops become slower

2021-06-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57890

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |7.0
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
  Known to fail||4.9.0, 6.1.0
  Known to work||4.7.0, 7.1.0

--- Comment #6 from Andrew Pinski  ---
Fixed in GCC 7.0:
f:
movdqa  xmm0, XMMWORD PTR .LC0[rip]
mov DWORD PTR c[rip+96], 808464432
movaps  XMMWORD PTR c[rip], xmm0
movaps  XMMWORD PTR c[rip+16], xmm0
movaps  XMMWORD PTR c[rip+32], xmm0
movaps  XMMWORD PTR c[rip+48], xmm0
movaps  XMMWORD PTR c[rip+64], xmm0
movaps  XMMWORD PTR c[rip+80], xmm0
ret
.LC0:
.quad   3472328296227680304
.quad   3472328296227680304

Where GCC 4.7.0 had produced (which is just as ok):
f:
movdqa  xmm0, XMMWORD PTR .LC0[rip]
mov BYTE PTR c[rip+96], 48
mov BYTE PTR c[rip+97], 48
movdqa  XMMWORD PTR c[rip], xmm0
mov BYTE PTR c[rip+98], 48
mov BYTE PTR c[rip+99], 48
movdqa  XMMWORD PTR c[rip+16], xmm0
movdqa  XMMWORD PTR c[rip+32], xmm0
movdqa  XMMWORD PTR c[rip+48], xmm0
movdqa  XMMWORD PTR c[rip+64], xmm0
movdqa  XMMWORD PTR c[rip+80], xmm0
ret
.LC0:
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48
.byte   48

Re: [PATCH] c++: explicit() ignored on deduction guide [PR100065]

2021-06-08 Thread Jason Merrill via Gcc-patches

On 6/7/21 8:06 PM, Marek Polacek wrote:

When we have explicit() with a value-dependent argument, we can't
evaluate it at parsing time, so cp_parser_function_specifier_opt stashes
the argument into the decl-specifiers and grokdeclarator then stores it
into explicit_specifier_map, which is then used when substituting the
function decl.  grokdeclarator stores it for constructors and conversion
functions, but we also need to do it for deduction guides, otherwise
we'll forget that we've seen an explicit-specifier as in the attached
test.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/branches?


OK.


PR c++/100065

gcc/cp/ChangeLog:

* decl.c (grokdeclarator): Store a value-dependent
explicit-specifier even for deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/explicit18.C: New test.
---
  gcc/cp/decl.c   |  2 ++
  gcc/testsuite/g++.dg/cpp2a/explicit18.C | 23 +++
  2 files changed, 25 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit18.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index a3687dbb0dd..cbf647dd569 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -14043,6 +14043,8 @@ grokdeclarator (const cp_declarator *declarator,
storage_class = sc_none;
  }
  }
+   if (declspecs->explicit_specifier)
+ store_explicit_specifier (decl, declspecs->explicit_specifier);
}
  else
{
diff --git a/gcc/testsuite/g++.dg/cpp2a/explicit18.C b/gcc/testsuite/g++.dg/cpp2a/explicit18.C
new file mode 100644
index 000..c8916fa4743
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/explicit18.C
@@ -0,0 +1,23 @@
+// PR c++/100065
+// { dg-do compile { target c++20 } }
+
+template
+struct bool_constant {
+  static constexpr bool value = B;
+  constexpr operator bool() const { return value; }
+};
+
+using true_type = bool_constant;
+using false_type = bool_constant;
+
+template
+struct X {
+template
+X(T);
+};
+
+template
+explicit(b) X(bool_constant) -> X;
+
+X false_ = false_type{}; // OK
+X true_  = true_type{};  // { dg-error "explicit deduction guide" }

base-commit: e89759fdfc80db223bd852aba937acb2d7c2cd80





[Bug tree-optimization/49203] missed-optimization: useless expressions not moved out of loop

2021-06-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49203

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
  Known to work||7.0
   Target Milestone|--- |7.0
  Known to fail||4.8.5

--- Comment #3 from Andrew Pinski  ---
Fixed in at least GCC 7.0:

.L2:
leaq 16(%r8), %rsi
movq %r8, %rdx
xorl %eax, %eax
.p2align 4,,10
.p2align 3
.L3:
movzbl  (%rdx), %ecx
sall $2, %eax
addq $1, %rdx
andl $3, %ecx
orl %ecx, %eax
cmpq %rsi, %rdx
jne .L3
movl %eax, %edx
addq $4, %r8
movb %ah, 2(%rdi)
shrl $24, %edx
movb %al, 3(%rdi)
addq $4, %rdi
movb %dl, -4(%rdi)
movl %eax, %edx
shrl $16, %edx
movb %dl, -3(%rdi)
cmpq %r8, %r9
jne .L2



   [94.12%]:
  # tmp_37 = PHI 
  # ivtmp.17_17 = PHI 
  _1 = tmp_37 << 2;
  _87 = (void *) ivtmp.17_17;
  _3 = MEM[base: _87, offset: 0B];
  _20 = _3 & 3;
  _4 = (unsigned int) _20;
  tmp_23 = _1 | _4;
  ivtmp.17_15 = ivtmp.17_17 + 1;
  if (ivtmp.17_15 != _83)
goto ; [93.75%]
  else
goto ; [6.25%]

   [5.88%]:
  _5 = tmp_23 >> 24;
  _6 = (unsigned char) _5;
  _76 = (void *) ivtmp.27_82;
  MEM[base: _76, offset: 0B] = _6;
  _7 = tmp_23 >> 16;
  _9 = (unsigned char) _7;
  MEM[base: _76, offset: 1B] = _9;
  _10 = tmp_23 >> 8;
  _12 = (unsigned char) _10;
  MEM[base: _76, offset: 2B] = _12;
  _14 = (unsigned char) tmp_23;
  MEM[base: _76, offset: 3B] = _14;
  ivtmp.27_81 = ivtmp.27_82 + 4;
  ivtmp.28_78 = ivtmp.28_79 + 4;
  if (_71 != ivtmp.28_78)
goto ; [87.51%]
  else
goto ; [12.49%]

   [5.88%]:
  # ivtmp.27_82 = PHI 
  # ivtmp.28_79 = PHI 
  _83 = ivtmp.28_79 + 16;
  goto ; [100.00%]

[Bug c++/91706] [9/10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in equate_type_number_to_die, at dwarf2out.c:5782

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91706

Jason Merrill  changed:

   What|Removed |Added

Summary|[9/10/11/12 Regression] |[9/10 Regression] ICE: tree
   |ICE: tree check: expected   |check: expected class
   |class 'type', have  |'type', have 'exceptional'
   |'exceptional' (error_mark)  |(error_mark) in
   |in  |equate_type_number_to_die,
   |equate_type_number_to_die,  |at dwarf2out.c:5782
   |at dwarf2out.c:5782 |

--- Comment #13 from Jason Merrill  ---
Fixed for 11.2/12 so far.  Is there interest in fixing this on the 9/10
branches?

[Bug c++/100752] [11/12 Regression] error: cannot call member function ‘void S::f()’ without object

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100752

--- Comment #4 from Jason Merrill  ---
As I mentioned on IRC, it seems like this may just be a matter of properly
passing down flags/member_p in the recursive call to cp_parser_declarator.

Re: [PATCH] For obj-c stage-final re-use the checksum from the previous stage

2021-06-08 Thread Jason Merrill via Gcc-patches
On Tue, Jun 8, 2021 at 5:05 PM Bernd Edlinger 
wrote:

> On 6/8/21 3:54 PM, Jason Merrill wrote:
> >
> > This breaks bootstrap2.
> >
> > Jason
> >
>
>
> Sorry for the breakage,
>
> I've committed the following as obvious after
> confirming that it fixes bootstrap2:
>

Thanks.

Jason


[Bug c++/100752] [11/12 Regression] error: cannot call member function ‘void S::f()’ without object

2021-06-08 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100752

--- Comment #3 from Marek Polacek  ---
Duh, we don't defer parsing of noexcept for any ptr-operator, like

struct S {
  int& f() noexcept(noexcept(i));
  int i;
};

Awkward, but the fix should be simple.

Re: [PATCH v2] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-08 Thread Segher Boessenkool
Hi!

On Tue, Jun 08, 2021 at 09:11:33AM +0800, Xionghu Luo wrote:
> On P8LE, extra rot64+rot64 load or store instructions are generated
> in float128 to vector __int128 conversion.
> 
> This patch teaches pass swaps to also handle such patterns to remove
> extra swap instructions.

> +/* Return 1 iff PAT is a rotate 64 bit expression; else return 0.  */
> +
> +static bool
> +pattern_is_rotate64_p (rtx pat)

You already have a verb in the name, don't use _p please (and preferably
just don't use it at all, "pattern_is_rotate64" is much better than
"pattern_rotate64_p").

> +{
> +  rtx rot = SET_SRC (pat);

So this is assuming PAT is a SINGLE_SET.  Please say that in the
function comment.

/* Return 1 iff PAT (a SINGLE_SET) is a rotate 64 bit expression; else
   return 0.  */

You can do an assert for that as well, but I wouldn't bother.

> @@ -266,6 +280,9 @@ insn_is_load_p (rtx insn)

(I do realise you just copied existing naming, don't worry :-) )

> @@ -392,7 +411,8 @@ quad_aligned_load_p (swap_web_entry *insn_entry, rtx_insn *insn)
>   false.  */
>rtx body = PATTERN (def_insn);
>if (GET_CODE (body) != SET
> -  || GET_CODE (SET_SRC (body)) != VEC_SELECT
> +  || !(GET_CODE (SET_SRC (body)) == VEC_SELECT
> +   || pattern_is_rotate64_p (body))

Broken indentation: the || should align with "pattern...".

> @@ -2223,9 +2246,9 @@ static void
>  recombine_stvx_pattern (rtx_insn *insn, del_info *to_delete)
>  {
>rtx body = PATTERN (insn);
> -  gcc_assert (GET_CODE (body) == SET
> -   && MEM_P (SET_DEST (body))
> -   && GET_CODE (SET_SRC (body)) == VEC_SELECT);
> +  gcc_assert (GET_CODE (body) == SET && MEM_P (SET_DEST (body))
> +   && (GET_CODE (SET_SRC (body)) == VEC_SELECT
> +   || pattern_is_rotate64_p (body)));

Please start a new line for every "&&" here.  The way it was was more
readable.

It often is nice to keep things on one line, if it fits on one line.
If it does not, make a new line for every phrase.  This is more readable
because you can then just scan down the line of "&&" and see the start
of every phrase without actually having to read it all.
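
To make the layout advice concrete, here is a small self-contained C sketch. The types and names (toy_code, toy_body, toy_is_swapping_store) are hypothetical stand-ins invented for this example, not the real rtx machinery; only the line-breaking style is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTL codes and insn bodies; these are not
   the real GCC types.  */
enum toy_code { TOY_SET, TOY_VEC_SELECT, TOY_ROTATE, TOY_OTHER };

struct toy_body
{
  enum toy_code code;
  bool dest_is_mem;
  enum toy_code src_code;
  bool src_is_rotate64;
};

/* Each phrase of the condition starts on its own line, so the column of
   "&&" operators can be scanned top to bottom, and the "||" lines up
   with the phrase it is an alternative to.  */
static bool
toy_is_swapping_store (const struct toy_body *body)
{
  assert (body != NULL);
  return (body->code == TOY_SET
          && body->dest_is_mem
          && (body->src_code == TOY_VEC_SELECT
              || body->src_is_rotate64));
}
```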

> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-call.c b/gcc/testsuite/gcc.target/powerpc/float128-call.c
> index 5895416e985..a1f09df8a57 100644
> --- a/gcc/testsuite/gcc.target/powerpc/float128-call.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-call.c
> @@ -21,5 +21,5 @@
>  TYPE one (void) { return ONE; }
>  void store (TYPE a, TYPE *p) { *p = a; }
>  
> -/* { dg-final { scan-assembler "lxvd2x 34"  } } */
> -/* { dg-final { scan-assembler "stxvd2x 34" } } */
> +/* { dg-final { scan-assembler "lvx 2"  } } */
> +/* { dg-final { scan-assembler "stvx 2" } } */

Huh.  Is that correct?  Where did the other 32 loads and stores go?  Are
there now other insns generated that you should scan for?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100085.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_float128_sw_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */

If you use float128_ok you should use -mfloat128 (or this is very
surprising and is worth an explanation itself :-) )

But, you do not need it, since you use -mcpu=power8 already (which
implicitly sets this).  So just remove that dg-require please.

> +/* { dg-final { scan-assembler-not "xxpermdi" } } */
> +/* { dg-final { scan-assembler-not "stxvd2x" } } */
> +/* { dg-final { scan-assembler-not "lxvd2x" } } */

It is a good habit to use \m and \M in the scans where you can (those
are the same as \< and \> are in some other regexp dialects).  They
aren't absolutely necessary here of course.


Okay for trunk with those fixes.  Thanks!


Segher


Re: [PATCH] For obj-c stage-final re-use the checksum from the previous stage

2021-06-08 Thread Bernd Edlinger
On 6/8/21 3:54 PM, Jason Merrill wrote:
> 
> This breaks bootstrap2.
> 
> Jason
> 


Sorry for the breakage,

I've committed the following as obvious after
confirming that it fixes bootstrap2:

Subject: [PATCH] Fix bootstrap2 breakage due to re-use of obj-c checksum

gcc/objc:
2021-06-08  Bernd Edlinger  

* Make-lang.in (cc1-obj-checksum.c): Check previous
stage checksum exists.

gcc/objcp:
2021-06-08  Bernd Edlinger  

* Make-lang.in (cc1objplus-checksum.c): Check previous
stage checksum exists.
---
 gcc/objc/Make-lang.in  | 3 ++-
 gcc/objcp/Make-lang.in | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/objc/Make-lang.in b/gcc/objc/Make-lang.in
index 9011140..25fbd4c 100644
--- a/gcc/objc/Make-lang.in
+++ b/gcc/objc/Make-lang.in
@@ -63,7 +63,8 @@ objc_OBJS = $(OBJC_OBJS) cc1obj-checksum.o
 cc1obj-checksum.c : build/genchecksum$(build_exeext) checksum-options \
 $(OBJC_OBJS) $(C_AND_OBJC_OBJS) $(BACKEND) $(LIBDEPS)
if [ -f ../stage_final ] \
-  && cmp -s ../stage_current ../stage_final; then \
+  && cmp -s ../stage_current ../stage_final \
+  && [ -f ../prev-gcc/$@ ]; then \
  cp ../prev-gcc/$@ $@; \
else \
  build/genchecksum$(build_exeext) $(OBJC_OBJS) $(C_AND_OBJC_OBJS) \
diff --git a/gcc/objcp/Make-lang.in b/gcc/objcp/Make-lang.in
index 3ecc50b..2e27be5 100644
--- a/gcc/objcp/Make-lang.in
+++ b/gcc/objcp/Make-lang.in
@@ -66,7 +66,8 @@ obj-c++_OBJS = $(OBJCXX_OBJS) cc1objplus-checksum.o
 cc1objplus-checksum.c : build/genchecksum$(build_exeext) checksum-options \
$(OBJCXX_OBJS) $(BACKEND) $(CODYLIB) $(LIBDEPS)
if [ -f ../stage_final ] \
-  && cmp -s ../stage_current ../stage_final; then \
+  && cmp -s ../stage_current ../stage_final \
+  && [ -f ../prev-gcc/$@ ]; then \
  cp ../prev-gcc/$@ $@; \
else \
  build/genchecksum$(build_exeext) $(OBJCXX_OBJS) $(BACKEND) \
-- 
1.9.1


Thanks
Bernd.


[Bug c++/100976] [C++23] Make constexpr reference temp constexpr

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100976

--- Comment #2 from Jason Merrill  ---
Or rather,

int main()
{
  constexpr const int &r = 42;
  static_assert(r == 42); // { dg-bogus "" }
}

[expr.const]/4.7 says that "a temporary object of non-volatile const-qualified
literal type whose lifetime is extended to that of a variable that is usable in
constant expressions" is usable in a constant expression.

Re: [PATCH 02/57] Support scanning of build-time GC roots in gengtype

2021-06-08 Thread Bill Schmidt via Gcc-patches

On 6/7/21 12:48 PM, Bill Schmidt wrote:

On 6/7/21 12:45 PM, Richard Biener wrote:
On Mon, Jun 7, 2021 at 5:38 PM Bill Schmidt  
wrote:

On 6/7/21 8:36 AM, Richard Biener wrote:

Some maybe obvious issue - what about DOS-style path hosts?
You seem to build ../ strings to point to parent dirs... I'm not sure
what we do elsewhere - I suppose we arrange for appropriate
-I command line arguments?


Well, actually it's just using "./" to identify the build directory,
though I see what you mean about potential Linux bias. There is
precedent for this syntax identifying the build directory in config.gcc
for target macro files:

#  tm_file  A list of target macro files, if different from
#   "$cpu_type/$cpu_type.h". Usually it's constructed
#   per target in a way like this:
#   tm_file="${tm_file} dbxelf.h elfos.h ${cpu_type.h}/elf.h"
#   Note that the preferred order is:
#   - specific target header "${cpu_type}/${cpu_type.h}"
#   - generic headers like dbxelf.h elfos.h, etc.
#   - specializing target headers like ${cpu_type.h}/elf.h
#   This helps to keep OS specific stuff out of the CPU
#   defining header ${cpu_type}/${cpu_type.h}.
#
#   It is possible to include automatically-generated
#   build-directory files by prefixing them with "./".
#   All other files should relative to $srcdir/config.


...so I thought I would try to be consistent with this change. In patch
0025 I use this as follows:

--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -491,6 +491,7 @@ powerpc*-*-*)
  extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
  target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.c \$(srcdir)/config/rs6000/rs6000-call.c"
  target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.c"
+   target_gtfiles="$target_gtfiles ./rs6000-builtins.h"
;;
   pru-*-*)
cpu_type=pru

I'm open to trying to do something different if you think that's
appropriate.

Well, I'm not sure whether/how to resolve this.  You could try
building a cross to powerpc-linux from a x86_64-mingw host ...
maybe there's one on the CF?  Or some of your fellow RedHat
people have access to mingw or the like envs to try whether it
just works with your change ...

Otherwise it looks OK.


I'll see what I can find.  Thanks again for reviewing the patch!



Hm.  Ultimately, I think the cross compiler case is doomed unless mingw 
already handles converting forward slashes to back slashes. There's no 
single syntax that works on both Windows and Linux. (There's no mingw 
server in the compile farm to play with.)


I'm inclined to accept both "./" and ".\" for native builds, and kick 
the can down the road beyond that.  What do you think?
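
A minimal sketch of what accepting both prefixes could look like, assuming a helper along these lines. The function name is_build_dir_relative is hypothetical and this is not the gengtype code; it only shows the prefix test being discussed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if NAME starts with "./" or ".\", i.e. if it would be
   treated as a build-directory file under the proposal above.  */
static bool
is_build_dir_relative (const char *name)
{
  assert (name != NULL);
  return (strncmp (name, "./", 2) == 0
          || strncmp (name, ".\\", 2) == 0);
}
```

With this check, "./rs6000-builtins.h" and ".\rs6000-builtins.h" are both recognized, while "../foo.h" and plain source-relative names are not.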


Bill



Bill




Richard.


Thanks for your help with this!

Bill



[Bug target/100973] gcc does not optimise based on knowing that `_mm256_movemask_ps` returns less than 255

2021-06-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100973

Andrew Pinski  changed:

   What|Removed |Added

 Target||x86_64-linux-gnu
   Keywords||missed-optimization
   Last reconfirmed||2021-06-08
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
This is a target/tree-optimization.  Basically Tree level optimization has no
idea what the builtin does and there is no target hook to query the back-end
for ranges:
  _3 = __builtin_ia32_movmskps256D.2066 (values_2(D));

Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-08 Thread Segher Boessenkool
Hi!

On Fri, Jun 04, 2021 at 09:40:58AM +0800, Xionghu Luo wrote:
> >> Combine still fail to merge the two instructions:
> >>
> >> Trying 6 -> 7:
> >>  6: r120:KF#0=r125:KF#0<-<0x40
> >>REG_DEAD r125:KF
> >>  7: [sfp:DI+r123:DI]=r120:KF#0<-<0x40
> >>REG_DEAD r120:KF
> >> Successfully matched this instruction:
> >> (set (mem/c:V1TI (plus:DI (reg/f:DI 110 sfp)
> >>  (reg:DI 123)) [1  S16 A128])
> >>  (subreg:V1TI (reg:KF 125) 0))
> >> rejecting combination of insns 6 and 7
> >> original costs 4 + 4 = 8
> >> replacement cost 12
> > 
> > So what instructions were these?  Why did the store cost 4 but the new
> > one costs 12?

The *vsx_le_perm_store_ instruction has the *preferred*
alternative with cost 12, while the other alternative has cost 8.  Why
is that?  That looks like a bug.
   (set_attr "length" "12,8")

> >> By hacking the vsx_le_perm_store_v1ti INSN_COST from 12 to 8,
> > 
> > It should be the same cost as the other store!
> 
> vsx_le_permute_v1ti's cost is defined to 4 in vsx.md:

Yes.  Why is alternative 0 of *vsx_le_perm_store_ said to have a
length of 3 insns?


Segher


[Bug testsuite/100407] New test cases attr-retain-*.c fail after their introduction in r11-7284

2021-06-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100407

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #12 from Segher Boessenkool  ---
(In reply to H.J. Lu from comment #10)
> > unused_rodata:
> > .section.sdata.used_rodata,"awR"

This is symbol *un*used_rodata.

> used_rodata is in a writable section.  Is this intentional? -m64 generates

Does -mno-readonly-in-sdata help?  Does -msdata=none help?

[Bug fortran/100950] ICE in output_constructor_regular_field, at varasm.c:5514

2021-06-08 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100950

--- Comment #8 from anlauf at gcc dot gnu.org ---
Created attachment 50967
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50967&action=edit
Tentative fix

This patch would fix the testcase.  It is inspired by code in primary.c,
match_string_constant.  Not regtested.

[Bug analyzer/99212] [11 Regression] gcc.dg/analyzer/data-model-1.c line 971

2021-06-08 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99212

David Malcolm  changed:

   What|Removed |Added

Summary|[11/12 Regression]  |[11 Regression]
   |gcc.dg/analyzer/data-model- |gcc.dg/analyzer/data-model-
   |1.c line 971|1.c line 971

--- Comment #16 from David Malcolm  ---
Should be fixed on trunk (for gcc 12) by the above commit

[Bug libstdc++/100940] views::take and views::drop should not define _S_has_simple_extra_args

2021-06-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100940

Patrick Palka  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
 Status|NEW |ASSIGNED

[PATCH v2 2/2] rs6000: Add test for _mm_minpos_epu16

2021-06-08 Thread Paul A. Clarke via Gcc-patches
Copy the test for _mm_minpos_epu16 from
gcc/testsuite/gcc.target/i386/sse4_1-phminposuw.c, with
a few adjustments:

- Adjust the dejagnu directives for powerpc platform.
- Make the data not be monotonically increasing,
  such that some of the returned values are not
  always the first value (index 0).
- Create a list of input data testing various scenarios
  including more than one minimum value and different
  orders and indices of the minimum value.
- Fix a masking issue where the index was being truncated
  to 2 bits instead of 3 bits, which wasn't found because
  all of the returned indices were 0 with the original
  generated data.
- Support big-endian.
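
To illustrate the masking issue mentioned in the list above, here is a scalar C sketch of the result layout (minimum in bits [15:0], index in bits [18:16]). The names model_minpos_u16 and model_minpos_index are made up for this example; it is a model of the behaviour, not the intrinsic itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the minimum of the eight 16-bit values in bits [15:0] and the
   index of its first occurrence in bits [18:16].  */
static uint32_t
model_minpos_u16 (const uint16_t v[8])
{
  assert (v != NULL);
  unsigned int idx = 0;
  for (unsigned int i = 1; i < 8; i++)
    if (v[i] < v[idx])          /* Strict '<' keeps the first minimum.  */
      idx = i;
  return (uint32_t) v[idx] | ((uint32_t) idx << 16);
}

/* Extract the index; eight positions need a 3-bit mask.  */
static unsigned int
model_minpos_index (uint32_t packed)
{
  return (packed >> 16) & 0x7u;
}
```

For the descending input { 8, 7, ..., 1 } the minimum sits at index 7; a 2-bit mask would report 3 instead, which is the kind of error that data with the minimum always at index 0 cannot expose.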

2021-06-08  Paul A. Clarke  

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/sse4_1-phminposuw.c: Copy from
gcc/testsuite/gcc.target/i386, make more robust.
---
 .../gcc.target/powerpc/sse4_1-phminposuw.c| 68 +++
 1 file changed, 68 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c

diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c b/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c
new file mode 100644
index ..3bb5a2dfe4f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c
@@ -0,0 +1,68 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#define NO_WARN_X86_INTRINSICS 1
+#ifndef CHECK_H
+#define CHECK_H "sse4_1-check.h"
+#endif
+
+#ifndef TEST
+#define TEST sse4_1_test
+#endif
+
+#include CHECK_H
+
+#include 
+
+#define DIM(a) (sizeof (a) / sizeof ((a)[0]))
+
+static void
+TEST (void)
+{
+  union
+{
+  __m128i x;
+  unsigned short s[8];
+} src[] =
+{
+  { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x 
} },
+  { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x 
} },
+  { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x 
} },
+  { .s = { 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008 
} },
+  { .s = { 0x0008, 0x0007, 0x0006, 0x0005, 0x0004, 0x0003, 0x0002, 0x0001 
} },
+  { .s = { 0xfff4, 0xfff3, 0xfff2, 0xfff1, 0xfff3, 0xfff1, 0xfff2, 0xfff3 
} }
+};
+  unsigned short minVal[DIM (src)];
+  int minInd[DIM (src)];
+  unsigned short minValScalar, minIndScalar;
+  int i, j;
+  union
+{
+  int i;
+  unsigned short s[2];
+} res;
+
+  for (i = 0; i < DIM (src); i++)
+{
+  res.i = _mm_cvtsi128_si32 (_mm_minpos_epu16 (src[i].x));
+  minVal[i] = res.s[0];
+  minInd[i] = res.s[1] & 0b111;
+}
+
+  for (i = 0; i < DIM (src); i++)
+{
+  minValScalar = src[i].s[0];
+  minIndScalar = 0;
+
+  for (j = 1; j < 8; j++)
+   if (minValScalar > src[i].s[j])
+ {
+   minValScalar = src[i].s[j];
+   minIndScalar = j;
+ }
+
+  if (minValScalar != minVal[i] && minIndScalar != minInd[i])
+   abort ();
+}
+}
-- 
2.27.0



[PATCH v2 1/2] rs6000: Add support for _mm_minpos_epu16

2021-06-08 Thread Paul A. Clarke via Gcc-patches
Add a naive implementation of the subject x86 intrinsic to
ease porting.

2021-06-08  Paul A. Clarke  

gcc/ChangeLog:
* config/rs6000/smmintrin.h (_mm_minpos_epu16): New.
---
 gcc/config/rs6000/smmintrin.h | 25 +
 1 file changed, 25 insertions(+)

diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index bdf6eb365d88..b7de38763f2b 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -116,4 +116,29 @@ _mm_blendv_epi8 (__m128i __A, __m128i __B, __m128i __mask)
   return (__m128i) vec_sel ((__v16qu) __A, (__v16qu) __B, __lmask);
 }
 
+/* Return horizontal packed word minimum and its index in bits [15:0]
+   and bits [18:16] respectively.  */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_minpos_epu16 (__m128i __A)
+{
+  union __u
+{
+  __m128i __m;
+  __v8hu __uh;
+};
+  union __u __u = { .__m = __A }, __r = { .__m = {0} };
+  unsigned short __ridx = 0;
+  unsigned short __rmin = __u.__uh[__ridx];
+  for (unsigned long __i = __ridx + 1; __i < 8; __i++)
+{
+  if (__u.__uh[__i] < __rmin)
+{
+  __rmin = __u.__uh[__i];
+  __ridx = __i;
+}
+}
+  __r.__uh[0] = __rmin;
+  __r.__uh[1] = __ridx;
+  return __r.__m;
+}
 #endif
-- 
2.27.0



[PATCH v2 0/2] rs6000: Add support for _mm_minpos_epu16

2021-06-08 Thread Paul A. Clarke via Gcc-patches
Added compatible implementation of _mm_minpos_epu16 for powerpc.
Copied, improved, and fixed testcase from i386.
Tested on BE, LE (32 and 64bit).

Paul A. Clarke (2):
  rs6000: Add support for _mm_minpos_epu16
  rs6000: Add test for _mm_minpos_epu16

 gcc/config/rs6000/smmintrin.h | 25 +++
 .../gcc.target/powerpc/sse4_1-phminposuw.c| 68 +++
 2 files changed, 93 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c

-- 
2.27.0



[PATCH] PR middle-end/53267: Constant fold BUILT_IN_FMOD.

2021-06-08 Thread Roger Sayle

Here's a three line patch to implement constant folding for fmod,
fmodf and fmodl, which resolves an enhancement request from 2012.
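
For reference, the values such folding is expected to produce can be sanity-checked with a naive remainder. This is only a sketch: naive_fmod is a hypothetical helper that uses truncating division, it is not how the compiler or MPFR computes the result, and it is only meaningful when the quotient fits in a long long.

```c
#include <assert.h>

/* Naive remainder of X / Y with the quotient truncated toward zero, so
   the result carries the sign of X.  Only valid for small magnitudes.  */
static double
naive_fmod (double x, double y)
{
  assert (y != 0.0);
  long long q = (long long) (x / y);    /* Truncates toward zero.  */
  return x - (double) q * y;
}
```

For 6.5 and 2.3 this gives roughly 1.9, and the sign of the result follows the first argument regardless of the sign of the second, which matches the ranges the new test checks.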

The following patch has been tested on x86_64-pc-linux-gnu with
a make bootstrap and make -k check with no new failures.

Ok for mainline?


2021-06-08  Roger Sayle

gcc/ChangeLog
PR middle-end/53267
* fold-const-call.c (fold_const_call_sss) [CASE_CFN_FMOD]:
Support evaluation of fmod/fmodf/fmodl at compile-time.

gcc/testsuite/ChangeLog
* gcc.dg/builtins-70.c: New test.


Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c
index a1d70b6..d6cb9b1 100644
--- a/gcc/fold-const-call.c
+++ b/gcc/fold-const-call.c
@@ -1375,6 +1375,9 @@ fold_const_call_sss (real_value *result, combined_fn fn,
 CASE_CFN_FDIM:
   return do_mpfr_arg2 (result, mpfr_dim, arg0, arg1, format);
 
+CASE_CFN_FMOD:
+  return do_mpfr_arg2 (result, mpfr_fmod, arg0, arg1, format);
+
 CASE_CFN_HYPOT:
   return do_mpfr_arg2 (result, mpfr_hypot, arg0, arg1, format);
 
/* Copyright (C) 2021 Free Software Foundation.

   Check that constant folding of built-in fmod functions doesn't
   break anything and produces the expected results.  */

/* { dg-do link } */
/* { dg-options "-O2 -ffast-math" } */

extern void link_error(void);

extern double fmod(double,double);
extern float fmodf(float,float);
extern long double fmodl(long double,long double);

int main()
{
  if (fmod (6.5, 2.3) < 1.8999 || fmod (6.5, 2.3) > 1.9001)
    link_error ();
  if (fmod (-6.5, 2.3) < -1.9001 || fmod (-6.5, 2.3) > -1.8999)
    link_error ();
  if (fmod (6.5, -2.3) < 1.8999 || fmod (6.5, -2.3) > 1.9001)
    link_error ();
  if (fmod (-6.5, -2.3) < -1.9001 || fmod (-6.5, -2.3) > -1.8999)
    link_error ();

  if (fmodf (6.5f, 2.3f) < 1.8999f || fmodf (6.5f, 2.3f) > 1.9001f)
    link_error ();
  if (fmodf (-6.5f, 2.3f) < -1.9001f || fmodf (-6.5f, 2.3f) > -1.8999f)
    link_error ();
  if (fmodf (6.5f, -2.3f) < 1.8999f || fmodf (6.5f, -2.3f) > 1.9001f)
    link_error ();
  if (fmodf (-6.5f, -2.3f) < -1.9001f || fmodf (-6.5f, -2.3f) > -1.8999f)
    link_error ();

  if (fmodl (6.5l, 2.3l) < 1.8999l || fmodl (6.5l, 2.3l) > 1.9001l)
    link_error ();
  if (fmodl (-6.5l, 2.3l) < -1.9001l || fmodl (-6.5l, 2.3l) > -1.8999l)
    link_error ();
  if (fmodl (6.5l, -2.3l) < 1.8999l || fmodl (6.5l, -2.3l) > 1.9001l)
    link_error ();
  if (fmodl (-6.5l, -2.3l) < -1.9001l || fmodl (-6.5l, -2.3l) > -1.8999l)
    link_error ();

  return 0;
}
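
For readers less familiar with the semantics being folded, here is a minimal sketch (not part of the patch) of the truncated-remainder definition that fmod follows: the result is x - n*y with n = x/y rounded toward zero, so it carries the sign of x.  The helper name trunc_remainder is hypothetical, and the cast through long long only models the case where the quotient is small; the actual folding goes through mpfr_fmod as shown in the diff.

```cpp
// Illustrative only: models fmod for small quotients; this is not GCC code.
constexpr double trunc_remainder (double x, double y)
{
  // Converting to long long truncates toward zero, matching fmod's choice of n.
  return x - static_cast<double> (static_cast<long long> (x / y)) * y;
}

// Same tolerance windows as the checks in the test above.
static_assert (trunc_remainder (6.5, 2.3) > 1.8999
               && trunc_remainder (6.5, 2.3) < 1.9001, "positive x");
static_assert (trunc_remainder (-6.5, 2.3) > -1.9001
               && trunc_remainder (-6.5, 2.3) < -1.8999, "negative x");
static_assert (trunc_remainder (6.5, -2.3) > 1.8999
               && trunc_remainder (6.5, -2.3) < 1.9001, "sign of y is ignored");
```

The four sign combinations in the test follow from this: only the sign of the first argument affects the sign of the result.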



[Bug libstdc++/100940] views::take and views::drop should not define _S_has_simple_extra_args

2021-06-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100940

--- Comment #5 from Patrick Palka  ---
(In reply to TC from comment #4)
> (In reply to Patrick Palka from comment #3)
> > Good point, confirmed.  Though I'm not sure if perfect forwarding here is
> > strictly necessary to fix this testcase.  Perhaps the
> > _S_has_simple_extra_args versions of _Partial should be forwarding the bound
> > arguments as prvalues instead of as const lvalues?
> 
> It's pretty easy to come up with counterexamples that don't work (for
> example, the type might be move-only).
> 
> It may be better to limit the "simple" case for take/drop to when the
> argument type is integer-like; that's like 99% of uses anyway. Contrived
> examples get the perfect forwarding fun but that's fine.
> 
> Similarly, it might be a good idea to restrict the "simple" case for the
> other adaptors a bit - perhaps to the case where the predicate is trivially
> copyable, which should still give good diagnostic for a lot of uses, but
> avoids a performance hit if the function object at issue is
> like...std::function.

That makes sense to me.  Implementation-wise, I guess this would mean
parameterizing the _S_has_simple_extra_args flag by the actual types of the
extra arguments.  And I suppose we could also use this to declare some partial
applications of split to be simple, e.g. when the pattern argument is a scalar
or a view, and get good diagnostics for split in these cases.
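
A rough sketch of what such a parameterization could look like, assuming C++17.  Everything here is hypothetical: take_like, filter_like and has_simple_extra_args are stand-ins, not the libstdc++ names, and std::is_integral is only an approximation of "integer-like" in the ranges sense.

```cpp
#include <functional>
#include <type_traits>

// Hypothetical stand-in for an adaptor such as views::take or views::drop.
struct take_like
{
  // "Simple" only when every extra argument is an integral type.
  template<typename... Args>
    static constexpr bool has_simple_extra_args
      = (std::is_integral_v<std::remove_cv_t<std::remove_reference_t<Args>>>
         && ...);
};

// Hypothetical stand-in for an adaptor that takes a predicate.
struct filter_like
{
  // "Simple" only when the predicate is trivially copyable.
  template<typename Pred>
    static constexpr bool has_simple_extra_args
      = std::is_trivially_copyable_v<
          std::remove_cv_t<std::remove_reference_t<Pred>>>;
};

// A small stateless function object used as an example predicate.
struct is_even
{
  constexpr bool operator() (int i) const { return i % 2 == 0; }
};

static_assert (take_like::has_simple_extra_args<int>);
static_assert (filter_like::has_simple_extra_args<is_even>);
static_assert (!filter_like::has_simple_extra_args<std::function<bool (int)>>);
```

With a flag of this shape, the simple path would be chosen per call rather than per adaptor, so a std::function predicate would fall back to perfect forwarding.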

[committed] analyzer: bitfield fixes [PR99212]

2021-06-08 Thread David Malcolm via Gcc-patches
This patch verifies the previous fix for bitfield sizes by implementing
enough support for bitfields in the analyzer to get the test cases to pass.

The patch implements support in the analyzer for reading from a
BIT_FIELD_REF, and support for folding BIT_AND_EXPR of a mask, to handle
the cases generated in tests.

The existing bitfields tests in data-model-1.c turned out to rely on
undefined behavior, in that they were assigning values to a signed
bitfield that were outside of the valid range of values.  I believe that
that's why we were seeing target-specific differences in the test
results (PR analyzer/99212).  The patch updates the test to remove the
undefined behaviors.
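
To make the "valid range of values" concrete, here is a small sketch (not analyzer code) of the range of an N-bit signed bitfield, assuming two's complement and 1 <= N <= 63.  The helper names are hypothetical.

```cpp
// Illustrative helpers only; assumes two's complement and 1 <= bits <= 63.
constexpr long long signed_bitfield_min (unsigned bits)
{
  return -(1LL << (bits - 1));
}

constexpr long long signed_bitfield_max (unsigned bits)
{
  return (1LL << (bits - 1)) - 1;
}

constexpr bool fits_signed_bitfield (long long value, unsigned bits)
{
  return value >= signed_bitfield_min (bits)
         && value <= signed_bitfield_max (bits);
}

// A 3-bit signed field holds [-4, 3].
static_assert (fits_signed_bitfield (3, 3), "3 fits in 3 signed bits");
static_assert (!fits_signed_bitfield (5, 3), "5 does not fit in 3 signed bits");
```

Test values chosen inside this range give the same result regardless of how a target treats out-of-range stores.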

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Lightly tested with cris-elf.

Pushed to trunk as r12-1303-gd3b1ef7a83c0c0cd5b20a1dd1714b868f3d2b442.

gcc/analyzer/ChangeLog:
PR analyzer/99212
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Add support for folding
BIT_AND_EXPR of compound_svalue and a mask constant.
* region-model.cc (region_model::get_rvalue_1): Implement
BIT_FIELD_REF in terms of...
(region_model::get_rvalue_for_bits): New function.
* region-model.h (region_model::get_rvalue_for_bits): New decl.
* store.cc (bit_range::from_mask): New function.
(selftest::test_bit_range_intersects_p): New selftest.
(selftest::assert_bit_range_from_mask_eq): New.
(ASSERT_BIT_RANGE_FROM_MASK_EQ): New macro.
(selftest::assert_no_bit_range_from_mask_eq): New.
(ASSERT_NO_BIT_RANGE_FROM_MASK): New macro.
(selftest::test_bit_range_from_mask): New selftest.
(selftest::analyzer_store_cc_tests): Call the new selftests.
* store.h (bit_range::intersects_p): New.
(bit_range::from_mask): New decl.
(concrete_binding::get_bit_range): New accessor.
(store_manager::get_concrete_binding): New overload taking
const bit_range &.

gcc/testsuite/ChangeLog:
PR analyzer/99212
* gcc.dg/analyzer/bitfields-1.c: New test.
* gcc.dg/analyzer/data-model-1.c (struct sbits): Make bitfields
explicitly signed.
(test_44): Update test values assigned to the bits to ones that
fit in the range of the bitfield type.  Remove xfails.
(test_45): Remove xfails.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-manager.cc |  46 -
 gcc/analyzer/region-model.cc |  65 ++-
 gcc/analyzer/region-model.h  |   4 +
 gcc/analyzer/store.cc| 186 +++
 gcc/analyzer/store.h |  18 ++
 gcc/testsuite/gcc.dg/analyzer/bitfields-1.c  | 144 ++
 gcc/testsuite/gcc.dg/analyzer/data-model-1.c |  30 +--
 7 files changed, 469 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/bitfields-1.c

diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
index dfd2413e914..0ca0c8ad02e 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -480,9 +480,49 @@ region_model_manager::maybe_fold_binop (tree type, enum tree_code op,
   break;
 case BIT_AND_EXPR:
   if (cst1)
-   if (zerop (cst1) && INTEGRAL_TYPE_P (type))
- /* "(ARG0 & 0)" -> "0".  */
- return get_or_create_constant_svalue (build_int_cst (type, 0));
+   {
+ if (zerop (cst1) && INTEGRAL_TYPE_P (type))
+   /* "(ARG0 & 0)" -> "0".  */
+   return get_or_create_constant_svalue (build_int_cst (type, 0));
+
+ /* Support masking out bits from a compound_svalue, as this
+is generated when accessing bitfields.  */
+ if (const compound_svalue *compound_sval
+   = arg0->dyn_cast_compound_svalue ())
+   {
+ const binding_map &map = compound_sval->get_map ();
+ unsigned HOST_WIDE_INT mask = TREE_INT_CST_LOW (cst1);
+ /* If "mask" is a contiguous range of set bits, see if the
+compound_sval has a value for those bits.  */
+ bit_range bits (0, 0);
+ if (bit_range::from_mask (mask, &bits))
+   {
+ const concrete_binding *conc
+   = get_store_manager ()->get_concrete_binding (bits,
+ BK_direct);
+ if (const svalue *sval = map.get (conc))
+   {
+ /* We have a value;
+shift it by the correct number of bits.  */
+ const svalue *lhs = get_or_create_cast (type, sval);
+ HOST_WIDE_INT bit_offset
+   = bits.get_start_bit_offset ().to_shwi ();
+ tree shift_amt = build_int_cst (type, bit_offset);
+ const svalue *shift_sval
+ 

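The "contiguous range of set bits" test mentioned in the hunk above can be sketched as follows.  This is an assumption-laden illustration, not the body of bit_range::from_mask (which is not visible in this truncated diff); bit_run and run_from_mask are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical result type: start bit, number of bits, and a validity flag.
struct bit_run
{
  unsigned start;
  unsigned size;
  bool ok;
};

// If MASK is a single contiguous run of set bits, report where the run
// starts and how long it is; otherwise report ok == false.
constexpr bit_run run_from_mask (std::uint64_t mask)
{
  if (mask == 0)
    return {0, 0, false};

  // Skip the clear bits below the run.
  unsigned start = 0;
  while ((mask & 1) == 0)
    {
      mask >>= 1;
      ++start;
    }

  // Count the set bits of the run.
  unsigned size = 0;
  while ((mask & 1) != 0)
    {
      mask >>= 1;
      ++size;
    }

  // Any set bit left over means there was a gap in the mask.
  if (mask != 0)
    return {0, 0, false};
  return {start, size, true};
}

static_assert (run_from_mask (0xF0).ok, "0xF0 is one run");
static_assert (run_from_mask (0xF0).start == 4, "run starts at bit 4");
static_assert (run_from_mask (0xF0).size == 4, "run is 4 bits long");
static_assert (!run_from_mask (0x5).ok, "0x5 has a gap");
```

A mask accepted by such a check corresponds to one (start, size) pair, which is what a lookup of a concrete binding by bit range needs.
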
[committed] analyzer: fix region::get_bit_size for bitfields

2021-06-08 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as c957d38044d7eb6a45f57a8a9f707c3c0a798e9f.

gcc/analyzer/ChangeLog:
* analyzer.h (int_size_in_bits): New decl.
* region.cc (int_size_in_bits): New function.
(region::get_bit_size): Reimplement in terms of the above.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/analyzer.h |  2 ++
 gcc/analyzer/region.cc  | 33 +
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index fb568e44d38..525eb06c3b5 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -144,6 +144,8 @@ typedef offset_int bit_offset_t;
 typedef offset_int bit_size_t;
 typedef offset_int byte_size_t;
 
+extern bool int_size_in_bits (const_tree type, bit_size_t *out);
+
 /* The location of a region expressesd as an offset relative to a
base region.  */
 
diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index 6db1fc91afd..5f246df7dfb 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -208,6 +208,29 @@ region::get_byte_size (byte_size_t *out) const
   return true;
 }
 
+/* If the size of TYPE (in bits) is constant, write it to *OUT
+   and return true.
+   Otherwise return false.  */
+
+bool
+int_size_in_bits (const_tree type, bit_size_t *out)
+{
+  if (INTEGRAL_TYPE_P (type))
+{
+  *out = TYPE_PRECISION (type);
+  return true;
+}
+
+  tree sz = TYPE_SIZE (type);
+  if (sz && tree_fits_uhwi_p (sz))
+{
+  *out = TREE_INT_CST_LOW (sz);
+  return true;
+}
+  else
+return false;
+}
+
 /* If the size of this region (in bits) is known statically, write it to *OUT
and return true.
Otherwise return false.  */
@@ -215,11 +238,13 @@ region::get_byte_size (byte_size_t *out) const
 bool
 region::get_bit_size (bit_size_t *out) const
 {
-  byte_size_t byte_size;
-  if (!get_byte_size (&byte_size))
+  tree type = get_type ();
+
+  /* Bail out e.g. for heap-allocated regions.  */
+  if (!type)
 return false;
-  *out = byte_size * BITS_PER_UNIT;
-  return true;
+
+  return int_size_in_bits (type, out);
 }
 
 /* Get the field within RECORD_TYPE at BIT_OFFSET.  */
-- 
2.26.3
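
The difference between the old and new behaviour can be modelled with a toy type descriptor.  This is a sketch under the assumption that a bitfield's type carries its declared width as its precision; toy_type and its fields are hypothetical and are not GCC's tree API.

```cpp
// Toy model of a type; the field names are hypothetical.
struct toy_type
{
  bool integral;           // stands in for INTEGRAL_TYPE_P
  unsigned precision;      // stands in for TYPE_PRECISION
  long long size_in_bits;  // stands in for TYPE_SIZE; negative: not constant
};

// Return the size in bits, or -1 when it is not known statically.
constexpr long long toy_size_in_bits (toy_type t)
{
  // Integral types report their precision, so a bitfield reports its width
  // rather than the size of its storage unit.
  if (t.integral)
    return t.precision;
  return t.size_in_bits >= 0 ? t.size_in_bits : -1;
}

// A 3-bit field stored in a 32-bit unit reports 3, not 32.
static_assert (toy_size_in_bits (toy_type {true, 3, 32}) == 3, "");
// A non-integral type with a constant size reports that size.
static_assert (toy_size_in_bits (toy_type {false, 0, 128}) == 128, "");
```

A computation based on byte size times BITS_PER_UNIT, as in the removed lines, cannot express a width that is not a multiple of a byte.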



[committed] analyzer: split out struct bit_range from class concrete_binding

2021-06-08 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as 6b400aef1bdc84bbdf5011caff3fe5f82c68d253.

gcc/analyzer/ChangeLog:
* store.cc (concrete_binding::dump_to_pp): Move bulk of
implementation to...
(bit_range::dump_to_pp): ...this new function.
(bit_range::cmp): New.
(concrete_binding::overlaps_p): Update for use of bit_range.
(concrete_binding::cmp_ptr_ptr): Likewise.
* store.h (struct bit_range): New.
(class concrete_binding): Replace fields m_start_bit_offset and
m_size_in_bits with new field m_bit_range.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/store.cc | 38 +++
 gcc/analyzer/store.h  | 61 +++
 2 files changed, 77 insertions(+), 22 deletions(-)

diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index b1874a5a2d3..f4bb7def781 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -236,15 +236,12 @@ binding_key::cmp (const binding_key *k1, const binding_key *k2)
 }
 }
 
-/* class concrete_binding : public binding_key.  */
-
-/* Implementation of binding_key::dump_to_pp vfunc for concrete_binding.  */
+/* struct bit_range.  */
 
 void
-concrete_binding::dump_to_pp (pretty_printer *pp, bool simple) const
+bit_range::dump_to_pp (pretty_printer *pp) const
 {
-  binding_key::dump_to_pp (pp, simple);
-  pp_string (pp, ", start: ");
+  pp_string (pp, "start: ");
   pp_wide_int (pp, m_start_bit_offset, SIGNED);
   pp_string (pp, ", size: ");
   pp_wide_int (pp, m_size_in_bits, SIGNED);
@@ -252,12 +249,34 @@ concrete_binding::dump_to_pp (pretty_printer *pp, bool simple) const
   pp_wide_int (pp, get_next_bit_offset (), SIGNED);
 }
 
+int
+bit_range::cmp (const bit_range &br1, const bit_range &br2)
+{
+  if (int start_cmp = wi::cmps (br1.m_start_bit_offset,
+   br2.m_start_bit_offset))
+return start_cmp;
+
+  return wi::cmpu (br1.m_size_in_bits, br2.m_size_in_bits);
+}
+
+/* class concrete_binding : public binding_key.  */
+
+/* Implementation of binding_key::dump_to_pp vfunc for concrete_binding.  */
+
+void
+concrete_binding::dump_to_pp (pretty_printer *pp, bool simple) const
+{
+  binding_key::dump_to_pp (pp, simple);
+  pp_string (pp, ", ");
+  m_bit_range.dump_to_pp (pp);
+}
+
 /* Return true if this binding overlaps with OTHER.  */
 
 bool
 concrete_binding::overlaps_p (const concrete_binding &other) const
 {
-  if (m_start_bit_offset < other.get_next_bit_offset ()
+  if (get_start_bit_offset () < other.get_next_bit_offset ()
   && get_next_bit_offset () > other.get_start_bit_offset ())
 return true;
   return false;
@@ -274,10 +293,7 @@ concrete_binding::cmp_ptr_ptr (const void *p1, const void *p2)
   if (int kind_cmp = b1->get_kind () - b2->get_kind ())
 return kind_cmp;
 
-  if (int start_cmp = wi::cmps (b1->m_start_bit_offset, b2->m_start_bit_offset))
-return start_cmp;
-
-  return wi::cmpu (b1->m_size_in_bits, b2->m_size_in_bits);
+  return bit_range::cmp (b1->m_bit_range, b2->m_bit_range);
 }
 
 /* class symbolic_binding : public binding_key.  */
diff --git a/gcc/analyzer/store.h b/gcc/analyzer/store.h
index d68513ca94c..be09b427366 100644
--- a/gcc/analyzer/store.h
+++ b/gcc/analyzer/store.h
@@ -267,6 +267,42 @@ private:
   enum binding_kind m_kind;
 };
 
+struct bit_range
+{
+  bit_range (bit_offset_t start_bit_offset, bit_size_t size_in_bits)
+  : m_start_bit_offset (start_bit_offset),
+m_size_in_bits (size_in_bits)
+  {}
+
+  void dump_to_pp (pretty_printer *pp) const;
+
+  bit_offset_t get_start_bit_offset () const
+  {
+return m_start_bit_offset;
+  }
+  bit_offset_t get_next_bit_offset () const
+  {
+return m_start_bit_offset + m_size_in_bits;
+  }
+
+  bool contains_p (bit_offset_t offset) const
+  {
+return (offset >= get_start_bit_offset ()
+   && offset < get_next_bit_offset ());
+  }
+
+  bool operator== (const bit_range &other) const
+  {
+return (m_start_bit_offset == other.m_start_bit_offset
+   && m_size_in_bits == other.m_size_in_bits);
+  }
+
+  static int cmp (const bit_range &br1, const bit_range &br2);
+
+  bit_offset_t m_start_bit_offset;
+  bit_size_t m_size_in_bits;
+};
+
 /* Concrete subclass of binding_key, for describing a concrete range of
bits within the binding_map (e.g. "bits 8-15").  */
 
@@ -279,24 +315,22 @@ public:
   concrete_binding (bit_offset_t start_bit_offset, bit_size_t size_in_bits,
enum binding_kind kind)
   : binding_key (kind),
-m_start_bit_offset (start_bit_offset),
-m_size_in_bits (size_in_bits)
+m_bit_range (start_bit_offset, size_in_bits)
   {}
   bool concrete_p () const FINAL OVERRIDE { return true; }
 
   hashval_t hash () const
   {
 inchash::hash hstate;
-hstate.add_wide_int (m_start_bit_offset);
-hstate.add_wide_int (m_size_in_bits);
+hstate.add_wide_int (m_bit_range.m_start_bit_offset);
+hstate.add_wide_int 

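The half-open interval logic used by contains_p and overlaps_p in the patch above can be exercised in isolation.  The following is a simplified sketch using plain long long instead of offset_int; toy_bit_range and overlaps are hypothetical names, not the analyzer's types.

```cpp
// Simplified stand-in for a range of bits [start, start + size).
struct toy_bit_range
{
  long long start;
  long long size;

  // One past the last bit covered by the range.
  constexpr long long next () const { return start + size; }

  constexpr bool contains (long long offset) const
  {
    return offset >= start && offset < next ();
  }
};

// Two half-open ranges overlap when each starts before the other ends.
constexpr bool overlaps (toy_bit_range a, toy_bit_range b)
{
  return a.start < b.next () && a.next () > b.start;
}

// Adjacent ranges (bits 0-7 and bits 8-15) do not overlap.
static_assert (!overlaps (toy_bit_range {0, 8}, toy_bit_range {8, 8}), "");
// Bits 0-7 and bits 4-11 share bits 4-7.
static_assert (overlaps (toy_bit_range {0, 8}, toy_bit_range {4, 8}), "");
```

Because the end offset is exclusive, adjacent bindings such as "bits 0-7" and "bits 8-15" are treated as disjoint.
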
[committed] analyzer: remove redundant typedef

2021-06-08 Thread David Malcolm via Gcc-patches
Delete an overzealous copy

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as 8c5a5404cb68e5e39e296849944019b93a591646.

gcc/analyzer/ChangeLog:
* svalue.h (conjured_svalue::iterator_t): Delete.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/svalue.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/analyzer/svalue.h b/gcc/analyzer/svalue.h
index 7fe0ba3a603..d9e34aa6b89 100644
--- a/gcc/analyzer/svalue.h
+++ b/gcc/analyzer/svalue.h
@@ -1073,8 +1073,6 @@ namespace ana {
 class conjured_svalue : public svalue
 {
 public:
-  typedef binding_map::iterator_t iterator_t;
-
   /* A support class for uniquifying instances of conjured_svalue.  */
   struct key_t
   {
-- 
2.26.3



[Bug analyzer/99212] [11/12 Regression] gcc.dg/analyzer/data-model-1.c line 971

2021-06-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99212

--- Comment #15 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:d3b1ef7a83c0c0cd5b20a1dd1714b868f3d2b442

commit r12-1303-gd3b1ef7a83c0c0cd5b20a1dd1714b868f3d2b442
Author: David Malcolm 
Date:   Tue Jun 8 14:45:57 2021 -0400

analyzer: bitfield fixes [PR99212]

This patch verifies the previous fix for bitfield sizes by implementing
enough support for bitfields in the analyzer to get the test cases to pass.

The patch implements support in the analyzer for reading from a
BIT_FIELD_REF, and support for folding BIT_AND_EXPR of a mask, to handle
the cases generated in tests.

The existing bitfields tests in data-model-1.c turned out to rely on
undefined behavior, in that they were assigning values to a signed
bitfield that were outside of the valid range of values.  I believe that
that's why we were seeing target-specific differences in the test
results (PR analyzer/99212).  The patch updates the test to remove the
undefined behaviors.

gcc/analyzer/ChangeLog:
PR analyzer/99212
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Add support for folding
BIT_AND_EXPR of compound_svalue and a mask constant.
* region-model.cc (region_model::get_rvalue_1): Implement
BIT_FIELD_REF in terms of...
(region_model::get_rvalue_for_bits): New function.
* region-model.h (region_model::get_rvalue_for_bits): New decl.
* store.cc (bit_range::from_mask): New function.
(selftest::test_bit_range_intersects_p): New selftest.
(selftest::assert_bit_range_from_mask_eq): New.
(ASSERT_BIT_RANGE_FROM_MASK_EQ): New macro.
(selftest::assert_no_bit_range_from_mask_eq): New.
(ASSERT_NO_BIT_RANGE_FROM_MASK): New macro.
(selftest::test_bit_range_from_mask): New selftest.
(selftest::analyzer_store_cc_tests): Call the new selftests.
* store.h (bit_range::intersects_p): New.
(bit_range::from_mask): New decl.
(concrete_binding::get_bit_range): New accessor.
(store_manager::get_concrete_binding): New overload taking
const bit_range &.

gcc/testsuite/ChangeLog:
PR analyzer/99212
* gcc.dg/analyzer/bitfields-1.c: New test.
* gcc.dg/analyzer/data-model-1.c (struct sbits): Make bitfields
explicitly signed.
(test_44): Update test values assigned to the bits to ones that
fit in the range of the bitfield type.  Remove xfails.
(test_45): Remove xfails.

Signed-off-by: David Malcolm 

[Bug c++/100976] [C++23] Make constexpr reference temp constexpr

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100976

--- Comment #1 from Jason Merrill  ---
  constexpr const int &r = 42;
  static_assert(r == 42);

[Bug c++/100975] [C++23] Allow pointer to array of auto

2021-06-08 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100975

--- Comment #1 from Jason Merrill  ---
  int a[3];
  auto (*p)[3] = &a;

[Bug rtl-optimization/100978] New: [10/11/12 Regression] ICE: qsort checking failed: qsort comparator non-negative on sorted output: 1 with -O3 -frename-registers -fno-sched-critical-path-heuristic -f

2021-06-08 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100978

Bug ID: 100978
   Summary: [10/11/12 Regression] ICE: qsort checking failed:
qsort comparator non-negative on sorted output: 1 with
-O3 -frename-registers
-fno-sched-critical-path-heuristic
-fsched2-use-superblocks
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu

Created attachment 50966
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50966&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O3 -frename-registers
-fno-sched-critical-path-heuristic -fsched2-use-superblocks testcase.c 
testcase.c: In function 'foo':
testcase.c:21:1: error: qsort comparator non-negative on sorted output: 1
   21 | }
  | ^
during RTL pass: sched2
testcase.c:21:1: internal compiler error: qsort checking failed
0xa1bf87 qsort_chk_error
/repo/gcc-trunk/gcc/vec.c:214
0xa1c093 qsort_chk(void*, unsigned long, unsigned long, int (*)(void const*,
void const*, void*), void*)
/repo/gcc-trunk/gcc/vec.c:256
0x1d79fb5 gcc_qsort(void*, unsigned long, unsigned long, int (*)(void const*,
void const*))
/repo/gcc-trunk/gcc/sort.cc:270
0x1bd83c0 ready_sort_real
/repo/gcc-trunk/gcc/haifa-sched.c:3095
0x1be09c5 ready_sort
/repo/gcc-trunk/gcc/haifa-sched.c:3111
0x1be09c5 schedule_block(basic_block_def**, void*)
/repo/gcc-trunk/gcc/haifa-sched.c:6709
0x1cb32ab schedule_ebb(rtx_insn*, rtx_insn*, bool)
/repo/gcc-trunk/gcc/sched-ebb.c:536
0x1cb39d2 schedule_ebbs()
/repo/gcc-trunk/gcc/sched-ebb.c:655
0x1015b2c rest_of_handle_sched2
/repo/gcc-trunk/gcc/sched-rgn.c:3740
0x1015b2c execute
/repo/gcc-trunk/gcc/sched-rgn.c:3878
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-1295-20210608150918-g7a56d3d3e99-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r12-1295-20210608150918-g7a56d3d3e99-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20210608 (experimental) (GCC)

[PATCH 54/55] rs6000: Test case adjustments

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-24  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust.
* gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Adjust.
* gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Adjust.
* gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Adjust.
* gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Adjust.
* gcc.target/powerpc/bfp/scalar-test-neg-2.c: Adjust.
* gcc.target/powerpc/bfp/scalar-test-neg-3.c: Adjust.
* gcc.target/powerpc/bfp/scalar-test-neg-5.c: Adjust.
* gcc.target/powerpc/byte-in-set-2.c: Adjust.
* gcc.target/powerpc/cmpb-2.c: Adjust.
* gcc.target/powerpc/cmpb32-2.c: Adjust.
* gcc.target/powerpc/crypto-builtin-2.c: Adjust.
* gcc.target/powerpc/fold-vec-splat-floatdouble.c: Adjust.
* gcc.target/powerpc/fold-vec-splat-longlong.c: Adjust.
* gcc.target/powerpc/fold-vec-splat-misc-invalid.c: Adjust.
* gcc.target/powerpc/p8vector-builtin-8.c: Adjust.
* gcc.target/powerpc/pr80315-1.c: Adjust.
* gcc.target/powerpc/pr80315-2.c: Adjust.
* gcc.target/powerpc/pr80315-3.c: Adjust.
* gcc.target/powerpc/pr80315-4.c: Adjust.
* gcc.target/powerpc/pr88100.c: Adjust.
* gcc.target/powerpc/pragma_misc9.c: Adjust.
* gcc.target/powerpc/pragma_power8.c: Adjust.
* gcc.target/powerpc/pragma_power9.c: Adjust.
* gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust.
* gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Adjust.
* gcc.target/powerpc/test_mffsl.c: Adjust.
* gcc.target/powerpc/vec-gnb-2.c: Adjust.
* gcc.target/powerpc/vsu/vec-all-nez-7.c: Adjust.
* gcc.target/powerpc/vsu/vec-any-eqz-7.c: Adjust.
* gcc.target/powerpc/vsu/vec-cmpnez-7.c: Adjust.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c: Adjust.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c: Adjust.
* gcc.target/powerpc/vsu/vec-xst-len-12.c: Adjust.
* gcc.target/powerpc/vsu/vec-xst-len-13.c: Adjust.
---
 .../gcc.target/powerpc/bfp/scalar-extract-exp-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-extract-sig-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-2.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-5.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-8.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-2.c |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-3.c |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-5.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb-2.c  |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb32-2.c|  2 +-
 .../gcc.target/powerpc/crypto-builtin-2.c  | 14 +++---
 .../powerpc/fold-vec-splat-floatdouble.c   |  4 ++--
 .../gcc.target/powerpc/fold-vec-splat-longlong.c   | 10 +++---
 .../powerpc/fold-vec-splat-misc-invalid.c  |  8 
 .../gcc.target/powerpc/p8vector-builtin-8.c|  1 +
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr88100.c | 12 ++--
 gcc/testsuite/gcc.target/powerpc/pragma_misc9.c|  2 +-
 gcc/testsuite/gcc.target/powerpc/pragma_power8.c   |  2 ++
 gcc/testsuite/gcc.target/powerpc/pragma_power9.c   |  3 +++
 .../powerpc/test_fpscr_drn_builtin_error.c |  4 ++--
 .../powerpc/test_fpscr_rn_builtin_error.c  | 12 ++--
 gcc/testsuite/gcc.target/powerpc/test_mffsl.c  |  3 ++-
 gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c   |  2 +-
 .../gcc.target/powerpc/vsu/vec-all-nez-7.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-any-eqz-7.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cmpnez-7.c  |  2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c  |  2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c  |  2 +-
 .../gcc.target/powerpc/vsu/vec-xl-len-13.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-xst-len-12.c|  2 +-
 35 files changed, 62 insertions(+), 59 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
index 922180675fc..53b67c95cf9 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
@@ -14,7 +14,7 @@ get_exponent (double *p)
 {
   double source = *p;
 
-  return scalar_extract_exp (source);  /* { dg-error "'__builtin_vec_scalar_extract_exp' is not supported in this compiler configuration" } */
+  return scalar_extract_exp (source);  /* { dg-error "'__builtin_vsx_scalar_extract_exp' requires the" } */
 }
 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c 

[PATCH 55/55] rs6000: Enable the new builtin support

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-05  Bill Schmidt  

gcc/
* config/rs6000/rs6000-gen-builtins.c (write_init_file):
Initialize new_builtins_are_live to 1.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index c3874e85592..8ca9fdc942a 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -2753,7 +2753,7 @@ write_init_file (void)
   fprintf (init_file, "#include \"rs6000-builtins.h\"\n");
   fprintf (init_file, "\n");
 
-  fprintf (init_file, "int new_builtins_are_live = 0;\n\n");
+  fprintf (init_file, "int new_builtins_are_live = 1;\n\n");
 
   fprintf (init_file, "tree rs6000_builtin_decls_x[RS6000_OVLD_MAX];\n\n");
 
-- 
2.27.0



[PATCH 53/55] rs6000: Update altivec.h for automated interfaces

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-04-01  Bill Schmidt  

gcc/
* config/rs6000/altivec.h: Delete a number of #defines that are
now superfluous; include rs6000-vecdefines.h; include some
synonyms.
---
 gcc/config/rs6000/altivec.h | 516 +++-
 1 file changed, 41 insertions(+), 475 deletions(-)

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 961621a0841..8daf933e53e 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -55,32 +55,36 @@
 #define __CR6_LT   2
 #define __CR6_LT_REV   3
 
-/* Synonyms.  */
+#include "rs6000-vecdefines.h"
+
+/* Deprecated interfaces.  */
+#define vec_lvx vec_ld
+#define vec_lvxl vec_ldl
+#define vec_stvx vec_st
+#define vec_stvxl vec_stl
 #define vec_vaddcuw vec_addc
 #define vec_vand vec_and
 #define vec_vandc vec_andc
-#define vec_vrfip vec_ceil
 #define vec_vcmpbfp vec_cmpb
 #define vec_vcmpgefp vec_cmpge
 #define vec_vctsxs vec_cts
 #define vec_vctuxs vec_ctu
 #define vec_vexptefp vec_expte
-#define vec_vrfim vec_floor
-#define vec_lvx vec_ld
-#define vec_lvxl vec_ldl
 #define vec_vlogefp vec_loge
 #define vec_vmaddfp vec_madd
 #define vec_vmhaddshs vec_madds
-#define vec_vmladduhm vec_mladd
 #define vec_vmhraddshs vec_mradds
+#define vec_vmladduhm vec_mladd
 #define vec_vnmsubfp vec_nmsub
 #define vec_vnor vec_nor
 #define vec_vor vec_or
-#define vec_vpkpx vec_packpx
 #define vec_vperm vec_perm
-#define vec_permxor __builtin_vec_vpermxor
+#define vec_vpkpx vec_packpx
 #define vec_vrefp vec_re
+#define vec_vrfim vec_floor
 #define vec_vrfin vec_round
+#define vec_vrfip vec_ceil
+#define vec_vrfiz vec_trunc
 #define vec_vrsqrtefp vec_rsqrte
 #define vec_vsel vec_sel
 #define vec_vsldoi vec_sld
@@ -91,438 +95,56 @@
 #define vec_vspltisw vec_splat_s32
 #define vec_vsr vec_srl
 #define vec_vsro vec_sro
-#define vec_stvx vec_st
-#define vec_stvxl vec_stl
 #define vec_vsubcuw vec_subc
 #define vec_vsum2sws vec_sum2s
 #define vec_vsumsws vec_sums
-#define vec_vrfiz vec_trunc
 #define vec_vxor vec_xor
 
+#ifdef _ARCH_PWR8
+#define vec_vclz vec_cntlz
+#define vec_vgbbd vec_gb
+#define vec_vmrgew vec_mergee
+#define vec_vmrgow vec_mergeo
+#define vec_vpopcntu vec_popcnt
+#define vec_vrld vec_rl
+#define vec_vsld vec_sl
+#define vec_vsrd vec_sr
+#define vec_vsrad vec_sra
+#endif
+
+#ifdef _ARCH_PWR9
+#define vec_extract_fp_from_shorth vec_extract_fp32_from_shorth
+#define vec_extract_fp_from_shortl vec_extract_fp32_from_shortl
+#define vec_vctz vec_cnttz
+#endif
+
+/* Synonyms.  */
 /* Functions that are resolved by the backend to one of the
typed builtins.  */
-#define vec_vaddfp __builtin_vec_vaddfp
-#define vec_addc __builtin_vec_addc
-#define vec_adde __builtin_vec_adde
-#define vec_addec __builtin_vec_addec
-#define vec_vaddsws __builtin_vec_vaddsws
-#define vec_vaddshs __builtin_vec_vaddshs
-#define vec_vaddsbs __builtin_vec_vaddsbs
-#define vec_vavgsw __builtin_vec_vavgsw
-#define vec_vavguw __builtin_vec_vavguw
-#define vec_vavgsh __builtin_vec_vavgsh
-#define vec_vavguh __builtin_vec_vavguh
-#define vec_vavgsb __builtin_vec_vavgsb
-#define vec_vavgub __builtin_vec_vavgub
-#define vec_ceil __builtin_vec_ceil
-#define vec_cmpb __builtin_vec_cmpb
-#define vec_vcmpeqfp __builtin_vec_vcmpeqfp
-#define vec_cmpge __builtin_vec_cmpge
-#define vec_vcmpgtfp __builtin_vec_vcmpgtfp
-#define vec_vcmpgtsw __builtin_vec_vcmpgtsw
-#define vec_vcmpgtuw __builtin_vec_vcmpgtuw
-#define vec_vcmpgtsh __builtin_vec_vcmpgtsh
-#define vec_vcmpgtuh __builtin_vec_vcmpgtuh
-#define vec_vcmpgtsb __builtin_vec_vcmpgtsb
-#define vec_vcmpgtub __builtin_vec_vcmpgtub
-#define vec_vcfsx __builtin_vec_vcfsx
-#define vec_vcfux __builtin_vec_vcfux
-#define vec_cts __builtin_vec_cts
-#define vec_ctu __builtin_vec_ctu
-#define vec_cpsgn __builtin_vec_copysign
-#define vec_double __builtin_vec_double
-#define vec_doublee __builtin_vec_doublee
-#define vec_doubleo __builtin_vec_doubleo
-#define vec_doublel __builtin_vec_doublel
-#define vec_doubleh __builtin_vec_doubleh
-#define vec_expte __builtin_vec_expte
-#define vec_float __builtin_vec_float
-#define vec_float2 __builtin_vec_float2
-#define vec_floate __builtin_vec_floate
-#define vec_floato __builtin_vec_floato
-#define vec_floor __builtin_vec_floor
-#define vec_loge __builtin_vec_loge
-#define vec_madd __builtin_vec_madd
-#define vec_madds __builtin_vec_madds
-#define vec_mtvscr __builtin_vec_mtvscr
-#define vec_reve __builtin_vec_vreve
-#define vec_vmaxfp __builtin_vec_vmaxfp
-#define vec_vmaxsw __builtin_vec_vmaxsw
-#define vec_vmaxsh __builtin_vec_vmaxsh
-#define vec_vmaxsb __builtin_vec_vmaxsb
-#define vec_vminfp __builtin_vec_vminfp
-#define vec_vminsw __builtin_vec_vminsw
-#define vec_vminsh __builtin_vec_vminsh
-#define vec_vminsb __builtin_vec_vminsb
-#define vec_mradds __builtin_vec_mradds
-#define vec_vmsumshm __builtin_vec_vmsumshm
-#define vec_vmsumuhm __builtin_vec_vmsumuhm
-#define vec_vmsummbm __builtin_vec_vmsummbm
-#define vec_vmsumubm 

[PATCH 52/55] rs6000: Debug support

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-04-01  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (rs6000_debug_type): New function.
(def_builtin): Change debug formatting for easier parsing and
include more information.
(rs6000_init_builtins): Add dump of autogenerated builtins.
(altivec_init_builtins): Dump __builtin_altivec_mask_for_load for
completeness.
---
 gcc/config/rs6000/rs6000-call.c | 193 +++-
 1 file changed, 189 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index fc61bbc2af5..3a15479f53c 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -8754,6 +8754,106 @@ rs6000_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
 
 /* Builtins.  */
 
+/* Debug utility to translate a type node to a single token.  */
+static
+const char *rs6000_debug_type (tree type)
+{
+  if (type == void_type_node)
+return "void";
+  else if (type == long_integer_type_node)
+return "long";
+  else if (type == long_unsigned_type_node)
+return "ulong";
+  else if (type == long_long_integer_type_node)
+return "longlong";
+  else if (type == long_long_unsigned_type_node)
+return "ulonglong";
+  else if (type == bool_V16QI_type_node)
+return "vbc";
+  else if (type == bool_V2DI_type_node)
+return "vbll";
+  else if (type == bool_V4SI_type_node)
+return "vbi";
+  else if (type == bool_V8HI_type_node)
+return "vbs";
+  else if (type == bool_int_type_node)
+return "bool";
+  else if (type == dfloat64_type_node)
+return "_Decimal64";
+  else if (type == double_type_node)
+return "double";
+  else if (type == intDI_type_node)
+return "sll";
+  else if (type == intHI_type_node)
+return "ss";
+  else if (type == ibm128_float_type_node)
+return "__ibm128";
+  else if (type == opaque_V4SI_type_node)
+return "opaque";
+  else if (POINTER_TYPE_P (type))
+return "void*";
+  else if (type == intQI_type_node || type == char_type_node)
+return "sc";
+  else if (type == dfloat32_type_node)
+return "_Decimal32";
+  else if (type == float_type_node)
+return "float";
+  else if (type == intSI_type_node || type == integer_type_node)
+return "si";
+  else if (type == dfloat128_type_node)
+return "_Decimal128";
+  else if (type == long_double_type_node)
+return "longdouble";
+  else if (type == intTI_type_node)
+return "sq";
+  else if (type == unsigned_intDI_type_node)
+return "ull";
+  else if (type == unsigned_intHI_type_node)
+return "us";
+  else if (type == unsigned_intQI_type_node)
+return "uc";
+  else if (type == unsigned_intSI_type_node)
+return "ui";
+  else if (type == unsigned_intTI_type_node)
+return "uq";
+  else if (type == unsigned_V16QI_type_node)
+return "vuc";
+  else if (type == unsigned_V1TI_type_node)
+return "vuq";
+  else if (type == unsigned_V2DI_type_node)
+return "vull";
+  else if (type == unsigned_V4SI_type_node)
+return "vui";
+  else if (type == unsigned_V8HI_type_node)
+return "vus";
+  else if (type == V16QI_type_node)
+return "vsc";
+  else if (type == V1TI_type_node)
+return "vsq";
+  else if (type == V2DF_type_node)
+return "vd";
+  else if (type == V2DI_type_node)
+return "vsll";
+  else if (type == V4SF_type_node)
+return "vf";
+  else if (type == V4SI_type_node)
+return "vsi";
+  else if (type == V8HI_type_node)
+return "vss";
+  else if (type == pixel_V8HI_type_node)
+return "vp";
+  else if (type == pcvoid_type_node)
+return "voidc*";
+  else if (type == float128_type_node)
+return "_Float128";
+  else if (type == vector_pair_type_node)
+return "__vector_pair";
+  else if (type == vector_quad_type_node)
+return "__vector_quad";
+  else
+return "unknown";
+}
+
 static void
 def_builtin (const char *name, tree type, enum rs6000_builtins code)
 {
@@ -8782,7 +8882,7 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code)
   /* const function, function only depends on the inputs.  */
   TREE_READONLY (t) = 1;
   TREE_NOTHROW (t) = 1;
-  attr_string = ", const";
+  attr_string = "= const";
 }
   else if ((classify & RS6000_BTC_PURE) != 0)
 {
@@ -8790,7 +8890,7 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code)
 external state.  */
   DECL_PURE_P (t) = 1;
   TREE_NOTHROW (t) = 1;
-  attr_string = ", pure";
+  attr_string = "= pure";
 }
   else if ((classify & RS6000_BTC_FP) != 0)
 {
@@ -8804,12 +8904,12 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code)
{
  DECL_PURE_P (t) = 1;
  DECL_IS_NOVOPS (t) = 1;
- attr_string = ", fp, pure";
+ attr_string = "= fp, pure";
}
   else
{
  TREE_READONLY (t) = 1;
- attr_string = ", fp, const";
+ attr_string = "= fp, const";
}
 }
   

[PATCH 51/55] rs6000: Miscellaneous uses of rs6000_builtin_decls_x

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-05  Bill Schmidt  

gcc/
* config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Use
rs6000_builtin_decls_x when appropriate.
(add_condition_to_bb): Likewise.
(rs6000_atomic_assign_expand_fenv): Likewise.
---
 gcc/config/rs6000/rs6000.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9179c73f43c..db6a65a7917 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -22758,12 +22758,16 @@ rs6000_builtin_reciprocal (tree fndecl)
   if (!RS6000_RECIP_AUTO_RSQRTE_P (V2DFmode))
return NULL_TREE;
 
+  if (new_builtins_are_live)
+   return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_2DF];
   return rs6000_builtin_decls[VSX_BUILTIN_RSQRT_2DF];
 
 case VSX_BUILTIN_XVSQRTSP:
   if (!RS6000_RECIP_AUTO_RSQRTE_P (V4SFmode))
return NULL_TREE;
 
+  if (new_builtins_are_live)
+   return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_4SF];
   return rs6000_builtin_decls[VSX_BUILTIN_RSQRT_4SF];
 
 default:
@@ -25352,7 +25356,10 @@ add_condition_to_bb (tree function_decl, tree version_decl,
 
   tree bool_zero = build_int_cst (bool_int_type_node, 0);
   tree cond_var = create_tmp_var (bool_int_type_node);
-  tree predicate_decl = rs6000_builtin_decls [(int) RS6000_BUILTIN_CPU_SUPPORTS];
+  tree predicate_decl
+= (new_builtins_are_live
+   ? rs6000_builtin_decls_x[(int) RS6000_BIF_CPU_SUPPORTS]
+   : rs6000_builtin_decls [(int) RS6000_BUILTIN_CPU_SUPPORTS]);
   const char *arg_str = rs6000_clone_map[clone_isa].name;
   tree predicate_arg = build_string_literal (strlen (arg_str) + 1, arg_str);
   gimple *call_cond_stmt = gimple_build_call (predicate_decl, 1, predicate_arg);
@@ -27577,8 +27584,14 @@ rs6000_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
   return;
 }
 
-  tree mffs = rs6000_builtin_decls[RS6000_BUILTIN_MFFS];
-  tree mtfsf = rs6000_builtin_decls[RS6000_BUILTIN_MTFSF];
+  tree mffs
+= (new_builtins_are_live
+   ? rs6000_builtin_decls_x[RS6000_BIF_MFFS]
+   : rs6000_builtin_decls[RS6000_BUILTIN_MFFS]);
+  tree mtfsf
+= (new_builtins_are_live
+   ? rs6000_builtin_decls_x[RS6000_BIF_MTFSF]
+   : rs6000_builtin_decls[RS6000_BUILTIN_MTFSF]);
   tree call_mffs = build_call_expr (mffs, 0);
 
   /* Generates the equivalent of feholdexcept (&fenv_var)
-- 
2.27.0



[PATCH 42/55] rs6000: Handle gimple folding of target built-ins

2021-06-08 Thread Bill Schmidt via Gcc-patches
This is another patch that looks bigger than it really is.  Because we
have a new namespace for the builtins, allowing us to have both the old
and new builtin infrastructure supported at once, we need versions of
these functions that use the new builtin namespace.  Otherwise the code is
unchanged.
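
The dual-namespace arrangement described above can be pictured with a small standalone sketch. This is not the GCC code: every demo_* name below is hypothetical, and the two folder functions are trivial stand-ins. It only illustrates how a single entry point might defer to a new-namespace version while a flag is set and otherwise keep the existing path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the flag that selects the new builtin
   infrastructure; in the patch series the flag is new_builtins_are_live.  */
static bool demo_new_builtins_are_live = false;

/* Hypothetical folders for the two namespaces.  Each reports whether it
   recognized and folded the given code.  */
static bool
demo_fold_old_builtin (int code)
{
  return code == 1;
}

static bool
demo_fold_new_builtin (int code)
{
  return code == 2;
}

/* Single entry point: defer to the new-namespace version when it is live,
   otherwise fall through to the existing code path.  */
bool
demo_fold_builtin (int code)
{
  if (demo_new_builtins_are_live)
    return demo_fold_new_builtin (code);

  return demo_fold_old_builtin (code);
}
```

With the flag clear, demo_fold_builtin (1) takes the old path; once the flag is set, the same call is routed to the new folder, which does not know code 1.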

2021-06-07  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
New forward decl.
(rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
(rs6000_new_builtin_valid_without_lhs): New function.
(rs6000_gimple_fold_new_mma_builtin): Likewise.
(rs6000_gimple_fold_new_builtin): Likewise.
---
 gcc/config/rs6000/rs6000-call.c | 1152 +++
 1 file changed, 1152 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 8f6b6b462f8..1bb9f1c255d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, machine_mode,
 static void rs6000_common_init_builtins (void);
 static void htm_init_builtins (void);
 static void mma_init_builtins (void);
+static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
 
 
 /* Hash table to keep track of the argument types for builtin functions.  */
@@ -11855,6 +11856,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
 bool
 rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 {
+  if (new_builtins_are_live)
+return rs6000_gimple_fold_new_builtin (gsi);
+
   gimple *stmt = gsi_stmt (*gsi);
   tree fndecl = gimple_call_fndecl (stmt);
   gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
@@ -12794,6 +12798,35 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/*  Helper function to sort out which built-ins may be valid without having
+a LHS.  */
+static bool
+rs6000_new_builtin_valid_without_lhs (enum rs6000_gen_builtins fn_code,
+ tree fndecl)
+{
+  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
+return true;
+
+  switch (fn_code)
+{
+case RS6000_BIF_STVX_V16QI:
+case RS6000_BIF_STVX_V8HI:
+case RS6000_BIF_STVX_V4SI:
+case RS6000_BIF_STVX_V4SF:
+case RS6000_BIF_STVX_V2DI:
+case RS6000_BIF_STVX_V2DF:
+case RS6000_BIF_STXVW4X_V16QI:
+case RS6000_BIF_STXVW4X_V8HI:
+case RS6000_BIF_STXVW4X_V4SF:
+case RS6000_BIF_STXVW4X_V4SI:
+case RS6000_BIF_STXVD2X_V2DF:
+case RS6000_BIF_STXVD2X_V2DI:
+  return true;
+default:
+  return false;
+}
+}
+
 /* Check whether a builtin function is supported in this target
configuration.  */
 bool
@@ -12885,6 +12918,1125 @@ rs6000_new_builtin_is_supported_p (enum rs6000_gen_builtins fncode)
   return true;
 }
 
+/* Expand the MMA built-ins early, so that we can convert the pass-by-reference
+   __vector_quad arguments into pass-by-value arguments, leading to more
+   efficient code generation.  */
+static bool
+rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
+   rs6000_gen_builtins fn_code)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  size_t fncode = (size_t) fn_code;
+
+  if (!bif_is_mma (rs6000_builtin_info_x[fncode]))
+return false;
+
+  /* Each call that can be gimple-expanded has an associated built-in
+ function that it will expand into.  If this one doesn't, we have
+ already expanded it!  */
+  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
+return false;
+
+  bifdata *bd = &rs6000_builtin_info_x[fncode];
+  unsigned nopnds = bd->nargs;
+  gimple_seq new_seq = NULL;
+  gimple *new_call;
+  tree new_decl;
+
+  /* Compatibility built-ins; we used to call these
+ __builtin_mma_{dis,}assemble_pair, but now we call them
+ __builtin_vsx_{dis,}assemble_pair.  Handle the old versions.  */
+  if (fncode == RS6000_BIF_ASSEMBLE_PAIR)
+fncode = RS6000_BIF_ASSEMBLE_PAIR_V;
+  else if (fncode == RS6000_BIF_DISASSEMBLE_PAIR)
+fncode = RS6000_BIF_DISASSEMBLE_PAIR_V;
+
+  if (fncode == RS6000_BIF_DISASSEMBLE_ACC
+  || fncode == RS6000_BIF_DISASSEMBLE_PAIR_V)
+{
+  /* This is an MMA disassemble built-in function.  */
+  push_gimplify_context (true);
+  unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
+  tree dst_ptr = gimple_call_arg (stmt, 0);
+  tree src_ptr = gimple_call_arg (stmt, 1);
+  tree src_type = TREE_TYPE (src_ptr);
+  tree src = make_ssa_name (TREE_TYPE (src_type));
+  gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
+
+  /* If we are not disassembling an accumulator/pair or our destination is
+another accumulator/pair, then just copy the entire thing as is.  */
+  if ((fncode == RS6000_BIF_DISASSEMBLE_ACC
+  && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_quad_type_node)
+ || (fncode == RS6000_BIF_DISASSEMBLE_PAIR_V
+

[PATCH 49/55] rs6000: Builtin expansion, part 6

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-24  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (new_htm_spr_num): New function.
(new_htm_expand_builtin): Implement.
(rs6000_expand_new_builtin): Handle 32-bit and endian cases.
---
 gcc/config/rs6000/rs6000-call.c | 202 
 1 file changed, 202 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f4b0c00aab4..53e51b17ab3 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14911,11 +14911,171 @@ new_mma_expand_builtin (tree exp, rtx target, insn_code icode)
   return target;
 }
 
+/* Return the appropriate SPR number associated with the given builtin.  */
+static inline HOST_WIDE_INT
+new_htm_spr_num (enum rs6000_gen_builtins code)
+{
+  if (code == RS6000_BIF_GET_TFHAR
+  || code == RS6000_BIF_SET_TFHAR)
+return TFHAR_SPR;
+  else if (code == RS6000_BIF_GET_TFIAR
+  || code == RS6000_BIF_SET_TFIAR)
+return TFIAR_SPR;
+  else if (code == RS6000_BIF_GET_TEXASR
+  || code == RS6000_BIF_SET_TEXASR)
+return TEXASR_SPR;
+  gcc_assert (code == RS6000_BIF_GET_TEXASRU
+ || code == RS6000_BIF_SET_TEXASRU);
+  return TEXASRU_SPR;
+}
+
 /* Expand the HTM builtin in EXP and store the result in TARGET.  */
 static rtx
 new_htm_expand_builtin (bifdata *bifaddr, rs6000_gen_builtins fcode,
tree exp, rtx target)
 {
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  bool nonvoid = TREE_TYPE (TREE_TYPE (fndecl)) != void_type_node;
+
+  if (!TARGET_POWERPC64
+  && (fcode == RS6000_BIF_TABORTDC
+ || fcode == RS6000_BIF_TABORTDCI))
+{
+  error ("builtin %qs is only valid in 64-bit mode", bifaddr->bifname);
+  return const0_rtx;
+}
+
+  rtx op[MAX_HTM_OPERANDS], pat;
+  int nopnds = 0;
+  tree arg;
+  call_expr_arg_iterator iter;
+  insn_code icode = bifaddr->icode;
+  bool uses_spr = bif_is_htmspr (*bifaddr);
+  rtx cr = NULL_RTX;
+
+  if (uses_spr)
+icode = rs6000_htm_spr_icode (nonvoid);
+  const insn_operand_data *insn_op = &insn_data[icode].operand[0];
+
+  if (nonvoid)
+{
+  machine_mode tmode = (uses_spr) ? insn_op->mode : E_SImode;
+  if (!target
+ || GET_MODE (target) != tmode
+ || (uses_spr && !(*insn_op->predicate) (target, tmode)))
+   target = gen_reg_rtx (tmode);
+  if (uses_spr)
+   op[nopnds++] = target;
+}
+
+  FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
+{
+  if (arg == error_mark_node || nopnds >= MAX_HTM_OPERANDS)
+   return const0_rtx;
+
+  insn_op = &insn_data[icode].operand[nopnds];
+  op[nopnds] = expand_normal (arg);
+
+  if (!(*insn_op->predicate) (op[nopnds], insn_op->mode))
+   {
+ if (!strcmp (insn_op->constraint, "n"))
+   {
+ int arg_num = (nonvoid) ? nopnds : nopnds + 1;
+ if (!CONST_INT_P (op[nopnds]))
+   error ("argument %d must be an unsigned literal", arg_num);
+ else
+   error ("argument %d is an unsigned literal that is "
+  "out of range", arg_num);
+ return const0_rtx;
+   }
+ op[nopnds] = copy_to_mode_reg (insn_op->mode, op[nopnds]);
+   }
+
+  nopnds++;
+}
+
+  /* Handle the builtins for extended mnemonics.  These accept
+ no arguments, but map to builtins that take arguments.  */
+  switch (fcode)
+{
+case RS6000_BIF_TENDALL:  /* Alias for: tend. 1  */
+case RS6000_BIF_TRESUME:  /* Alias for: tsr. 1  */
+  op[nopnds++] = GEN_INT (1);
+  break;
+case RS6000_BIF_TSUSPEND: /* Alias for: tsr. 0  */
+  op[nopnds++] = GEN_INT (0);
+  break;
+default:
+  break;
+}
+
+  /* If this builtin accesses SPRs, then pass in the appropriate
+ SPR number and SPR regno as the last two operands.  */
+  if (uses_spr)
+{
+  machine_mode mode = (TARGET_POWERPC64) ? DImode : SImode;
+  op[nopnds++] = gen_rtx_CONST_INT (mode, new_htm_spr_num (fcode));
+}
+  /* If this builtin accesses a CR, then pass in a scratch
+ CR as the last operand.  */
+  else if (bif_is_htmcr (*bifaddr))
+{
+  cr = gen_reg_rtx (CCmode);
+  op[nopnds++] = cr;
+}
+
+  switch (nopnds)
+{
+case 1:
+  pat = GEN_FCN (icode) (op[0]);
+  break;
+case 2:
+  pat = GEN_FCN (icode) (op[0], op[1]);
+  break;
+case 3:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2]);
+  break;
+case 4:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3]);
+  break;
+default:
+  gcc_unreachable ();
+}
+  if (!pat)
+return NULL_RTX;
+  emit_insn (pat);
+
+  if (bif_is_htmcr (*bifaddr))
+{
+  if (fcode == RS6000_BIF_TBEGIN)
+   {
+ /* Emit code to set TARGET to true or false depending on
+whether the tbegin. instruction succeeded or failed
+to start a transaction.  We do this by placing the 1's
+complement of CR's 

[PATCH 50/55] rs6000: Update rs6000_builtin_decl

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-05  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (rs6000_new_builtin_decl): New
function.
(rs6000_builtin_decl): Call it.
---
 gcc/config/rs6000/rs6000-call.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 53e51b17ab3..fc61bbc2af5 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -16095,11 +16095,31 @@ rs6000_init_builtins (void)
 }
 }
 
+static tree
+rs6000_new_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
+{
+  rs6000_gen_builtins fcode = (rs6000_gen_builtins) code;
+
+  if (fcode >= RS6000_OVLD_MAX)
+return error_mark_node;
+
+  if (!rs6000_new_builtin_is_supported_p (fcode))
+{
+  rs6000_invalid_new_builtin (fcode);
+  return error_mark_node;
+}
+
+  return rs6000_builtin_decls_x[code];
+}
+
 /* Returns the rs6000 builtin decl for CODE.  */
 
 tree
 rs6000_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
 {
+  if (new_builtins_are_live)
+return rs6000_new_builtin_decl (code, initialize_p);
+
   HOST_WIDE_INT fnmask;
 
   if (code >= RS6000_BUILTIN_COUNT)
-- 
2.27.0
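
For readers skimming the thread, the lookup order in rs6000_new_builtin_decl (range check first, then the support check, then the table access) can be sketched in isolation. This is a sketch under stated assumptions, not GCC's implementation: the demo_* names and table contents are hypothetical, and NULL stands in for error_mark_node. The real function also reports a diagnostic through rs6000_invalid_new_builtin, which is omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical builtin codes; DEMO_BIF_MAX plays the role of the upper
   bound that the patch checks against.  */
enum demo_bif { DEMO_BIF_NONE, DEMO_BIF_A, DEMO_BIF_B, DEMO_BIF_MAX };

/* Hypothetical decl table, indexed by builtin code.  */
static const char *const demo_decls[DEMO_BIF_MAX]
  = { "none", "decl_a", "decl_b" };

/* Hypothetical per-builtin "supported on this target" flags.  */
static const bool demo_supported[DEMO_BIF_MAX] = { true, true, false };

/* Return the decl for CODE, or NULL (standing in for error_mark_node)
   when CODE is out of range or the builtin is not supported.  */
const char *
demo_builtin_decl (unsigned code)
{
  if (code >= DEMO_BIF_MAX)
    return NULL;

  if (!demo_supported[code])
    return NULL;

  return demo_decls[code];
}
```

Doing the range check before anything else keeps both later table accesses in bounds.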



[PATCH 48/55] rs6000: Builtin expansion, part 5

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-25  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (new_mma_expand_builtin):
Implement.
---
 gcc/config/rs6000/rs6000-call.c | 92 +
 1 file changed, 92 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 00fd4bb95ab..f4b0c00aab4 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14816,6 +14816,98 @@ stv_expand_builtin (insn_code icode, rtx *op,
 static rtx
 new_mma_expand_builtin (tree exp, rtx target, insn_code icode)
 {
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  tree arg;
+  call_expr_arg_iterator iter;
+  const struct insn_operand_data *insn_op;
+  rtx op[MAX_MMA_OPERANDS];
+  unsigned nopnds = 0;
+  bool void_func = TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node;
+  machine_mode tmode = VOIDmode;
+
+  if (!void_func)
+{
+  tmode = insn_data[icode].operand[0].mode;
+  if (!target
+ || GET_MODE (target) != tmode
+ || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+   target = gen_reg_rtx (tmode);
+  op[nopnds++] = target;
+}
+  else
+target = const0_rtx;
+
+  FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
+{
+  if (arg == error_mark_node)
+   return const0_rtx;
+
+  rtx opnd;
+  insn_op = &insn_data[icode].operand[nopnds];
+  if (TREE_CODE (arg) == ADDR_EXPR
+ && MEM_P (DECL_RTL (TREE_OPERAND (arg, 0))))
+   opnd = DECL_RTL (TREE_OPERAND (arg, 0));
+  else
+   opnd = expand_normal (arg);
+
+  if (!(*insn_op->predicate) (opnd, insn_op->mode))
+   {
+ if (!strcmp (insn_op->constraint, "n"))
+   {
+ if (!CONST_INT_P (opnd))
+   error ("argument %d must be an unsigned literal", nopnds);
+ else
+   error ("argument %d is an unsigned literal that is "
+  "out of range", nopnds);
+ return const0_rtx;
+   }
+ opnd = copy_to_mode_reg (insn_op->mode, opnd);
+   }
+
+  /* Some MMA instructions have INOUT accumulator operands, so force
+their target register to be the same as their input register.  */
+  if (!void_func
+ && nopnds == 1
+ && !strcmp (insn_op->constraint, "0")
+ && insn_op->mode == tmode
+ && REG_P (opnd)
+ && (*insn_data[icode].operand[0].predicate) (opnd, tmode))
+   target = op[0] = opnd;
+
+  op[nopnds++] = opnd;
+}
+
+  rtx pat;
+  switch (nopnds)
+{
+case 1:
+  pat = GEN_FCN (icode) (op[0]);
+  break;
+case 2:
+  pat = GEN_FCN (icode) (op[0], op[1]);
+  break;
+case 3:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2]);
+  break;
+case 4:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3]);
+  break;
+case 5:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4]);
+  break;
+case 6:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5]);
+  break;
+case 7:
+  pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5], op[6]);
+  break;
+default:
+  gcc_unreachable ();
+}
+  if (!pat)
+return NULL_RTX;
+  emit_insn (pat);
+
   return target;
 }
 
-- 
2.27.0



[PATCH 47/55] rs6000: Builtin expansion, part 4

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-05  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (elemrev_icode): Implement.
(ldv_expand_builtin): Likewise.
(lxvrse_expand_builtin): Likewise.
(lxvrze_expand_builtin): Likewise.
(stv_expand_builtin): Likewise.
---
 gcc/config/rs6000/rs6000-call.c | 217 
 1 file changed, 217 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index c1c936f62b7..00fd4bb95ab 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14565,12 +14565,114 @@ new_cpu_expand_builtin (enum rs6000_gen_builtins fcode,
 static insn_code
 elemrev_icode (rs6000_gen_builtins fcode)
 {
+  switch (fcode)
+{
+default:
+  gcc_unreachable ();
+case RS6000_BIF_ST_ELEMREV_V1TI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v1ti
+ : CODE_FOR_vsx_st_elemrev_v1ti);
+case RS6000_BIF_ST_ELEMREV_V2DF:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v2df
+ : CODE_FOR_vsx_st_elemrev_v2df);
+case RS6000_BIF_ST_ELEMREV_V2DI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v2di
+ : CODE_FOR_vsx_st_elemrev_v2di);
+case RS6000_BIF_ST_ELEMREV_V4SF:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v4sf
+ : CODE_FOR_vsx_st_elemrev_v4sf);
+case RS6000_BIF_ST_ELEMREV_V4SI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v4si
+ : CODE_FOR_vsx_st_elemrev_v4si);
+case RS6000_BIF_ST_ELEMREV_V8HI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v8hi
+ : CODE_FOR_vsx_st_elemrev_v8hi);
+case RS6000_BIF_ST_ELEMREV_V16QI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v16qi
+ : CODE_FOR_vsx_st_elemrev_v16qi);
+case RS6000_BIF_LD_ELEMREV_V2DF:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v2df
+ : CODE_FOR_vsx_ld_elemrev_v2df);
+case RS6000_BIF_LD_ELEMREV_V1TI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v1ti
+ : CODE_FOR_vsx_ld_elemrev_v1ti);
+case RS6000_BIF_LD_ELEMREV_V2DI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v2di
+ : CODE_FOR_vsx_ld_elemrev_v2di);
+case RS6000_BIF_LD_ELEMREV_V4SF:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v4sf
+ : CODE_FOR_vsx_ld_elemrev_v4sf);
+case RS6000_BIF_LD_ELEMREV_V4SI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v4si
+ : CODE_FOR_vsx_ld_elemrev_v4si);
+case RS6000_BIF_LD_ELEMREV_V8HI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v8hi
+ : CODE_FOR_vsx_ld_elemrev_v8hi);
+case RS6000_BIF_LD_ELEMREV_V16QI:
+  return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v16qi
+ : CODE_FOR_vsx_ld_elemrev_v16qi);
+}
+  gcc_unreachable ();
   return (insn_code) 0;
 }
 
 static rtx
 ldv_expand_builtin (rtx target, insn_code icode, rtx *op, machine_mode tmode)
 {
+  rtx pat, addr;
+  bool blk = (icode == CODE_FOR_altivec_lvlx
+ || icode == CODE_FOR_altivec_lvlxl
+ || icode == CODE_FOR_altivec_lvrx
+ || icode == CODE_FOR_altivec_lvrxl);
+
+  if (target == 0
+  || GET_MODE (target) != tmode
+  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
+target = gen_reg_rtx (tmode);
+
+  op[1] = copy_to_mode_reg (Pmode, op[1]);
+
+  /* For LVX, express the RTL accurately by ANDing the address with -16.
+ LVXL and LVE*X expand to use UNSPECs to hide their special behavior,
+ so the raw address is fine.  */
+  if (icode == CODE_FOR_altivec_lvx_v1ti
+  || icode == CODE_FOR_altivec_lvx_v2df
+  || icode == CODE_FOR_altivec_lvx_v2di
+  || icode == CODE_FOR_altivec_lvx_v4sf
+  || icode == CODE_FOR_altivec_lvx_v4si
+  || icode == CODE_FOR_altivec_lvx_v8hi
+  || icode == CODE_FOR_altivec_lvx_v16qi)
+{
+  rtx rawaddr;
+  if (op[0] == const0_rtx)
+   rawaddr = op[1];
+  else
+   {
+ op[0] = copy_to_mode_reg (Pmode, op[0]);
+ rawaddr = gen_rtx_PLUS (Pmode, op[1], op[0]);
+   }
+  addr = gen_rtx_AND (Pmode, rawaddr, gen_rtx_CONST_INT (Pmode, -16));
+  addr = gen_rtx_MEM (blk ? BLKmode : tmode, addr);
+
+  emit_insn (gen_rtx_SET (target, addr));
+}
+  else
+{
+  if (op[0] == const0_rtx)
+   addr = gen_rtx_MEM (blk ? BLKmode : tmode, op[1]);
+  else
+   {
+ op[0] = copy_to_mode_reg (Pmode, op[0]);
+ addr = gen_rtx_MEM (blk ? BLKmode : tmode,
+ gen_rtx_PLUS (Pmode, op[1], op[0]));
+   }
+
+  pat = GEN_FCN (icode) (target, addr);
+  if (! pat)
+   return 0;
+  emit_insn (pat);
+}
+
   return target;
 }
 
@@ -14578,6 +14680,42 @@ static rtx
 lxvrse_expand_builtin (rtx target, insn_code icode, rtx *op,
   machine_mode tmode, machine_mode smode)
 {
+  rtx pat, addr;
+  op[1] = copy_to_mode_reg (Pmode, op[1]);
+
+  if (op[0] == const0_rtx)
+addr = 

[PATCH 46/55] rs6000: Builtin expansion, part 3

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-05  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (new_cpu_expand_builtin):
Implement.
---
 gcc/config/rs6000/rs6000-call.c | 100 
 1 file changed, 100 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index dd24e808c97..c1c936f62b7 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14459,6 +14459,106 @@ static rtx
 new_cpu_expand_builtin (enum rs6000_gen_builtins fcode,
tree exp ATTRIBUTE_UNUSED, rtx target)
 {
+  /* __builtin_cpu_init () is a nop, so expand to nothing.  */
+  if (fcode == RS6000_BIF_CPU_INIT)
+return const0_rtx;
+
+  if (target == 0 || GET_MODE (target) != SImode)
+target = gen_reg_rtx (SImode);
+
+#ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
+  tree arg = TREE_OPERAND (CALL_EXPR_ARG (exp, 0), 0);
+  /* Target clones creates an ARRAY_REF instead of STRING_CST, convert it back
+ to a STRING_CST.  */
+  if (TREE_CODE (arg) == ARRAY_REF
+  && TREE_CODE (TREE_OPERAND (arg, 0)) == STRING_CST
+  && TREE_CODE (TREE_OPERAND (arg, 1)) == INTEGER_CST
+  && compare_tree_int (TREE_OPERAND (arg, 1), 0) == 0)
+arg = TREE_OPERAND (arg, 0);
+
+  if (TREE_CODE (arg) != STRING_CST)
+{
+  error ("builtin %qs only accepts a string argument",
+rs6000_builtin_info_x[(size_t) fcode].bifname);
+  return const0_rtx;
+}
+
+  if (fcode == RS6000_BIF_CPU_IS)
+{
+  const char *cpu = TREE_STRING_POINTER (arg);
+  rtx cpuid = NULL_RTX;
+  for (size_t i = 0; i < ARRAY_SIZE (cpu_is_info); i++)
+   if (strcmp (cpu, cpu_is_info[i].cpu) == 0)
+ {
+   /* The CPUID value in the TCB is offset by _DL_FIRST_PLATFORM.  */
+   cpuid = GEN_INT (cpu_is_info[i].cpuid + _DL_FIRST_PLATFORM);
+   break;
+ }
+  if (cpuid == NULL_RTX)
+   {
+ /* Invalid CPU argument.  */
+ error ("cpu %qs is an invalid argument to builtin %qs",
+cpu, rs6000_builtin_info_x[(size_t) fcode].bifname);
+ return const0_rtx;
+   }
+
+  rtx platform = gen_reg_rtx (SImode);
+  rtx tcbmem = gen_const_mem (SImode,
+ gen_rtx_PLUS (Pmode,
+   gen_rtx_REG (Pmode, TLS_REGNUM),
+   GEN_INT (TCB_PLATFORM_OFFSET)));
+  emit_move_insn (platform, tcbmem);
+  emit_insn (gen_eqsi3 (target, platform, cpuid));
+}
+  else if (fcode == RS6000_BIF_CPU_SUPPORTS)
+{
+  const char *hwcap = TREE_STRING_POINTER (arg);
+  rtx mask = NULL_RTX;
+  int hwcap_offset;
+  for (size_t i = 0; i < ARRAY_SIZE (cpu_supports_info); i++)
+   if (strcmp (hwcap, cpu_supports_info[i].hwcap) == 0)
+ {
+   mask = GEN_INT (cpu_supports_info[i].mask);
+   hwcap_offset = TCB_HWCAP_OFFSET (cpu_supports_info[i].id);
+   break;
+ }
+  if (mask == NULL_RTX)
+   {
+ /* Invalid HWCAP argument.  */
+ error ("%s %qs is an invalid argument to builtin %qs",
+"hwcap", hwcap,
+rs6000_builtin_info_x[(size_t) fcode].bifname);
+ return const0_rtx;
+   }
+
+  rtx tcb_hwcap = gen_reg_rtx (SImode);
+  rtx tcbmem = gen_const_mem (SImode,
+ gen_rtx_PLUS (Pmode,
+   gen_rtx_REG (Pmode, TLS_REGNUM),
+   GEN_INT (hwcap_offset)));
+  emit_move_insn (tcb_hwcap, tcbmem);
+  rtx scratch1 = gen_reg_rtx (SImode);
+  emit_insn (gen_rtx_SET (scratch1, gen_rtx_AND (SImode, tcb_hwcap, mask)));
+  rtx scratch2 = gen_reg_rtx (SImode);
+  emit_insn (gen_eqsi3 (scratch2, scratch1, const0_rtx));
+  emit_insn (gen_rtx_SET (target, gen_rtx_XOR (SImode, scratch2, const1_rtx)));
+}
+  else
+gcc_unreachable ();
+
+  /* Record that we have expanded a CPU builtin, so that we can later
+ emit a reference to the special symbol exported by LIBC to ensure we
+ do not link against an old LIBC that doesn't support this feature.  */
+  cpu_builtin_p = true;
+
+#else
+  warning (0, "builtin %qs needs GLIBC (2.23 and newer) that exports hardware "
+  "capability bits", rs6000_builtin_info_x[(size_t) fcode].bifname);
+
+  /* For old LIBCs, always return FALSE.  */
+  emit_move_insn (target, GEN_INT (0));
+#endif /* TARGET_LIBC_PROVIDES_HWCAP_IN_TCB */
+
   return target;
 }
 
-- 
2.27.0
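
The RTL emitted for the RS6000_BIF_CPU_SUPPORTS case amounts to a three-step bit test: AND the HWCAP word with the feature mask, compare the result with zero, then XOR with 1. The following plain C mirror of that arithmetic is a sketch only. The function name is hypothetical, and the HWCAP word is passed in as a parameter instead of being loaded from the TCB.

```c
#include <stdint.h>

/* Mirror of the emitted sequence, one statement per RTL insn:
     scratch1 = tcb_hwcap & mask
     scratch2 = (scratch1 == 0)
     target   = scratch2 ^ 1
   Returns 1 when any bit selected by MASK is set in HWCAP_WORD, else 0.  */
uint32_t
demo_cpu_supports (uint32_t hwcap_word, uint32_t mask)
{
  uint32_t scratch1 = hwcap_word & mask;
  uint32_t scratch2 = (scratch1 == 0);
  return scratch2 ^ 1u;
}
```

The final XOR inverts the zero test, so the builtin yields 1 for "feature present" and 0 otherwise.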



[PATCH 45/55] rs6000: Builtin expansion, part 2

2021-06-08 Thread Bill Schmidt via Gcc-patches
2021-03-05  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin):
Implement.
(rs6000_expand_ldst_mask): Likewise.
(rs6000_init_builtins): Initialize altivec_builtin_mask_for_load.
---
 gcc/config/rs6000/rs6000-call.c | 101 +++-
 1 file changed, 100 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 9493beca0ae..dd24e808c97 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11534,6 +11534,75 @@ rs6000_invalid_builtin (enum rs6000_builtins fncode)
 static void
 rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
 {
+  size_t uns_fncode = (size_t) fncode;
+  const char *name = rs6000_builtin_info_x[uns_fncode].bifname;
+
+  switch (rs6000_builtin_info_x[uns_fncode].enable)
+{
+case ENB_P5:
+  error ("%qs requires the %qs option", name, "-mcpu=power5");
+  break;
+case ENB_P6:
+  error ("%qs requires the %qs option", name, "-mcpu=power6");
+  break;
+case ENB_ALTIVEC:
+  error ("%qs requires the %qs option", name, "-maltivec");
+  break;
+case ENB_CELL:
+  error ("%qs is only valid for the cell processor", name);
+  break;
+case ENB_VSX:
+  error ("%qs requires the %qs option", name, "-mvsx");
+  break;
+case ENB_P7:
+  error ("%qs requires the %qs option", name, "-mcpu=power7");
+  break;
+case ENB_P7_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=power7", "-m64", "-mpowerpc64");
+  break;
+case ENB_P8:
+  error ("%qs requires the %qs option", name, "-mcpu=power8");
+  break;
+case ENB_P8V:
+  error ("%qs requires the %qs option", name, "-mpower8-vector");
+  break;
+case ENB_P9:
+  error ("%qs requires the %qs option", name, "-mcpu=power9");
+  break;
+case ENB_P9_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=power9", "-m64", "-mpowerpc64");
+  break;
+case ENB_P9V:
+  error ("%qs requires the %qs option", name, "-mpower9-vector");
+  break;
+case ENB_IEEE128_HW:
+  error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name);
+  break;
+case ENB_DFP:
+  error ("%qs requires the %qs option", name, "-mhard-dfp");
+  break;
+case ENB_CRYPTO:
+  error ("%qs requires the %qs option", name, "-mcrypto");
+  break;
+case ENB_HTM:
+  error ("%qs requires the %qs option", name, "-mhtm");
+  break;
+case ENB_P10:
+  error ("%qs requires the %qs option", name, "-mcpu=power10");
+  break;
+case ENB_P10_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=power10", "-m64", "-mpowerpc64");
+  break;
+case ENB_MMA:
+  error ("%qs requires the %qs option", name, "-mmma");
+  break;
+default:
+case ENB_ALWAYS:
+  gcc_unreachable ();
+};
 }
 
 /* Target hook for early folding of built-ins, shamelessly stolen
@@ -14356,7 +14425,33 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
 rtx
 rs6000_expand_ldst_mask (rtx target, tree arg0)
  {
-  return target;
+  int icode2 = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr_direct
+   : (int) CODE_FOR_altivec_lvsl_direct);
+  machine_mode tmode = insn_data[icode2].operand[0].mode;
+  machine_mode mode = insn_data[icode2].operand[1].mode;
+  rtx op, addr, pat;
+
+  gcc_assert (TARGET_ALTIVEC);
+
+  gcc_assert (POINTER_TYPE_P (TREE_TYPE (arg0)));
+  op = expand_expr (arg0, NULL_RTX, Pmode, EXPAND_NORMAL);
+  addr = memory_address (mode, op);
+  /* We need to negate the address.  */
+  op = gen_reg_rtx (GET_MODE (addr));
+  emit_insn (gen_rtx_SET (op, gen_rtx_NEG (GET_MODE (addr), addr)));
+  op = gen_rtx_MEM (mode, op);
+
+  if (target == 0
+  || GET_MODE (target) != tmode
+  || ! (*insn_data[icode2].operand[0].predicate) (target, tmode))
+target = gen_reg_rtx (tmode);
+
+  pat = GEN_FCN (icode2) (target, op);
+  if (!pat)
+return 0;
+  emit_insn (pat);
+
+   return target;
  }
 
 /* Expand the CPU builtin in FCODE and store the result in TARGET.  */
@@ -15249,6 +15344,10 @@ rs6000_init_builtins (void)
   /* Execute the autogenerated initialization code for builtins.  */
   rs6000_autoinit_builtins ();
 
+  if (new_builtins_are_live)
+altivec_builtin_mask_for_load
+  = rs6000_builtin_decls_x[RS6000_BIF_MASK_FOR_LOAD];
+
   if (new_builtins_are_live)
 {
 #ifdef SUBTARGET_INIT_BUILTINS
-- 
2.27.0
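
Most cases in the rs6000_invalid_new_builtin switch differ only in the option string they report. Purely as an illustration, and not as something the patch does, the same mapping could be written as a lookup table. The enum and function names below are hypothetical, and only a few of the enable classes are shown; the option strings are the ones that appear in the patch.

```c
#include <stddef.h>

/* Hypothetical subset of the enable classes handled by the switch.  */
enum demo_enable
{
  DEMO_ENB_ALWAYS,
  DEMO_ENB_ALTIVEC,
  DEMO_ENB_VSX,
  DEMO_ENB_HTM,
  DEMO_ENB_MMA,
  DEMO_ENB_COUNT
};

/* Option each class requires; NULL for the class that is always enabled
   and therefore never produces a diagnostic.  */
static const char *const demo_required_option[DEMO_ENB_COUNT]
  = { NULL, "-maltivec", "-mvsx", "-mhtm", "-mmma" };

/* Return the option string for ENABLE, or NULL when there is none or
   ENABLE is out of range.  */
const char *
demo_option_for (enum demo_enable enable)
{
  if ((unsigned) enable >= DEMO_ENB_COUNT)
    return NULL;

  return demo_required_option[enable];
}
```

A table like this would not cover the cases whose message has a different shape, such as the 64-bit variants that name three options, which may be one reason to keep an explicit switch.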


