[PATCH][DOCS][pushed] Improve JSON examples.
The patch improves the JSON examples so that they are valid JSON. That will help us with syntax highlighting in Sphinx-generated documentation.

Pushed to master.

Martin

gcc/ChangeLog:

        * doc/gcov.texi: Create proper JSON files.
        * doc/invoke.texi: Remove dots in order to make it a valid JSON object.
---
 gcc/doc/gcov.texi   | 50 ++---
 gcc/doc/invoke.texi |  3 +--
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index 32b51f984bc..6a5760e5ebe 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -191,11 +191,11 @@ Structure of the JSON is following:
 @smallexample
 @{
-  "current_working_directory": @var{current_working_directory},
-  "data_file": @var{data_file},
-  "format_version": @var{format_version},
-  "gcc_version": @var{gcc_version}
-  "files": [@var{file}]
+  "current_working_directory": "foo/bar",
+  "data_file": "a.out",
+  "format_version": "1",
+  "gcc_version": "11.1.1 20210510"
+  "files": ["$file"]
 @}
 @end smallexample
@@ -220,9 +220,9 @@ Each @var{file} has the following form:
 @smallexample
 @{
-  "file": @var{file_name},
-  "functions": [@var{function}],
-  "lines": [@var{line}]
+  "file": "a.c",
+  "functions": ["$function"],
+  "lines": ["$line"]
 @}
 @end smallexample
@@ -237,15 +237,15 @@ Each @var{function} has the following form:
 @smallexample
 @{
-  "blocks": @var{blocks},
-  "blocks_executed": @var{blocks_executed},
-  "demangled_name": "@var{demangled_name},
-  "end_column": @var{end_column},
-  "end_line": @var{end_line},
-  "execution_count": @var{execution_count},
-  "name": @var{name},
-  "start_column": @var{start_column}
-  "start_line": @var{start_line}
+  "blocks": 2,
+  "blocks_executed": 2,
+  "demangled_name": "foo",
+  "end_column": 1,
+  "end_line": 4,
+  "execution_count": 1,
+  "name": "foo",
+  "start_column": 5,
+  "start_line": 1
 @}
 @end smallexample
@@ -289,11 +289,11 @@ Each @var{line} has the following form:
 @smallexample
 @{
-  "branches": [@var{branch}],
-  "count": @var{count},
-  "line_number": @var{line_number},
-  "unexecuted_block": @var{unexecuted_block}
-  "function_name": @var{function_name},
+  "branches": ["$branch"],
+  "count": 2,
+  "line_number": 15,
+  "unexecuted_block": false,
+  "function_name": "foo",
 @}
 @end smallexample
@@ -320,9 +320,9 @@ Each @var{branch} has the following form:
 @smallexample
 @{
-  "count": @var{count},
-  "fallthrough": @var{fallthrough},
-  "throw": @var{throw}
+  "count": 11,
+  "fallthrough": true,
+  "throw": false
 @}
 @end smallexample
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6063e466c13..24dc0491901 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -5149,8 +5149,7 @@ might be printed in JSON form (after formatting) like this:
 @}
 ]
 "column-origin": 1,
-@},
-@dots{}
+@}
 ]
 @end smallexample
-- 
2.31.1
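One way to check that an example of this kind really is valid JSON is to feed it to a strict parser. The sketch below is a hypothetical check, not part of the patch: it runs Python's standard `json` module on the function example from gcov.texi (with the Texinfo `@{` and `@}` escapes written as plain braces) and on a made-up fragment that lacks a comma between two members. The helper name `is_valid_json` is an assumption introduced for illustration.

```python
import json

# The "function" example from the patched gcov.texi, with @{ and @}
# written as plain braces.
FUNCTION_EXAMPLE = """
{
  "blocks": 2,
  "blocks_executed": 2,
  "demangled_name": "foo",
  "end_column": 1,
  "end_line": 4,
  "execution_count": 1,
  "name": "foo",
  "start_column": 5,
  "start_line": 1
}
"""

# Hypothetical fragment (not taken from the patch): no comma between members.
MISSING_COMMA = '{"a": 1 "b": 2}'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_json(FUNCTION_EXAMPLE))
    print(is_valid_json(MISSING_COMMA))
```

Running the same check over each `@smallexample` body would flag a missing or trailing comma before the documentation is built.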
Re: GCC Mission Statement
On 6/9/21 10:39 AM, Valentino Giudice wrote:
> I was aware of that announcement, but it doesn't mention the mission statement at all.
>
> It appears that the decision in question was, at the time, in contrast with the mission statement (rather than guided by it).
>
> If the Steering Committee updates the mission statement, it may appear that the mission statement follows the decisions of the steering committee (in place of the contrary). In that case, what would be the purpose of a mission statement?
>
> The mission statement was also updated beyond simply making it consistent with the change: in "Supporting the goals of the GNU project, as defined by the FSF" the reference to the FSF was removed.

Quite a few projects under the GNU project[1] have dissociated themselves from the FSF, so "as defined by the FSF" perhaps doesn't apply as consistently as it did before. That is my understanding anyway; maybe there's more context that others may be able to add.

Siddhesh

[1] https://gnu.tools/en/software/
[Bug fortran/100961] Intrinsic function as value to a class(*) assumed rank argument fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100961

--- Comment #3 from martin ---
Created attachment 50968
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50968&action=edit
second test case which segfaults

Playing around with some variants in select_rank_expression2.f90, I can see that I sometimes get correct results; sometimes the rank is correctly recognised but not the type; and (as is the case for the attachment select_rank_expression2.f90) it can even segfault with an invalid memory access. I get all these behaviours by selecting different sets of the four "call p(..)" lines and varying the order in which they are executed.
[RFC/PATCH] ira: Consider matching constraints with param [PR100328]
Hi,

PR100328 has some details about this issue; I will try to summarize it here. In the hottest function LBM_performStreamCollideTRT of the SPEC2017 benchmark 519.lbm_r, there are many FMA-style expressions (27 FMA, 19 FMS, 11 FNMA). On rs6000, this kind of FMA-style insn has two flavors, FLOAT_REG and VSX_REG. The VSX_REG register class has 64 registers, the first 32 of which make up the whole of FLOAT_REG. There are some differences between these two flavors. Taking "*fma<mode>4_fpr" as an example:

(define_insn "*fma<mode>4_fpr"
  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa,wa")
        (fma:SFDF
          (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,wa,wa")
          (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,wa,0")
          (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,wa")))]

// wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
// <Ff> (f/d) => A floating point register, aka. FLOAT_REG.

So for VSX_REG we only have the destructive form: when a VSX_REG alternative is used, operand 2 or operand 3 is required to be the same as operand 0. reload has to take care of this constraint and create some non-free register copies if required.

Assuming one fma insn looks like:

  op0 = FMA (op1, op2, op3)

The best regclass for all of them is VSX_REG. When op1, op2 and op3 are all dead, IRA simply creates three shuffle copies for them (here the operand order matters, since with the same freq the one with the smaller number takes preference), but IMO both op2 and op3 should take higher priority in the copy queue due to the matching constraint.

I noticed that there is one function, ira_get_dup_out_num, which is meant to create this kind of constraint copy, but the code below appears to refuse to create one if there is an alternative which has a valid regclass that is not likely to be spilled:

      default:
        {
          enum constraint_num cn = lookup_constraint (str);
          enum reg_class cl = reg_class_for_constraint (cn);
          if (cl != NO_REGS
              && !targetm.class_likely_spilled_p (cl))
            goto fail
          ...

I cooked up the attached patch to make IRA respect this kind of matching constraint, guarded by one parameter. As I stated in the PR, I am not sure this is on the right track. The RFC patch checks the matching constraint in all alternatives: if there is one alternative with a matching constraint that matches the current preferred regclass (or best of allocno?), it records the output operand number and further creates one constraint copy for it. Normally this copy gets priority over the shuffle copies, so the matching constraint is satisfied with higher probability and reload does not have to create extra copies to meet the matching constraint or the desirable register class.

For FMA A,B,C,D, I think ideally the copies A/B, A/C, A/D can first stay as shuffle copies. Later, if any of A,B,C,D gets assigned a hardware register which is a VSX register (VSX_REG) but not an FP register (FLOAT_REG), we have to pay costs whenever we can NOT go with the VSX alternatives, so at that time it is important to respect the matching constraint, and we can then increase the freq for the remaining copies related to this (A/B, A/C, A/D). This idea requires some side tables to record some information and seems a bit complicated in the current framework, so the proposed patch aggressively emphasizes the matching constraint at the time of creating copies.

Any comments are highly appreciated!

BR,
Kewen
---
 gcc/config/rs6000/rs6000.c |  3 ++
 gcc/ira.c                  | 69 ++
 gcc/params.opt             |  4 +++
 3 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5ae40d6f4ce..eb9c4284f91 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4852,6 +4852,9 @@ rs6000_option_override_internal (bool global_init_p)
          ap = __builtin_next_arg (0).  */
       if (DEFAULT_ABI != ABI_V4)
         targetm.expand_builtin_va_start = NULL;
+
+      SET_OPTION_IF_UNSET (&global_options, &global_options_set,
+                           param_ira_consider_dup_in_all_alts, 1);
     }
 
   rs6000_override_options_after_change ();
diff --git a/gcc/ira.c b/gcc/ira.c
index b93588d8a9f..beebee7499b 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -1937,10 +1939,16 @@ ira_get_dup_out_num (int op_num, alternative_mask alts)
     return -1;
   str = recog_data.constraints[op_num];
   use_commut_op_p = false;
+
+  rtx op = recog_data.operand[op_num];
+  int op_no = reg_or_subregno (op);
+  enum reg_class op_pref_cl = reg_preferred_class (op_no);
+  machine_mode op_mode = GET_MODE (op);
+
   for (;;)
     {
-      rtx op = recog_data.operand[op_num];
-
+      bool saw_reg_cstr = false;
+
       for (curr_alt = 0, ignore_p = !TEST_BIT (alts, curr_alt),
            original = -1;;)
         {
@@ -1963,9 +1971,25 @@ ira_get_dup_out_num (int op_num, alternative_mask alts)
             {
               enum constraint_num cn =
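To make the notion of a matching constraint concrete, here is a toy model in Python. It is only a sketch under stated assumptions and is not how ira_get_dup_out_num is implemented: constraint strings are taken to be comma-separated alternatives, a matching constraint is modelled as an alternative made up solely of digits, and `<Ff>` is used as a placeholder for the floating-point register constraint of the first alternative. The function name `matching_alternatives` is hypothetical.

```python
def matching_alternatives(constraints, op_num):
    """Map alternative index -> matched operand number for operand op_num.

    constraints holds one string per operand, with alternatives separated
    by ','.  A matching constraint is modelled as an alternative consisting
    only of digits once leading modifier characters are dropped.
    """
    result = {}
    for alt, text in enumerate(constraints[op_num].split(",")):
        text = text.lstrip("=+%&")
        if text.isdigit():
            result[alt] = int(text)
    return result


# Constraint strings modelled on the fma pattern quoted in the message;
# "<Ff>" is a placeholder for the floating-point register constraint.
FMA_CONSTRAINTS = ["=<Ff>,wa,wa", "%<Ff>,wa,wa", "<Ff>,wa,0", "<Ff>,0,wa"]

if __name__ == "__main__":
    for op in (1, 2, 3):
        print(op, matching_alternatives(FMA_CONSTRAINTS, op))
```

In this model operand 1 never has to match the output, while operands 2 and 3 each have one VSX alternative that ties them to operand 0, which is the asymmetry the message argues should be reflected in copy priority.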
[Bug fortran/100961] Intrinsic function as value to a class(*) assumed rank argument fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100961

--- Comment #2 from martin ---
It is releases/gcc-11.1.0:

Using built-in specs.
COLLECT_GCC=gfortran-11
COLLECT_LTO_WRAPPER=.../gcc/lib/gcc/x86_64-linux-gnu/11.1.0/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../gcc-repo/configure --program-suffix=-11 --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-arch=westmere --prefix=.../gcc --enable-languages=c,c++,fortran --disable-multilib --disable-bootstrap --enable-checking=release
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.1.0 (GCC)

The code is compiled with "-g select_rank_expression.f90 -o select_rank_expression.x".
[Bug target/100085] Bad code for union transfer from __float128 to vector types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #10 from luoxhu at gcc dot gnu.org --- float128 to vector __int128 is fixed by: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f700e4b0ee3ef53b48975cf89be26b9177e3a3f3
Re: GCC Mission Statement
Thank you.

> Well there was an announcement; the changes in the mission statement reflect
> the new reality introduced by that announcement:
>
> https://gcc.gnu.org/pipermail/gcc/2021-June/236182.html
>
> Siddhesh

I was aware of that announcement, but it doesn't mention the mission statement at all.

It appears that the decision in question was, at the time, in contrast with the mission statement (rather than guided by it).

If the Steering Committee updates the mission statement, it may appear that the mission statement follows the decisions of the steering committee (in place of the contrary). In that case, what would be the purpose of a mission statement?

The mission statement was also updated beyond simply making it consistent with the change: in "Supporting the goals of the GNU project, as defined by the FSF" the reference to the FSF was removed.

Was there any announcement about the update of the mission statement itself? On what basis does the Steering Committee change the mission statement?
Re: GCC Mission Statement
On 6/9/21 10:13 AM, Valentino Giudice via Gcc wrote:
> Hi.
>
> The Mission Statement of the GCC project recently changed without any announcement.

Well there was an announcement; the changes in the mission statement reflect the new reality introduced by that announcement:

https://gcc.gnu.org/pipermail/gcc/2021-June/236182.html

Siddhesh
[Bug c++/100983] New: Deduction guide for member template class rejected at class scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100983

            Bug ID: 100983
           Summary: Deduction guide for member template class rejected at class scope
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: brycelelbach at gmail dot com
  Target Milestone: ---

```
struct X {
  template struct Y {
    template Y(Ts...) {}
  };
  template Y(Ts...) -> Y;
};
```

I'm fairly confident this is legal code, but GCC rejects it, stating that a deduction guide is only allowed at namespace scope.

http://eel.is/c++draft/temp.deduct.guide#3.sentence-4 says:

"A deduction-guide shall inhabit the scope to which the corresponding class template belongs and, for a member class template, have the same access."

... which suggests to me that it is allowed.

https://godbolt.org/z/cWa69scjW
GCC Mission Statement
Hi.

The Mission Statement of the GCC project recently changed without any announcement.

I am not a contributor to GCC, merely a user. However, I'd like to understand more, especially about the transparency of the project.

The GCC Steering Committee is supposed to follow the mission statement as a guide for its decisions. Who changes the mission statement, and for what reason? How can a modification of the statement be guided by the mission statement? How were users and contributors informed of this?

Thank you in advance for your response.

Best regards.

For reference:
- The GCC homepage states the SC is "guided by the mission statement": https://gcc.gnu.org/
- The mission statement before the update: https://web.archive.org/web/20210331192925/https://gcc.gnu.org/gccmission.html
[Bug libstdc++/100982] New: wrong constraint in std::optional::operator=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100982

            Bug ID: 100982
           Summary: wrong constraint in std::optional::operator=
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hewillk at gmail dot com
  Target Milestone: ---

There is a typo in optional#L818:

  template<typename _Up>
    enable_if_t<__and_v<__not_<is_same<_Tp, _Up>>,
                        is_constructible<_Tp, const _Up&>,
                        is_assignable<_Tp&, _Up>,
                        __not_<__converts_from_optional<_Tp, _Up>>,
                        __not_<__assigns_from_optional<_Tp, _Up>>>,
                optional&>
    operator=(const optional<_Up>& __u)

It should be is_assignable<_Tp&, const _Up&>.

https://godbolt.org/z/x7Gb9a5v9

#include <optional>

struct U {};

struct T {
  explicit T(const U&);
  T& operator=(const U&);
  T& operator=(U&&) = delete;
};

int main() {
  std::optional<U> opt1;
  std::optional<T> opt2;
  opt2 = opt1;
}
[Bug target/100981] New: ICE in info_for_reduction, at tree-vect-loop.c:4897
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100981

            Bug ID: 100981
           Summary: ICE in info_for_reduction, at tree-vect-loop.c:4897
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: asolokha at gmx dot com
  Target Milestone: ---
            Target: aarch64-linux-gnu

gfortran-12.0.0-alpha20210606 snapshot (g:fed94fc9e704b0de228499495b7ca4d4c79ef76b) ICEs when compiling the following testcase w/ -march=armv8.3-a -O3 -ftree-parallelize-loops=2 -fno-signed-zeros -fno-trapping-math:

complex function cdcdot(n, cx)
  implicit none
  integer :: n, i, kx
  complex :: cx(*)
  double precision :: dsdotr, dsdoti, dt1, dt3
  kx = 1
  do i = 1, n
     dt1 = real(cx(kx))
     dt3 = aimag(cx(kx))
     dsdotr = dsdotr + dt1 * 2 - dt3 * 2
     dsdoti = dsdoti + dt1 * 2 + dt3 * 2
     kx = kx + 1
  end do
  cdcdot = cmplx(real(dsdotr), real(dsdoti))
  return
end function cdcdot

% aarch64-linux-gnu-gfortran-12.0.0 -march=armv8.3-a -O3 -ftree-parallelize-loops=2 -fno-signed-zeros -fno-trapping-math -c xrvsc8ow.f90
during GIMPLE pass: vect
xrvsc8ow.f90:9:8:

    9 |   do i = 1, n
      |        ^
internal compiler error: in info_for_reduction, at tree-vect-loop.c:4897
0x7c8b0d info_for_reduction(vec_info*, _stmt_vec_info*)
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-loop.c:4897
0x122d008 vectorizable_live_operation(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*, int, bool, vec*)
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-loop.c:8547
0x11ed1d7 can_vectorize_live_stmts
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-stmts.c:10619
0x1216858 vect_transform_stmt(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*)
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-stmts.c:11003
0x124b296 vect_schedule_slp_node
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-slp.c:6302
0x12596cc vect_schedule_scc
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-slp.c:6516
0x125a71f vect_schedule_slp(vec_info*, vec<_slp_instance*, va_heap, vl_ptr>)
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-slp.c:6580
0x1236e7c vect_transform_loop(_loop_vec_info*, gimple*)
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vect-loop.c:9538
0x1265f0f try_vectorize_loop_1
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vectorizer.c:1104
0x1266ca0 vectorize_loops()
        /var/tmp/portage/cross-aarch64-linux-gnu/gcc-12.0.0_alpha20210606/work/gcc-12-20210606/gcc/tree-vectorizer.c:1243
Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]
On 2021/6/9 04:11, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jun 04, 2021 at 09:40:58AM +0800, Xionghu Luo wrote:
>>>> Combine still fails to merge the two instructions:
>>>>
>>>> Trying 6 -> 7:
>>>>     6: r120:KF#0=r125:KF#0<-<0x40
>>>>       REG_DEAD r125:KF
>>>>     7: [sfp:DI+r123:DI]=r120:KF#0<-<0x40
>>>>       REG_DEAD r120:KF
>>>> Successfully matched this instruction:
>>>> (set (mem/c:V1TI (plus:DI (reg/f:DI 110 sfp)
>>>>             (reg:DI 123)) [1 S16 A128])
>>>>     (subreg:V1TI (reg:KF 125) 0))
>>>> rejecting combination of insns 6 and 7
>>>> original costs 4 + 4 = 8
>>>> replacement cost 12
>>>
>>> So what instructions were these?  Why did the store cost 4 but the new
>>> one costs 12?
>
> The *vsx_le_perm_store_<mode> instruction has the *preferred*
> alternative with cost 12, while the other alternative has cost 8.  Why
> is that?  That looks like a bug.
>

(set_attr "length" "12,8")

The 12 was introduced by Mike's commit c477a6674364 (r6-2577), and all five vsx_le_perm_store_<mode> patterns are set to 12, for modes VSX_D/VSX_W/V8HI/V16QI/VSX_LE_128. I guess the pattern is split into two rs6000_emit_le_vsx_permute calls before reload but into three rs6000_emit_le_vsx_permute calls after reload, so the length is 12; in that case it also does not seem reasonable to change it from 12 to 8? And I am not sure when alternative 1 will be chosen.

vsx.md:

;; The post-reload split requires that we re-permute the source
;; register in case it is still live.
(define_split
  [(set (match_operand:VSX_LE_128 0 "memory_operand")
        (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
  "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
   && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
  [(const_int 0)]
{
  rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
  rs6000_emit_le_vsx_permute (operands[0], operands[1], <MODE>mode);
  rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
  DONE;
})

>>>> By hacking the vsx_le_perm_store_v1ti INSN_COST from 12 to 8,
>>>
>>> It should be the same cost as the other store!
>>
>> vsx_le_permute_v1ti's cost is defined to 4 in vsx.md:
>
> Yes.  Why is alternative 0 of *vsx_le_perm_store_<mode> said to have a
> length of 3 insns?
>
>
> Segher
>

--
Thanks,
Xionghu
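The arithmetic behind the rejection in the combine dump can be sketched as below. This is a deliberately simplified model, assuming only the rule visible in the dump (a replacement costlier than the sum of the original insns is rejected) and fixed-width 4-byte instructions; the real cost check in combine considers more than this, and both function names are hypothetical.

```python
def combine_accepts(original_costs, replacement_cost):
    """Toy cost check: accept unless the replacement is costlier."""
    return replacement_cost <= sum(original_costs)


def insns_from_length(length_in_bytes, bytes_per_insn=4):
    """Number of fixed-width instructions implied by a "length" attribute."""
    return length_in_bytes // bytes_per_insn


if __name__ == "__main__":
    # Costs from the dump: 4 + 4 = 8 originally, 12 for the replacement.
    print(combine_accepts([4, 4], 12))
    # The same combination with the store hacked to cost 8.
    print(combine_accepts([4, 4], 8))
    # A "length" of 12 corresponds to 3 insns, a length of 8 to 2.
    print(insns_from_length(12), insns_from_length(8))
```

Under this model the combination goes through as soon as the store alternative is costed at 8, which matches the experiment described in the thread.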
[Bug gcov-profile/100980] New: [GCOV]The assignment statement in the “for” structure caused the wrong coverage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100980

            Bug ID: 100980
           Summary: [GCOV]The assignment statement in the “for” structure caused the wrong coverage
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: gcov-profile
          Assignee: unassigned at gcc dot gnu.org
          Reporter: njuwy at smail dot nju.edu.cn
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure -enable-checking=release -enable-languages=c,c++ -disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (GCC)

$ cat test.c
#include
extern void abort(void);
extern void exit(int);
int main(void) {
  struct foo{
    int i0;
};
  int b,c,d=1;
  for ((b = sizeof(struct foo {
    int i0;
    int i1;
}));
       d; d--)
c = sizeof(struct foo);
}

$ gcc -O0 --coverage test.c;./a.out;gcov test;cat test.c.gcov
File 'test.c'
Lines executed:100.00% of 5
Creating 'test.c.gcov'

        -:    0:Source:test.c
        -:    0:Graph:test.gcno
        -:    0:Data:test.gcda
        -:    0:Runs:1
        -:    1:#include
        -:    2:extern void abort(void);
        -:    3:extern void exit(int);
        1:    4:int main(void) {
        -:    5:  struct foo{
        -:    6:    int i0;
        -:    7:};
        1:    8:  int b,c,d=1;
        2:    9:  for ((b = sizeof(struct foo {
        -:   10:    int i0;
        -:   11:    int i1;
        -:   12:}));
        1:   13:       d; d--)
        1:   14:c = sizeof(struct foo);
        -:   15:}

Line 9 is wrongly marked as executed 2 times.
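For readers who want to compare counts like these mechanically, the following is a small sketch of a parser for the `count:line:source` layout of a `.gcov` file. It is an illustration rather than a tool shipped with gcov, and it assumes the default (non human-readable) count format: `-` marks a line without executable code, `#####` or `=====` an unexecuted line, and a trailing `*` a line with unexecuted blocks. The name `parse_gcov_line` is hypothetical.

```python
def parse_gcov_line(line):
    """Split one .gcov line into (count, line_number, source).

    count is None for lines without executable code ('-') and 0 for
    unexecuted lines ('#####' or '=====').
    """
    count_text, lineno_text, source = line.split(":", 2)
    count_text = count_text.strip()
    if count_text == "-":
        count = None
    elif count_text in ("#####", "====="):
        count = 0
    else:
        # A trailing '*' flags unexecuted blocks; drop it to get the number.
        count = int(count_text.rstrip("*"))
    return count, int(lineno_text), source


if __name__ == "__main__":
    sample = [
        "        1:    8:  int b,c,d=1;",
        "        2:    9:  for ((b = sizeof(struct foo {",
        "        -:   10:    int i0;",
    ]
    for text in sample:
        print(parse_gcov_line(text))
```

With the report's output, this yields a count of 2 for line 9 and 1 for line 8, which is the discrepancy being reported.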
Re: [PATCH v2] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]
On 2021/6/9 05:07, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Jun 08, 2021 at 09:11:33AM +0800, Xionghu Luo wrote:
>> On P8LE, extra rot64+rot64 load or store instructions are generated
>> in float128 to vector __int128 conversion.
>>
>> This patch teaches pass swaps to also handle such patterns to remove
>> extra swap instructions.
>
>> +/* Return 1 iff PAT is a rotate 64 bit expression; else return 0.  */
>> +
>> +static bool
>> +pattern_is_rotate64_p (rtx pat)
>
> You already have a verb in the name, don't use _p please (and preferably
> just don't use it at all, "pattern_is_rotate64" is much better than
> "pattern_rotate64_p").
>
>> +{
>> +  rtx rot = SET_SRC (pat);
>
> So this is assuming PAT is a SINGLE_SET.  Please say that in the
> function comment.
>
> /* Return 1 iff PAT (a SINGLE_SET) is a rotate 64 bit expression; else
>    return 0.  */
>
> You can do an assert for that as well, but I wouldn't bother.
>
>> @@ -266,6 +280,9 @@ insn_is_load_p (rtx insn)
>
> (I do realise you just copied existing naming, don't worry :-) )
>
>> @@ -392,7 +411,8 @@ quad_aligned_load_p (swap_web_entry *insn_entry,
>> rtx_insn *insn)
>>       false.  */
>>    rtx body = PATTERN (def_insn);
>>    if (GET_CODE (body) != SET
>> -      || GET_CODE (SET_SRC (body)) != VEC_SELECT
>> +      || !(GET_CODE (SET_SRC (body)) == VEC_SELECT
>> +      || pattern_is_rotate64_p (body))
>
> Broken indentation: the || should align with "pattern...".
>
>> @@ -2223,9 +2246,9 @@ static void
>>  recombine_stvx_pattern (rtx_insn *insn, del_info *to_delete)
>>  {
>>    rtx body = PATTERN (insn);
>> -  gcc_assert (GET_CODE (body) == SET
>> -              && MEM_P (SET_DEST (body))
>> -              && GET_CODE (SET_SRC (body)) == VEC_SELECT);
>> +  gcc_assert (GET_CODE (body) == SET && MEM_P (SET_DEST (body))
>> +              && (GET_CODE (SET_SRC (body)) == VEC_SELECT
>> +                  || pattern_is_rotate64_p (body)));
>
> Please start a new line for every "&&" here.  The way it was was more
> readable.
>
> It often is nice to keep things on one line, if it fits on one line.
> If it does not, make a new line for every phrase.  This is more readable
> because you can then just scan down the line of "&&" and see the start
> of every phrase without actually having to read it all.
>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> b/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> index 5895416e985..a1f09df8a57 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/float128-call.c
>> @@ -21,5 +21,5 @@
>>   TYPE one (void) { return ONE; }
>>   void store (TYPE a, TYPE *p) { *p = a; }
>>
>> -/* { dg-final { scan-assembler "lxvd2x 34" } } */
>> -/* { dg-final { scan-assembler "stxvd2x 34" } } */
>> +/* { dg-final { scan-assembler "lvx 2" } } */
>> +/* { dg-final { scan-assembler "stvx 2" } } */
>
> Huh.  Is that correct?  Where did the other 32 loads and stores go?  Are
> there now other insns generated that you should scan for?

This is an expected change: lxvd2x+xxpermdi is replaced by lvx, so there is no need to scan for other instructions. Similarly for stvx. 34 and 2 are *vector register names*, not counts.

diff float128-call.trunk.s float128-call.patched.s
18,19c18
<       lxvd2x 34,0,9
<       xxpermdi 34,34,34,2
---
>       lvx 2,0,9
33,34c32
<       xxpermdi 34,34,34,2
<       stxvd2x 34,0,5
---
>       stvx 2,0,5

Thanks for all the other comments, updated and committed with r12-1316.

BR,
Xionghu
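Why a pair of rot64 operations is removable can be seen from plain integer arithmetic: rotating a 128-bit value by 64 bits swaps its two doublewords, so doing it twice gives back the original value. The sketch below models only that value-level identity in Python; it says nothing about how the swaps pass recognises or deletes the instructions, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def rot64(value):
    """Rotate a 128-bit value by 64 bits, i.e. swap its two doublewords."""
    value &= MASK128
    return ((value << 64) | (value >> 64)) & MASK128


def doublewords(value):
    """Return the (high, low) 64-bit halves of a 128-bit value."""
    return (value >> 64) & MASK64, value & MASK64


if __name__ == "__main__":
    x = (0x1111111111111111 << 64) | 0x2222222222222222
    print([hex(d) for d in doublewords(x)])
    print([hex(d) for d in doublewords(rot64(x))])
    print(rot64(rot64(x)) == x)
```

A rotate on the load side followed by a rotate on the store side is therefore a no-op on the data, which is what allows the swapping load and store sequences to be replaced by plain ones.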
PING^3 [PATCH/RFC] combine: Tweak the condition of last_set invalidation
Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html

BR,
Kewen

on 2021/5/26 11:04 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> Gentle ping this:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html
>
> BR,
> Kewen
>
> on 2021/5/7 10:45 AM, Kewen.Lin via Gcc-patches wrote:
>> Hi Segher,
>>
>>>> I think this should be postponed to stage 1 though?  Or is there
>>>> anything very urgent in it?
>>>
>>> Yeah, I agree that this belongs to stage1, and there isn't anything
>>> urgent about it.  Thanks for all further comments above!
>>>
>>
>> Gentle ping this:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html
>>
>> BR,
>> Kewen
>>
PING^2 [PATCH] rs6000: Support more short/char to float conversion
Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569792.html

BR,
Kewen

on 2021/5/26 11:02 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> Gentle ping this:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569792.html
>
>
> BR,
> Kewen
>
> on 2021/5/7 10:30 AM, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> In some cases, when we load unsigned char/short values from
>> unsigned char/short memory and convert them to double/single
>> precision floating point values, there are implicit conversions
>> to int first.  This keeps GCC from leveraging the P9 instructions
>> lxsibzx/lxsihzx.  This patch adds the related
>> define_insn_and_split to support this kind of scenario.
>>
>> Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
>> powerpc64-linux-gnu P8.
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> --
>> gcc/ChangeLog:
>>
>>      * config/rs6000/rs6000.md
>>      (floatsi2_lfiwax__mem_zext): New
>>      define_insn_and_split.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      * gcc.target/powerpc/p9-fpcvt-3.c: New test.
>>
>
PING^1 [PATCH v2] rs6000: Add load density heuristic
Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571258.html

BR,
Kewen

on 2021/5/26 10:59 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> This is the updated version of the patch to deal with the bwaves_r degradation due to vector construction fed by strided loads.
>
> Following Richi's comments [1], this takes a similar approach and overprices the vector construction fed by VMAT_ELEMENTWISE or VMAT_STRIDED_SLP. Instead of adding the extra cost immediately when costing the vector construction, it first records how many loads and vectorized statements there are in the given loop; later, in rs6000_density_test (called by finish_cost), it computes the load density ratio against all vectorized stmts and checks it against the corresponding thresholds DENSITY_LOAD_NUM_THRESHOLD and DENSITY_LOAD_PCT_THRESHOLD, doing the actual extra pricing if both thresholds are exceeded.
>
> Note that this new load density heuristic check is based on some fields in the target cost which are updated as needed when scanning each add_stmt_cost entry; it is independent of the current function rs6000_density_test, which requires scanning non_vect stmts. Since it checks the load stmt count against all vectorized stmts, it is a kind of density, so I put it in function rs6000_density_test. For the same reason, to keep it independent, I didn't put it as an else arm of the existing density threshold check hunk or before this hunk.
>
> In the investigation of the -1.04% degradation from 526.blender_r on Power8, I noticed that the extra penalized cost of 320 on one single vector construction with type V16QI is much exaggerated, which makes the final body cost unreliable, so this patch adds a maximum bound for the extra penalized cost for each vector construction statement.
>
> Bootstrapped/regtested on powerpc64le-linux-gnu P9.
>
> Full SPEC2017 performance evaluation on Power8/Power9 with
> option combinations:
>   * -O2 -ftree-vectorize {,-fvect-cost-model=very-cheap} {,-ffast-math}
>   * {-O3, -Ofast} {,-funroll-loops}
>
> The bwaves_r degradations on P8/P9 have been fixed; nothing else
> remarkable was observed.
>
> Is it ok for trunk?
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570076.html
>
> BR,
> Kewen
> -
> gcc/ChangeLog:
>
>       * config/rs6000/rs6000.c (struct rs6000_cost_data): New members
>       nstmts, nloads and extra_ctor_cost.
>       (rs6000_density_test): Add load density related heuristics and the
>       checks, do extra costing on vector construction statements if needed.
>       (rs6000_init_cost): Init new members.
>       (rs6000_update_target_cost_per_stmt): New function.
>       (rs6000_add_stmt_cost): Factor vect_nonmem hunk out to function
>       rs6000_update_target_cost_per_stmt and call it.
>
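The shape of the heuristic described in the quoted patch can be sketched as follows. This is a hedged illustration and not the code in rs6000.c: the two thresholds are passed in as parameters because their values are not given in the message, the numbers used in the example are made up, and the function names are hypothetical. Only the structure (count loads and vectorized statements, compare against a count threshold and a percentage threshold, and bound the per-statement penalty) is taken from the description.

```python
def extra_ctor_penalty(nloads, nstmts, extra_ctor_cost,
                       num_threshold, pct_threshold):
    """Return the extra vector-construction cost to add to the loop body.

    The penalty applies only when both the number of loads and the
    percentage of loads among all vectorized statements exceed their
    thresholds.
    """
    if nstmts == 0:
        return 0
    load_pct = nloads * 100 // nstmts
    if nloads > num_threshold and load_pct > pct_threshold:
        return extra_ctor_cost
    return 0


def capped_ctor_cost(raw_cost, max_cost):
    """Bound the penalty recorded for one vector construction statement."""
    return min(raw_cost, max_cost)


if __name__ == "__main__":
    # Hypothetical thresholds: more than 20 loads and more than 45% loads.
    print(extra_ctor_penalty(30, 50, 100, num_threshold=20, pct_threshold=45))
    print(extra_ctor_penalty(30, 100, 100, num_threshold=20, pct_threshold=45))
    # The 320 from the message, bounded by a made-up maximum of 100.
    print(capped_ctor_cost(320, 100))
```

Deferring the decision to the end of costing is what lets the check look at the whole loop instead of pricing each construction in isolation.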
[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794

Kewen Lin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Kewen Lin ---
Fixed on trunk, will continue to refactor the tree_predictive_commoning_loop and its callees into class and member functions as suggested.
[Bug tree-optimization/100925] [12 Regression] tree check fail in make_range_step with -O1 in reassoc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100925

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572317.html
           Keywords|                            |patch

--- Comment #7 from Andrew Pinski ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572317.html
[PATCH 2/2] Disallow pointer and offset types on some gimple
From: Andrew Pinski

While debugging PR 100925, I found that the gimple verifiers don't reject NEGATE on pointer or offset types. This patch adds the check to some unary and binary gimple codes which should not operate on pointer/offset types.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

gcc/ChangeLog:

        * tree-cfg.c (verify_gimple_assign_unary): Reject pointer
        and offset types on NEGATE_EXPR, ABS_EXPR, BIT_NOT_EXPR,
        PAREN_EXPR and CONJ_EXPR.
        (verify_gimple_assign_binary): Reject pointer and offset types on
        MULT_EXPR, MULT_HIGHPART_EXPR, TRUNC_DIV_EXPR, CEIL_DIV_EXPR,
        FLOOR_DIV_EXPR, ROUND_DIV_EXPR, TRUNC_MOD_EXPR, CEIL_MOD_EXPR,
        FLOOR_MOD_EXPR, ROUND_MOD_EXPR, RDIV_EXPR, and EXACT_DIV_EXPR.
---
 gcc/tree-cfg.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 02256580c98..90fe4775405 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,15 @@ verify_gimple_assign_unary (gassign *stmt)
     case BIT_NOT_EXPR:
     case PAREN_EXPR:
     case CONJ_EXPR:
+      /* Disallow pointer and offset types for many of the unary gimple. */
+      if (POINTER_TYPE_P (lhs_type)
+          || TREE_CODE (lhs_type) == OFFSET_TYPE)
+        {
+          error ("invalid types for %qs", code_name);
+          debug_generic_expr (lhs_type);
+          debug_generic_expr (rhs1_type);
+          return true;
+        }
       break;
 
     case ABSU_EXPR:
@@ -4127,6 +4136,19 @@ verify_gimple_assign_binary (gassign *stmt)
     case ROUND_MOD_EXPR:
     case RDIV_EXPR:
     case EXACT_DIV_EXPR:
+      /* Disallow pointer and offset types for many of the binary gimple. */
+      if (POINTER_TYPE_P (lhs_type)
+          || TREE_CODE (lhs_type) == OFFSET_TYPE)
+        {
+          error ("invalid types for %qs", code_name);
+          debug_generic_expr (lhs_type);
+          debug_generic_expr (rhs1_type);
+          debug_generic_expr (rhs2_type);
+          return true;
+        }
+      /* Continue with generic binary expression handling.  */
+      break;
+
     case MIN_EXPR:
     case MAX_EXPR:
     case BIT_IOR_EXPR:
-- 
2.27.0
[PATCH 1/2] Fix PR 100925: Limit some a?CST1:CST2 optimizations to integral types only
From: Andrew Pinski The problem here is that with offset (and pointer) types we produce a negative expression when this optimization hits. It is easier to disable this optimization for all non-integral types instead of finding an integer type which has the same precision as the type to do the negative expression on it. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/100925 * match.pd (a ? CST1 : CST2): Limit transformations that would produce a negative to integral types only. Change !POINTER_TYPE_P to INTEGRAL_TYPE_P also. gcc/testsuite/ChangeLog: * g++.dg/torture/pr100925.C: New test. --- gcc/match.pd| 8 gcc/testsuite/g++.dg/torture/pr100925.C | 24 2 files changed, 28 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/torture/pr100925.C diff --git a/gcc/match.pd b/gcc/match.pd index d06ff170684..bf22bc3a198 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3733,10 +3733,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (integer_onep (@1)) (convert (convert:boolean_type_node @0))) /* a ? -1 : 0 -> -a. */ -(if (integer_all_onesp (@1)) +(if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1)) (negate (convert (convert:boolean_type_node @0 /* a ? powerof2cst : 0 -> a << (log2(powerof2cst)) */ -(if (!POINTER_TYPE_P (type) && integer_pow2p (@1)) +(if (INTEGRAL_TYPE_P (type) && integer_pow2p (@1)) (with { tree shift = build_int_cst (integer_type_node, tree_log2 (@1)); } @@ -3750,10 +3750,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (integer_onep (@2)) (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))) /* a ? -1 : 0 -> -(!a). */ - (if (integer_all_onesp (@2)) + (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2)) (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } /* a ?
powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */ - (if (!POINTER_TYPE_P (type) && integer_pow2p (@2)) + (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@2)) (with { tree shift = build_int_cst (integer_type_node, tree_log2 (@2)); } diff --git a/gcc/testsuite/g++.dg/torture/pr100925.C b/gcc/testsuite/g++.dg/torture/pr100925.C new file mode 100644 index 000..de13950dca0 --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr100925.C @@ -0,0 +1,24 @@ +// { dg-do compile } + +struct QScopedPointerDeleter { + static void cleanup(int *); +}; +class QScopedPointer { + typedef int *QScopedPointer::*RestrictedBool; + +public: + operator RestrictedBool() { return d ? nullptr : ::d; } + void reset() { +if (d) + QScopedPointerDeleter::cleanup(d); + } + int *d; +}; +class DOpenGLPaintDevicePrivate { +public: + QScopedPointer fbo; +} DOpenGLPaintDeviceresize_d; +void DOpenGLPaintDeviceresize() { + if (DOpenGLPaintDeviceresize_d.fbo) +DOpenGLPaintDeviceresize_d.fbo.reset(); +} -- 2.27.0
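As an illustration of why these rewrites are safe on integral types, here is a small sketch in plain C++ of the source-level identities the match.pd patterns rely on. The function names are made up for this note, and int32_t is just one example of an integral type; nothing here is taken from GCC itself. A pointer-to-member result such as the one in the new testcase has no comparable negate or shift, which is what the INTEGRAL_TYPE_P guard is there for.

```cpp
#include <cassert>
#include <cstdint>

// a ? -1 : 0, written directly.
int32_t select_all_ones (bool a) { return a ? -1 : 0; }

// The rewritten form: negate the boolean converted to the result type.
int32_t negate_form (bool a) { return -static_cast<int32_t> (a); }

// a ? powerof2cst : 0, shown here for 8 == 1 << 3.
int32_t select_pow2 (bool a) { return a ? 8 : 0; }

// The rewritten form: shift the converted boolean by log2 (8) == 3.
int32_t shift_form (bool a) { return static_cast<int32_t> (a) << 3; }
```

For both values of a, select_all_ones agrees with negate_form and select_pow2 agrees with shift_form.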
[PATCH] Improvements to fur_source interface class, enhanced stmt folding options.
I recently introduced the fur_source class as an intermediary between the Fold_Using_Ranges (FUR) class and where to pick up any ssa_names that it needs. The initial idea was to abstract out a set of frequently changing parameters so the client fold routines wouldn't have to change every time we added a new way to do something with a statement. It's also used by gori_compute when unwinding to allow for access to non-range-ops stmts when processing. That said, I hadn't really formalized it, so fold_using_ranges was accessing its members frequently. We have encountered an opportunity to add something else which is useful, but where the internals should be hidden. This patch a) formalizes the API (hiding the internals) b) virtualizes the functions so we can use inheritance and not use conditions, and c) adds the ability to pick up operands from a vector or list of ranges. There is no real visual difference to consumers since it's an interface layer they don't normally see. The net effect is now there are multiple versions of fold_stmt that all behave quite nicely: bool fold_range (irange , gimple *s, range_query *q = NULL); bool fold_range (irange , gimple *s, edge on_edge, range_query *q = NULL); bool fold_range (irange , gimple *s, irange ); bool fold_range (irange , gimple *s, irange , irange ); bool fold_range (irange , gimple *s, unsigned num_elements, irange *vector); Now we can calculate ranges for a stmt, ask for its range to be calculated as if it were on an edge, or we can supply one or more ranges to it to have the fold performed. This latter set is akin to the old gimple_range_fold() routine we had, except it only worked on a range-ops stmt, whereas this will work on any kind of stmt, including a PHI node. The routines have all been enhanced so that if a range_query is not provided, it will invoke the default range_query.
It will also invoke the default query if a list of ranges is supplied and it requires additional ranges to resolve the stmt being queried. There will probably be some additional tweaks going forward, especially since the list routines haven't really been tested. Aldy will be using them shortly, so that will be the test bed :-) Performance is basically a wash since there is a slight overhead for the virtual function calls, but it is offset by no longer having to do any conditional checks in the get_operand() routine. Bootstraps on x86_64-pc-linux-gnu with no regressions. Pushed. Andrew commit 87f9ac937d6cfd81cbbe0a43518ba10781888d7c Author: Andrew MacLeod Date: Tue Jun 8 15:43:03 2021 -0400 Virtualize fur_source and turn it into a proper API. No more accessing the local info. Also add fur_source/fold_stmt where ranges are provided via being specified, or a vector to replace gimple_fold_range. * gimple-range-gori.cc (gori_compute::outgoing_edge_range_p): Use a fur_stmt source record. * gimple-range.cc (fur_source::get_operand): Generic range query. (fur_source::get_phi_operand): New. (fur_source::register_dependency): New. (fur_source::query): New. (class fur_edge): New. Edge source for operands. (fur_edge::fur_edge): New. (fur_edge::get_operand): New. (fur_edge::get_phi_operand): New. (fur_edge::query): New. (fur_stmt::fur_stmt): New. (fur_stmt::get_operand): New. (fur_stmt::get_phi_operand): New. (fur_stmt::query): New. (class fur_depend): New. Statement source and process dependencies. (fur_depend::fur_depend): New. (fur_depend::register_dependency): New. (class fur_list): New. List source for operands. (fur_list::fur_list): New. (fur_list::get_operand): New. (fur_list::get_phi_operand): New. (fold_range): New. Instantiate appropriate fur_source class and fold. (fold_using_range::range_of_range_op): Use new API. (fold_using_range::range_of_address): Ditto. (fold_using_range::range_of_phi): Ditto. (gimple_ranger::fold_range_internal): Use fur_depend class.
(fold_using_range::range_of_ssa_name_with_loop_info): Use new API. * gimple-range.h (class fur_source): Now a base class. (class fur_stmt): New. (fold_range): New prototypes. (fur_source::fur_source): Delete. diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc index 2c5360690db..09dcd694319 100644 --- a/gcc/gimple-range-gori.cc +++ b/gcc/gimple-range-gori.cc @@ -1008,7 +1008,7 @@ gori_compute::outgoing_edge_range_p (irange , edge e, tree name, if (!stmt) return false; - fur_source src (, NULL, e, stmt); + fur_stmt src (stmt, ); // If NAME can be calculated on the edge, use that. if (is_export_p (name, e->src)) diff --git
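To make the "virtualize and use inheritance instead of conditions" idea concrete, here is a heavily simplified sketch. It is not the fur_source API: Range, OperandSource, ListSource and fold_plus are hypothetical names invented for this note, and the "widest range" constants are arbitrary. The only point is that the fold routine talks to a virtual get_operand() and does not care whether operands come from a default query or from a caller-supplied list.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for irange: a closed interval.
struct Range { long lo; long hi; };

// Base class: answers "what is the range of operand INDEX?".
class OperandSource
{
public:
  virtual ~OperandSource () = default;
  // Default behaviour in this sketch: nothing is known, so return an
  // arbitrary "widest" range.
  virtual Range get_operand (unsigned /*index*/) const
  {
    return Range{-1000000, 1000000};
  }
};

// Operands supplied by the caller as a list, in the spirit of fur_list.
class ListSource : public OperandSource
{
public:
  explicit ListSource (std::vector<Range> ranges)
    : m_ranges (std::move (ranges)) {}

  Range get_operand (unsigned index) const override
  {
    // Fall back to the base-class answer when the list runs out.
    return index < m_ranges.size () ? m_ranges[index]
                                    : OperandSource::get_operand (index);
  }

private:
  std::vector<Range> m_ranges;
};

// A "fold" that only uses the interface: the range of op0 + op1.
Range fold_plus (const OperandSource &src)
{
  Range a = src.get_operand (0);
  Range b = src.get_operand (1);
  return Range{a.lo + b.lo, a.hi + b.hi};
}
```

Passing a ListSource holding [1, 2] and [10, 20] to fold_plus yields [11, 22]; passing a plain OperandSource yields the sum of two "unknown" ranges. No conditional on the kind of source appears in fold_plus itself.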
[Bug c++/100796] [11 Regression] GCC does not honor #pragma diagnostic ignored when using the integrated preprocessor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100796 Jason Merrill changed: What|Removed |Added CC||jason at gcc dot gnu.org --- Comment #3 from Jason Merrill --- (In reply to Giuseppe D'Angelo from comment #0) > Please advise as of what kind of test I could do > / provide to help you track this one down. The testcase doesn't really need to be reduced, just separated from the Qt build system: some source files and a compiler command line would be fine.
[PATCH] Use range based loops to iterate over vec<> in various places
Hello, This makes things a good bit shorter, and reduces complexity by removing a bunch of index variables. bootstrapped and regtested on x86_64-linux-gnu, ok? Trev gcc/analyzer/ChangeLog: * call-string.cc (call_string::call_string): Iterate over vec<> with range based for. (call_string::operator=): Likewise. (call_string::to_json): Likewise. (call_string::hash): Likewise. (call_string::calc_recursion_depth): Likewise. * checker-path.cc (checker_path::fixup_locations): Likewise. * constraint-manager.cc (equiv_class::equiv_class): Likewise. (equiv_class::to_json): Likewise. (equiv_class::hash): Likewise. (constraint_manager::constraint_manager): Likewise. (constraint_manager::operator=): Likewise. (constraint_manager::hash): Likewise. (constraint_manager::to_json): Likewise. (constraint_manager::add_unknown_constraint): Likewise. * engine.cc (impl_region_model_context::on_svalue_leak): Likewise. (on_liveness_change): Likewise. (impl_region_model_context::on_unknown_change): Likewise. * program-state.cc (extrinsic_state::to_json): Likewise. (sm_state_map::set_state): Likewise. * region-model.cc (make_test_compound_type): Likewise. (test_canonicalization_4): Likewise. gcc/ChangeLog: * auto-profile.c (afdo_find_equiv_class): Iterate over vec<> with range based for. * cgraphclones.c (cgraph_node::create_clone): Likewise. (cgraph_node::create_version_clone): Likewise. * dwarf2out.c (output_call_frame_info): Likewise. * gcc.c (do_specs_vec): Likewise. (do_spec_1): Likewise. (driver::set_up_specs): Likewise. * gimple-loop-jam.c (any_access_function_variant_p): Likewise. * ifcvt.c (cond_move_process_if_block): Likewise. * ipa-modref.c (modref_lattice::add_escape_point): Likewise. (analyze_parms): Likewise. (modref_write_escape_summary): Likewise. (update_escape_summary_1): Likewise. * ipa-prop.h (ipa_copy_agg_values): Likewise. (ipa_release_agg_values): Likewise. * lower-subreg.c (decompose_multiword_subregs): Likewise. 
* lto-streamer-out.c (DFS::DFS_write_tree_body): Likewise. (hash_tree): Likewise. (prune_offload_funcs): Likewise. * sel-sched-dump.c (dump_insn_vector): Likewise. * timevar.c (timer::named_items::print): Likewise. * tree-cfgcleanup.c (cleanup_control_flow_pre): Likewise. (cleanup_tree_cfg_noloop): Likewise. * tree-data-ref.c (dump_data_references): Likewise. (print_dir_vectors): Likewise. (print_dist_vectors): Likewise. (dump_data_dependence_relation): Likewise. (dump_data_dependence_relations): Likewise. (dump_dist_dir_vectors): Likewise. (dump_ddrs): Likewise. (prune_runtime_alias_test_list): Likewise. (create_runtime_alias_checks): Likewise. (free_subscripts): Likewise. (save_dist_v): Likewise. (save_dir_v): Likewise. (invariant_access_functions): Likewise. (same_access_functions): Likewise. (access_functions_are_affine_or_constant_p): Likewise. (compute_all_dependences): Likewise. (find_data_references_in_stmt): Likewise. (graphite_find_data_references_in_stmt): Likewise. (free_dependence_relations): Likewise. (free_data_refs): Likewise. * tree-into-ssa.c (dump_currdefs): Likewise. (rewrite_update_phi_arguments): Likewise. * tree-ssa-phiopt.c (cond_if_else_store_replacement): Likewise. * tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise. * tree-ssa-structalias.c (constraint_set_union): Likewise. (merge_node_constraints): Likewise. (move_complex_constraints): Likewise. (do_deref): Likewise. (get_constraint_for_address_of): Likewise. (get_constraint_for_1): Likewise. (process_all_all_constraints): Likewise. (make_constraints_to): Likewise. (handle_rhs_call): Likewise. * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr): Likewise. (vect_slp_analyze_node_dependences): Likewise. (vect_slp_analyze_instance_dependence): Likewise. (vect_record_base_alignments): Likewise. (vect_get_peeling_costs_all_drs): Likewise. (vect_peeling_supportable): Likewise. * tree-vectorizer.c (vec_info::~vec_info): Likewise. (vec_info::free_stmt_vec_infos): Likewise. 
gcc/c/ChangeLog: * c-parser.c (c_parser_translation_unit): Iterate over vec<> with range based for. (c_parser_postfix_expression): Likewise. gcc/cp/ChangeLog: * constexpr.c (cxx_eval_call_expression): Iterate over vec<> with range based for. (cxx_eval_store_expression): Likewise.
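For anyone skimming the ChangeLog without the diff, the kind of change is sketched below. This uses std::vector as a stand-in for GCC's vec<> and two made-up functions; it is an assumption-laden illustration of the before/after loop style, not code from the patch.

```cpp
#include <cassert>
#include <vector>

// Index-based loop, the style being replaced: needs a separate index
// variable and an explicit bound check.
int sum_indexed (const std::vector<int> &v)
{
  int total = 0;
  for (unsigned i = 0; i < v.size (); i++)
    total += v[i];
  return total;
}

// Range-based loop: same result, no index variable to declare.
int sum_ranged (const std::vector<int> &v)
{
  int total = 0;
  for (int elt : v)
    total += elt;
  return total;
}
```

Both functions return the same sum for any input; the second form is what "Iterate over vec<> with range based for" refers to.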
[Bug c++/100838] [11 Regression] -fno-elide-constructors for C++14 missing required destructor call since r11-5872-g4eb28483004f8291
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100838 Jason Merrill changed: What|Removed |Added Summary|[11/12 Regression] |[11 Regression] |-fno-elide-constructors for |-fno-elide-constructors for |C++14 missing required |C++14 missing required |destructor call since |destructor call since |r11-5872-g4eb28483004f8291 |r11-5872-g4eb28483004f8291 --- Comment #4 from Jason Merrill --- Fixed for 12 so far.
[Bug c++/89062] class template argument deduction failure with parentheses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89062 Marek Polacek changed: What|Removed |Added CC||brycelelbach at gmail dot com --- Comment #7 from Marek Polacek --- *** Bug 100979 has been marked as a duplicate of this bug. ***
[Bug c++/100979] Nested CTAD fails when the outer object is direct initialized and the inner object is list initialized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100979 Marek Polacek changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED CC||mpolacek at gcc dot gnu.org --- Comment #1 from Marek Polacek --- I think it's a dup. *** This bug has been marked as a duplicate of bug 89062 ***
[Bug c++/100879] [10/11/12 Regression] gcc is complaining of a signed compare when comparing enums of different types (same underlying type)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100879 Jason Merrill changed: What|Removed |Added Target Milestone|10.4|12.0 Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #2 from Jason Merrill --- Fixed for GCC 12, thanks. That the warning used -Wsign-compare seems to be because it was associated with that option before -Wenum-compare was added, and never updated perhaps because it was dead code for a long time.
[Bug c++/100879] [10/11/12 Regression] gcc is complaining of a signed compare when comparing enums of different types (same underlying type)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100879 --- Comment #1 from CVS Commits --- The master branch has been updated by Jason Merrill : https://gcc.gnu.org/g:087253b9951766cbd93286b804ebb1ab59197aa8 commit r12-1314-g087253b9951766cbd93286b804ebb1ab59197aa8 Author: Jason Merrill Date: Tue Jun 8 17:48:49 2021 -0400 c++: remove redundant warning [PR100879] Before my r277864, build_new_op promoted enums to int before passing them on to cp_build_binary_op; after that commit, it doesn't, so warn_for_sign_compare sees the enum operands and gives a redundant warning. This warning dates back to 1995, and seems to have been dead code for a long time--likely since build_new_op was added in 1997--so let's just remove it. PR c++/100879 gcc/c-family/ChangeLog: * c-warn.c (warn_for_sign_compare): Remove C++ enum mismatch warning. gcc/testsuite/ChangeLog: * g++.dg/diagnostic/enum3.C: New test.
[pushed] c++: remove redundant warning [PR100879]
Before my r277864, build_new_op promoted enums to int before passing them on to cp_build_binary_op; after that commit, it doesn't, so warn_for_sign_compare sees the enum operands and gives a redundant warning. This warning dates back to 1995, and seems to have been dead code for a long time--likely since build_new_op was added in 1997--so let's just remove it. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/100879 gcc/c-family/ChangeLog: * c-warn.c (warn_for_sign_compare): Remove C++ enum mismatch warning. gcc/testsuite/ChangeLog: * g++.dg/diagnostic/enum3.C: New test. --- gcc/c-family/c-warn.c | 12 gcc/testsuite/g++.dg/diagnostic/enum3.C | 9 + 2 files changed, 9 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/g++.dg/diagnostic/enum3.C diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c index a587b993fde..cd3c99ef4df 100644 --- a/gcc/c-family/c-warn.c +++ b/gcc/c-family/c-warn.c @@ -2240,18 +2240,6 @@ warn_for_sign_compare (location_t location, int op1_signed = !TYPE_UNSIGNED (TREE_TYPE (orig_op1)); int unsignedp0, unsignedp1; - /* In C++, check for comparison of different enum types. */ - if (c_dialect_cxx() - && TREE_CODE (TREE_TYPE (orig_op0)) == ENUMERAL_TYPE - && TREE_CODE (TREE_TYPE (orig_op1)) == ENUMERAL_TYPE - && TYPE_MAIN_VARIANT (TREE_TYPE (orig_op0)) -!= TYPE_MAIN_VARIANT (TREE_TYPE (orig_op1))) -{ - warning_at (location, - OPT_Wsign_compare, "comparison between types %qT and %qT", - TREE_TYPE (orig_op0), TREE_TYPE (orig_op1)); -} - /* Do not warn if the comparison is being done in a signed type, since the signed type will only be chosen if it can represent all the values of the unsigned type. 
*/ diff --git a/gcc/testsuite/g++.dg/diagnostic/enum3.C b/gcc/testsuite/g++.dg/diagnostic/enum3.C new file mode 100644 index 000..d51aa8a0f70 --- /dev/null +++ b/gcc/testsuite/g++.dg/diagnostic/enum3.C @@ -0,0 +1,9 @@ +// PR c++/100879 +// { dg-additional-options -Werror=sign-compare } + +enum e1 { e1val }; +enum e2 { e3val }; + +int main( int, char * [] ) { + if ( e1val == e3val ) return 1; // { dg-warning -Wenum-compare } +} base-commit: 61fc01806f376a780978a6dea165ec3dadef085b -- 2.27.0
[PATCH] c++: Failure to delay noexcept parsing with ptr-operator [PR100752]
We weren't passing 'flags' to the recursive call to cp_parser_declarator in the ptr-operator case and as a result, delayed parsing of noexcept didn't work as advertised. The following change passes more than just CP_PARSER_FLAGS_DELAY_NOEXCEPT but that doesn't seem to break anything. I'm not passing member_p because I don't need it and because it breaks a few tests. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/branches? PR c++/100752 gcc/cp/ChangeLog: * parser.c (cp_parser_declarator): Pass flags down to cp_parser_declarator. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/noexcept69.C: New test. --- gcc/cp/parser.c | 3 +-- gcc/testsuite/g++.dg/cpp0x/noexcept69.C | 12 2 files changed, 13 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept69.C diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index d59a829d0b9..5930990ec1c 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -22066,8 +22066,7 @@ cp_parser_declarator (cp_parser* parser, cp_parser_parse_tentatively (parser); /* Parse the dependent declarator. */ - declarator = cp_parser_declarator (parser, dcl_kind, -CP_PARSER_FLAGS_NONE, + declarator = cp_parser_declarator (parser, dcl_kind, flags, /*ctor_dtor_or_conv_p=*/NULL, /*parenthesized_p=*/NULL, /*member_p=*/false, diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept69.C b/gcc/testsuite/g++.dg/cpp0x/noexcept69.C new file mode 100644 index 000..9b87ba0cafb --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/noexcept69.C @@ -0,0 +1,12 @@ +// PR c++/100752 +// { dg-do compile { target c++11 } } + +struct S { + void f() noexcept {} + S () noexcept(noexcept(f())) { f(); return *this; } +}; + +struct X { + int& f() noexcept(noexcept(i)); + int i; +}; base-commit: c4574d23cb07340918793a5a98ae7bb2988b3791 -- 2.31.1
[PATCH 3/3] Add IEEE 128-bit fp conditional move on PowerPC.
[PATCH 3/3] Add IEEE 128-bit fp conditional move on PowerPC. This patch adds the support for power10 IEEE 128-bit floating point conditional move and for automatically generating min/max. In this patch, I simplified things compared to previous patches. Instead of allowing any four of the modes to be used for the conditional move comparison and the move itself could use different modes, I restricted the conditional move to just the same mode. I.e. you can do: _Float128 a, b, c, d, e, r; r = (a == b) ? c : d; But you can't do: _Float128 c, d, r; double a, b; r = (a == b) ? c : d; or: _Float128 a, b; double c, d, r; r = (a == b) ? c : d; This eliminates a lot of the complexity of the code, because you don't have to worry about the sizes being different, and the IEEE 128-bit types being restricted to Altivec registers, while the SF/DF modes can use any VSX register. I did not modify the existing support that allowed conditional moves where SFmode operands are compared and DFmode operands are moved (and vice versa). Compared to the May 18th patches, this patch replaces the complicated test that was complained about. I tested it on 3 platforms: * Power9 little endian, --with-code=power9; * Power8 big endian, --with-code=power8, both 32/64-bit tests done; * Power10 little endian, --with-code=power10. All systems bootstrapped and there were no new regressions. I believe I have addressed the issues with the last patch. Can I check this into the master branch, and after a soak-in period, back port it to the GCC 11 branch? gcc/ 2021-06-08 Michael Meissner * config/rs6000/rs6000.c (rs6000_maybe_emit_fp_cmove): Add IEEE 128-bit floating point conditional move support. (have_compare_and_set_mask): Add IEEE 128-bit floating point types. * config/rs6000/rs6000.md (movcc, IEEE128 iterator): New insn. (movcc_p10, IEEE128 iterator): New insn. (movcc_invert_p10, IEEE128 iterator): New insn. (fpmask, IEEE128 iterator): New insn. (xxsel, IEEE128 iterator): New insn. 
gcc/testsuite/ 2021-06-08 Michael Meissner * gcc.target/powerpc/float128-cmove.c: New test. * gcc.target/powerpc/float128-minmax-3.c: New test. --- gcc/config/rs6000/rs6000.c| 38 ++- gcc/config/rs6000/rs6000.md | 106 ++ .../gcc.target/powerpc/float128-cmove.c | 58 ++ .../gcc.target/powerpc/float128-minmax-3.c| 15 +++ 4 files changed, 215 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 1651788df6a..411e7539019 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -15698,8 +15698,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false, return 1; } -/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or - minimum with "C" semantics. +/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a + maximum or minimum with "C" semantics. Unless you use -ffast-math, you can't use these instructions to replace conditions that implicitly reverse the condition because the comparison @@ -15775,6 +15775,7 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond) enum rtx_code code = GET_CODE (op); rtx op0 = XEXP (op, 0); rtx op1 = XEXP (op, 1); + machine_mode compare_mode = GET_MODE (op0); machine_mode result_mode = GET_MODE (dest); rtx compare_rtx; rtx cmove_rtx; @@ -15783,6 +15784,35 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond) if (!can_create_pseudo_p ()) return 0; + /* We allow the comparison to be either SFmode/DFmode and the true/false + condition to be either SFmode/DFmode. I.e. we allow: + + float a, b; + double c, d, r; + + r = (a == b) ? c : d; + +and: + + double a, b; + float c, d, r; + + r = (a == b) ? c : d; + +but we don't allow intermixing the IEEE 128-bit floating point types with +the 32/64-bit scalar types. 
+ +It gets too messy where SFmode/DFmode can use any register and TFmode/KFmode +can only use Altivec registers. In addtion, we would need to do a XXPERMDI +if we compare SFmode/DFmode and move TFmode/KFmode. */ + + if (compare_mode == result_mode + || (compare_mode == SFmode && result_mode == DFmode) + || (compare_mode == DFmode && result_mode == SFmode)) +; + else +return false; + switch (code) { case EQ: @@ -15835,6 +15865,10 @@ have_compare_and_set_mask (machine_mode mode) case E_DFmode: return TARGET_P9_MINMAX; +case E_KFmode: +case E_TFmode: + return TARGET_POWER10 &&
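The mode restriction described in this message boils down to a small predicate. The sketch below restates it in standalone C++; Mode and cmove_modes_ok are hypothetical names used only here (the real code works on machine_mode values inside rs6000_maybe_emit_fp_cmove), so treat it as a reading aid rather than the implementation.

```cpp
#include <cassert>

// Hypothetical stand-ins for the machine modes named in the patch:
// SF/DF are 32/64-bit scalars, KF/TF are the IEEE 128-bit modes.
enum class Mode { SF, DF, KF, TF };

// True when a conditional move comparing in COMPARE_MODE and moving in
// RESULT_MODE is allowed under the rule described above: identical
// modes, or an SF/DF mix in either direction.
bool cmove_modes_ok (Mode compare_mode, Mode result_mode)
{
  return compare_mode == result_mode
         || (compare_mode == Mode::SF && result_mode == Mode::DF)
         || (compare_mode == Mode::DF && result_mode == Mode::SF);
}
```

So cmove_modes_ok (Mode::KF, Mode::KF) and cmove_modes_ok (Mode::SF, Mode::DF) hold, while mixing DF with KF in either position does not, which matches the "you can do / you can't do" examples in the message.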
[PATCH 2/3] Fix IEEE 128-bit min/max test.
[PATCH 2/3] Fix IEEE 128-bit min/max test. This patch fixes the float128-minmax.c test so that it can accommodate the generation of xsmincqp and xsmaxcqp instructions on power10. I changed the effective target from 'float128' to 'ppc_float128_hw', since this needs the IEEE 128-bit float hardware support. I tested it on 3 platforms: * Power9 little endian, --with-code=power9; * Power8 big endian, --with-code=power8, both 32/64-bit tests done; * Power10 little endian, --with-code=power10. All systems bootstrapped and there were no new regressions. I believe I have addressed the issues with the last patch. Can I check this into the master branch, and after a soak-in period, back port it to the GCC 11 branch? gcc/testsuite/ 2021-06-08 Michael Meissner * gcc.target/powerpc/float128-minmax.c: Adjust expected code for power10. * lib/target-supports.exp (check_effective_target_has_arch_pwr10): New target support. --- gcc/testsuite/gcc.target/powerpc/float128-minmax.c | 8 +--- gcc/testsuite/lib/target-supports.exp | 10 ++ 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c index fe397518f2f..a7d3a3a0b3e 100644 --- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c @@ -1,6 +1,5 @@ -/* { dg-do compile { target lp64 } } */ /* { dg-require-effective-target powerpc_p9vector_ok } */ -/* { dg-require-effective-target float128 } */ +/* { dg-require-effective-target ppc_float128_hw } */ /* { dg-options "-mpower9-vector -O2 -ffast-math" } */ #ifndef TYPE @@ -12,5 +11,8 @@ TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); } TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); } -/* { dg-final { scan-assembler {\mxscmpuqp\M} } } */ +/* Adjust code power10 which has native min/max instructions. */ +/* { dg-final { scan-assembler {\mxscmpuqp\M} { target { ! 
has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler {\mxsmincqp\M} { target { has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler {\mxsmaxcqp\M} { target { has_arch_pwr10 } } } } */ /* { dg-final { scan-assembler-not {\mbl\M} } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 7f78c5593ac..789723fb287 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -6127,6 +6127,16 @@ proc check_effective_target_has_arch_pwr9 { } { }] } +proc check_effective_target_has_arch_pwr10 { } { + return [check_no_compiler_messages arch_pwr10 assembly { + #ifndef _ARCH_PWR10 + #error does not have power10 support. + #else + /* "has power10 support" */ + #endif + }] +} + # Return 1 if this is a PowerPC target supporting -mcpu=power10. # Limit this to 64-bit linux systems for now until other targets support # power10. -- 2.31.1 -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
[PATCH 1/3] Add IEEE 128-bit min/max support on PowerPC.
[PATCH 1/3] Add IEEE 128-bit min/max support on PowerPC. This patch adds the support for the IEEE 128-bit floating point C minimum and maximum instructions. The next patch will add the support for using the compare and set mask instruction to implement conditional moves. This patch does not try to re-use the code used for SF/DF min/max support. It defines a separate insn for the IEEE 128-bit support. It uses the code iterator to simplify adding both operations. GCC will not convert ternary operations into using min/max instructions provided in this patch unless the user uses -Ofast or similar switches due to issues with NaNs. The next patch that adds conditional move instructions will enable the ternary conversion in many cases. Note the code for fixing float128-minmax.c has been moved to a separate patch. I tested it on 3 platforms: * Power9 little endian, --with-code=power9; * Power8 big endian, --with-code=power8, both 32/64-bit tests done; * Power10 little endian, --with-code=power10. All systems bootstrapped and there were no new regressions. I believe I have addressed the issues with the last patch. Can I check this into the master branch, and after a soak-in period, back port it to the GCC 11 branch? gcc/ 2021-06-08 Michael Meissner * config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA 3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions. * config/rs6000/rs6000.md (s3, IEEE128 iterator): New insns. gcc/testsuite/ 2021-06-08 Michael Meissner * gcc.target/powerpc/float128-minmax-2.c: New test. 
--- gcc/config/rs6000/rs6000.c| 3 ++- gcc/config/rs6000/rs6000.md | 11 +++ .../gcc.target/powerpc/float128-minmax-2.c| 15 +++ 3 files changed, 28 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index b01bb5c8191..1651788df6a 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -16103,7 +16103,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1) /* VSX/altivec have direct min/max insns. */ if ((code == SMAX || code == SMIN) && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) - || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode + || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode)) + || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode { emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1))); return; diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 3f59b544f6a..064c3a2d9d6 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -5214,6 +5214,17 @@ (define_insn "*s3_vsx" } [(set_attr "type" "fp")]) +;; Min/max for ISA 3.1 IEEE 128-bit floating point +(define_insn "s3" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (fp_minmax:IEEE128 +(match_operand:IEEE128 1 "altivec_register_operand" "v") +(match_operand:IEEE128 2 "altivec_register_operand" "v")))] + "TARGET_POWER10" + "xscqp %0,%1,%2" + [(set_attr "type" "vecfloat") + (set_attr "size" "128")]) + ;; The conditional move instructions allow us to perform max and min operations ;; even when we don't have the appropriate max/min instruction using the FSEL ;; instruction. 
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c new file mode 100644 index 000..c71ba08c9f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c @@ -0,0 +1,15 @@ +/* { dg-require-effective-target ppc_float128_hw } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */ + +#ifndef TYPE +#define TYPE _Float128 +#endif + +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a + call. */ +TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); } +TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); } + +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */ +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */ -- 2.31.1 -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
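As background for the remark about NaNs and -Ofast, the sketch below shows how a plain ternary minimum and the library fmin can disagree when one operand is a NaN. It uses double rather than _Float128 so it builds anywhere, and c_min is a name made up for this note; it says nothing about the exact behaviour of the xsmincqp/xsmaxcqp instructions beyond what the message states.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Minimum written as the literal C ternary.  If either operand is a NaN
// the comparison is false, so B is returned.
double c_min (double a, double b) { return a < b ? a : b; }
```

With a quiet NaN in the second position, c_min (1.0, NaN) returns the NaN, whereas std::fmin (1.0, NaN) returns 1.0. Without NaNs the two agree, which is the situation -ffast-math lets the compiler assume.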
[PATCH 0/3] Add Power10 IEEE 128-bit min, max, conditional move
This is a revision of the patches I sent on May 18th. I tested it on 3 platforms: * Power9 little endian, --with-code=power9; * Power8 big endian, --with-code=power8, both 32/64-bit tests done; * Power10 little endian, --with-code=power10. All systems bootstrapped and there were no new regressions. I believe I have addressed the issues with the last patch. The first patch in this set contains the same GCC code and new test as in the previous patch, since I don't believe there was a problem with those bits. I moved the changes for the existing test 'float128-minmax.c' to patch number two. Rather than using '#pragma GCC target' to force power9 code generation on power10, I used conditional scan-assembler statements to delineate the power9 and power10 code generation. The third patch of this set fixes the complicated test that was complained about in the previous second patch. Can I check these patches into the master branch? Ideally, I think these should go into GCC 11.2 after a soak-in period. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
[Bug c++/100956] Unused variable warnings ignore "if constexpr" blocks where variables are conditionally used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100956 --- Comment #2 from Matt Bentley --- Thank you - I'm aware GCC might optimize it out (and failed to test with GCC10), at least in O2 mode, but other compilers might not, hence the code.
Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
On Mon, Jun 07, 2021 at 05:31:50PM -0500, Segher Boessenkool wrote: > On Tue, May 18, 2021 at 04:28:27PM -0400, Michael Meissner wrote: > > In this patch, I simplified things compared to previous patches. Instead of > > allowing any four of the modes to be used for the conditional move > > comparison > > and the move itself could use different modes, I restricted the conditional > > move to just the same mode. I.e. you can do: > > > > _Float128 a, b, c, d, e, r; > > > > r = (a == b) ? c : d; > > > > But you can't do: > > > > _Float128 c, d, r; > > double a, b; > > > > r = (a == b) ? c : d; > > > > or: > > > > _Float128 a, b; > > double c, d, r; > > > > r = (a == b) ? c : d; > > > > This eliminates a lot of the complexity of the code, because you don't have > > to > > worry about the sizes being different, and the IEEE 128-bit types being > > restricted to Altivec registers, while the SF/DF modes can use any VSX > > register. > > You do not have to worry about that anyway. You can just reuse the > existing rs6000_maybe_emit_fp_cmove. Or why not? The IF_THEN_ELSE we > generate there should work fine? > > > +(define_expand "movcc" > > + [(set (match_operand:IEEE128 0 "gpc_reg_operand") > > +(if_then_else:IEEE128 (match_operand 1 "comparison_operator") > > + (match_operand:IEEE128 2 "gpc_reg_operand") > > + (match_operand:IEEE128 3 "gpc_reg_operand")))] > > + "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" > > +{ > > + if (rs6000_emit_cmove (operands[0], operands[1], operands[2], > > operands[3])) > > +DONE; > > + else > > +FAIL; > > +}) > > Why is this a special pattern anyway? Why can you not do > d = cond ? x : y; > with cond any comparison, not even including any floating point > possibly? Well in theory you can certainly do this, we just need to add the necessary code to implement it. It quickly becomes an exponential cascading pattern, where you have one set of modes for the comparison and a different set of modes for the movement. 
I've certainly seen instances where the code has an integer comparison and then a FP move. We can do this via a SETBC type instruction, direct move, and then XXSEL. But that is beyond the scope of this patch. If you remember, the original form of this patch allowed the comparison to be SF, DF, KF, and possibly TF, along with the move. It becomes complicated when you have to consider that SF/DF comparisons only fill the upper 64 bits of the vector register with the comparison, and the IEEE 128-bit types need to be in Altivec registers. So I scaled back the patch to just allow 128-bit conditional move. I left in the existing 64/32-bit mixture because at least one benchmark used it in the past. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
On Mon, Jun 07, 2021 at 03:25:06PM -0500, Segher Boessenkool wrote: > On Tue, May 18, 2021 at 04:26:06PM -0400, Michael Meissner wrote: > > This patch adds the support for the IEEE 128-bit floating point C minimum > > and > > maximum instructions. > > > gcc/ > > 2021-05-18 Michael Meissner > > > > * config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA > > 3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp > > instructions. > > 3.1 fits on the previous line (it is better to not split numbers to a > new line). What is up with the weird multiple spaces? We don't align > the right border in changelogs :-) > > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c > > @@ -0,0 +1,15 @@ > > +/* { dg-require-effective-target ppc_float128_hw } */ > > +/* { dg-require-effective-target power10_ok } */ > > Is this needed? And, why is ppc_float128_hw needed? That combination > does not seem to make sense. Basically it is there to make sure that we are actually generating IEEE 128-bit instructions. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
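For readers following the test discussion: the new test expects __builtin_fminf128/__builtin_fmaxf128 under -ffast-math to become a compare-and-select rather than a library call. A minimal sketch of what the patch description calls the "C minimum" and "C maximum" might look like the following. Note the assumptions: `double` stands in for `_Float128` so the snippet builds anywhere, and `c_min`/`c_max` are made-up names, not anything in GCC.

```cpp
// Sketch only: `double` stands in for _Float128, and c_min/c_max are
// hypothetical names used for illustration.

// "C minimum": exactly the conditional expression a < b ? a : b.
// If either operand is a NaN the comparison is false and b is returned,
// which is why this form is only a valid replacement for fmin under
// fast-math style assumptions.
double c_min(double a, double b) { return a < b ? a : b; }

// "C maximum": exactly the conditional expression a > b ? a : b.
double c_max(double a, double b) { return a > b ? a : b; }
```

With ordinary (non-NaN) inputs both forms agree with fmin/fmax, which is the case the test relies on.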
Re: [PATCH] rs6000: Remove unspecs for vec_mrghl[bhw]
On Mon, May 24, 2021 at 04:02:13AM -0500, Xionghu Luo wrote: > vmrghb only accepts permute index {0, 16, 1, 17, 2, 18, 3, 19, 4, 20, > 5, 21, 6, 22, 7, 23} no matter for BE or LE in ISA, similarly for vmrghlb. (vmrglb) > + if (BYTES_BIG_ENDIAN) > +emit_insn ( > + gen_altivec_vmrghb_direct (operands[0], operands[1], operands[2])); > + else > +emit_insn ( > + gen_altivec_vmrglb_direct (operands[0], operands[2], operands[1])); Please don't indent like that, it doesn't match what we do elsewhere. For better or for worse (for worse imo), we use deep hanging indents. If you have to, you can do something like rtx insn; if (BYTES_BIG_ENDIAN) insn = gen_altivec_vmrghb_direct (operands[0], operands[1], operands[2]); else insn = gen_altivec_vmrglb_direct (operands[0], operands[2], operands[1]); emit_insn (insn); (this is better even, in that it has only one emit_insn), or even rtx (*fun) () = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct : gen_altivec_vmrglb_direct; if (!BYTES_BIG_ENDIAN) std::swap (operands[1], operands[2]); emit_insn (fun (operands[0], operands[1], operands[2])); Well, C++ does not allow that last example like that, sigh, so rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct : gen_altivec_vmrglb_direct; This is shorter than the other two options ;-) > +(define_insn "altivec_vmrghb_direct" >[(set (match_operand:V16QI 0 "register_operand" "=v") > +(vec_select:V16QI This should be indented one space more. >"TARGET_ALTIVEC" >"@ > - xxmrghw %x0,%x1,%x2 > - vmrghw %0,%1,%2" > + xxmrghw %x0,%x1,%x2 > + vmrghw %0,%1,%2" The original indent was correct, please restore. > - emit_insn (gen_altivec_vmrghw_direct (operands[0], ve, vo)); > + emit_insn (gen_altivec_vmrghw_direct_v4si (operands[0], ve, vo)); When you see a mode as part of a pattern name, chances are that it will be a good candidate for using parameterized names with. (But don't do that now, just keep it in mind as a nice cleanup to do). 
> @@ -23022,8 +23022,8 @@ altivec_expand_vec_perm_const (rtx target, rtx op0, > rtx op1, > : CODE_FOR_altivec_vmrglh_direct), >{ 0, 1, 16, 17, 2, 3, 18, 19, 4, 5, 20, 21, 6, 7, 22, 23 } }, > { OPTION_MASK_ALTIVEC, > - (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw_direct > - : CODE_FOR_altivec_vmrglw_direct), > + (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw_direct_v4si > + : CODE_FOR_altivec_vmrglw_direct_v4si), The correct way is to align the ? and the : (or put everything on one line of course, if that fits) The parens around this are not needed btw, and are a distraction. > --- a/gcc/testsuite/gcc.target/powerpc/builtins-1.c > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.c > @@ -317,10 +317,10 @@ int main () > /* { dg-final { scan-assembler-times "vctuxs" 2 } } */ > > /* { dg-final { scan-assembler-times "vmrghb" 4 { target be } } } */ > -/* { dg-final { scan-assembler-times "vmrghb" 5 { target le } } } */ > +/* { dg-final { scan-assembler-times "vmrghb" 6 { target le } } } */ > /* { dg-final { scan-assembler-times "vmrghh" 8 } } */ > -/* { dg-final { scan-assembler-times "xxmrghw" 8 } } */ > -/* { dg-final { scan-assembler-times "xxmrglw" 8 } } */ > +/* { dg-final { scan-assembler-times "xxmrghw" 4 } } */ > +/* { dg-final { scan-assembler-times "xxmrglw" 4 } } */ > /* { dg-final { scan-assembler-times "vmrglh" 8 } } */ > /* { dg-final { scan-assembler-times "xxlnor" 6 } } */ > /* { dg-final { scan-assembler-times {\mvpkudus\M} 1 } } */ > @@ -347,7 +347,7 @@ int main () > /* { dg-final { scan-assembler-times "vspltb" 6 } } */ > /* { dg-final { scan-assembler-times "vspltw" 0 } } */ > /* { dg-final { scan-assembler-times "vmrgow" 8 } } */ > -/* { dg-final { scan-assembler-times "vmrglb" 5 { target le } } } */ > +/* { dg-final { scan-assembler-times "vmrglb" 4 { target le } } } */ > /* { dg-final { scan-assembler-times "vmrglb" 6 { target be } } } */ > /* { dg-final { scan-assembler-times "vmrgew" 8 } } */ > /* { dg-final { scan-assembler-times "vsplth" 8 } } */ 
Are those changes correct? It looks like a vmrglb became a vmrghb, and that 4 each of xxmrghw and xxmrglw disappeared? Both seem wrong? Segher
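To make the permute indices and the operand swap in this review easier to follow, here is a rough scalar model. It is a sketch under assumptions, not the rs6000 code: `merge_high_bytes`, `merge_low_bytes` and `merge_for_target` are hypothetical names, bytes are numbered in big-endian element order, and the merge-low index list is my assumption by analogy with the merge-high list quoted above. The selection mirrors the typed function-pointer form suggested in the review.

```cpp
#include <array>
#include <cstdint>
#include <utility>

using Bytes16 = std::array<std::uint8_t, 16>;

// Pick bytes from the 32-byte concatenation a:b; indices 0-15 select
// from a, indices 16-31 select from b.
static Bytes16 permute(const Bytes16& a, const Bytes16& b, const int (&idx)[16]) {
    Bytes16 out{};
    for (int i = 0; i < 16; ++i)
        out[i] = idx[i] < 16 ? a[idx[i]] : b[idx[i] - 16];
    return out;
}

// Model of a byte merge-high using the indices quoted in the thread.
Bytes16 merge_high_bytes(const Bytes16& a, const Bytes16& b) {
    static const int idx[16] = {0, 16, 1, 17, 2, 18, 3, 19,
                                4, 20, 5, 21, 6, 22, 7, 23};
    return permute(a, b, idx);
}

// Model of a byte merge-low (index list assumed, see lead-in).
Bytes16 merge_low_bytes(const Bytes16& a, const Bytes16& b) {
    static const int idx[16] = {8, 24, 9, 25, 10, 26, 11, 27,
                                12, 28, 13, 29, 14, 30, 15, 31};
    return permute(a, b, idx);
}

// One call site, with the function chosen up front and the operands
// swapped for little endian, as in the reviewed expander.
Bytes16 merge_for_target(bool big_endian, Bytes16 a, Bytes16 b) {
    using MergeFn = Bytes16 (*)(const Bytes16&, const Bytes16&);
    MergeFn fn = big_endian ? merge_high_bytes : merge_low_bytes;
    if (!big_endian)
        std::swap(a, b);
    return fn(a, b);
}
```

Keeping a single call through `fn` is the same design point as keeping a single emit_insn in the expander: the endianness decision is made once and the emission is not duplicated.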
[PATCH] Always enable DT_INIT_ARRAY/DT_FINI_ARRAY on Linux
DT_INIT_ARRAY/DT_FINI_ARRAY support was added to glibc by commit fcf70d4114db9ff7923f5dfeb3fea6e2d623e5c2 Author: Ulrich Drepper Date: Sat Jul 24 19:45:13 1999 + Update. 1999-07-24 Ulrich Drepper * elf/dl-fini.c: Handle DT_FINI_ARRAY. * elf/link.h (struct link_map): Remove l_init_running. Add l_runcount and l_initcount. * elf/dl-init.c: Handle DT_INIT_ARRAY. ... PR target/100896 * config.gcc (gcc_cv_initfini_array): Set to yes for Linux and GNU targets. --- gcc/config.gcc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/config.gcc b/gcc/config.gcc index 6833a6c13d9..4dc4fe0b65c 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -848,6 +848,8 @@ case ${target} in tmake_file="${tmake_file} t-glibc" target_has_targetcm=yes target_has_targetdm=yes + # Linux targets always support .init_array. + gcc_cv_initfini_array=yes ;; *-*-netbsd*) tm_p_file="${tm_p_file} netbsd-protos.h" -- 2.31.1
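As a small illustration of what this setting affects: with .init_array support enabled, GCC can register startup functions through the .init_array section (processed via DT_INIT_ARRAY) instead of the older .ctors mechanism. The sketch below uses the GCC/Clang `constructor` attribute; `constructor_runs` and `count_startup` are hypothetical names chosen for the example, and which section the function pointer actually lands in depends on how the compiler was configured.

```cpp
// Sketch only: names are hypothetical. The variable is zero-initialized
// statically, so it is valid before any startup function runs.
int constructor_runs = 0;

// GCC/Clang extension: ask for this function to run before main().
// On a toolchain configured with .init_array support, a pointer to it
// is expected to be placed in the .init_array section.
__attribute__((constructor)) static void count_startup() {
    ++constructor_runs;
}
```

Something like `readelf -S` on the resulting object should show whether .init_array or .ctors was used, though I have not included output here.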
[Bug libstdc++/100889] Wrong param type for std::atomic_ref<_Tp*>::wait
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889

Thomas Rodgers changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #4 from Thomas Rodgers ---
Fixed in master, backported to releases/gcc-11
[Bug tree-optimization/25290] PHI-OPT could be rewritten so that is uses match
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25290

--- Comment #22 from Andrew Pinski ---
Without load/store handling, here are the optimizations that either can move to match.pd already or need some extra work to do it:
* value_replacement: need to handle !single_non_singleton_phi_for_edges case and more than one feeder statement (2 max according to the current definition)
* cond_removal_in_popcount_clz_ctz_pattern: need 2 feeder statements and builtin call handling for feeder statements
* two_value_replacement: recorded as PR 100958, it can move already
* abs_replacement: needs PROP_gimple_lswitch so we don't change if statements early enough
** I think the majority of the abs handling is already in match.pd.
* minmax_replacement: has some handling of comparisons which might not be in the match.pd patterns already. Needs PROP_gimple_lswitch also.
** The handling of: if (a <= u) b = MAX (a, d); x = PHI needs to be moved too.

For the ones which cannot move:
* factor_out_conditional_conversion: will never move, though it needs improvement and moved already (PR 56223 and PR 13563)
* spaceship_replacement: cannot move to match.pd; it depends on use afterwards which is not hard to deal with in a match pattern.
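For anyone not familiar with the pass, the source shapes that abs_replacement and minmax_replacement are concerned with look roughly like the following. This is only a sketch of the kind of input involved, with made-up function names; it says nothing about how the pass or match.pd actually matches them.

```cpp
// Sketch only: function names are hypothetical.

// A conditional negate: the diamond that can become an absolute value
// (ABS_EXPR) once the PHI is replaced.
int abs_branchy(int a) {
    if (a < 0)
        a = -a;
    return a;
}

// A conditional assignment: the diamond that can become a maximum
// (MAX_EXPR) once the PHI is replaced.
int max_branchy(int a, int b) {
    int x = b;
    if (a > b)
        x = a;
    return x;
}
```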
Re: [PATCH] [libstdc++] Remove unused hasher instance.
Tested x86_64-pc-linux-gnu, committed to master, backported to releases/gcc-11. On Fri, Jun 4, 2021 at 1:30 PM Jonathan Wakely wrote: > > > On Fri, 4 Jun 2021 at 20:54, Thomas Rodgers wrote: > >> This is a remnant of poorly executed refactoring. >> > > OK for trunk and gcc-11, thanks. > > > >> libstdc++-v3/ChangeLog: >> >> * include/std/barrier (__tree_barrier::_M_arrive): Remove >> unnecessary hasher instantiation. >> --- >> libstdc++-v3/include/std/barrier | 1 - >> 1 file changed, 1 deletion(-) >> >> diff --git a/libstdc++-v3/include/std/barrier >> b/libstdc++-v3/include/std/barrier >> index fd61fb4f9da..4210e30d1ce 100644 >> --- a/libstdc++-v3/include/std/barrier >> +++ b/libstdc++-v3/include/std/barrier >> @@ -103,7 +103,6 @@ It looks different from literature pseudocode for two >> main reasons: >>static_cast<__barrier_phase_t>(__old_phase_val >> + 2); >> >> size_t __current_expected = _M_expected; >> - std::hash __hasher; >> __current %= ((_M_expected + 1) >> 1); >> >> for (int __round = 0; ; ++__round) >> -- >> 2.26.2 >> >>
Re: [PATCH] libstdc++: Fix Wrong param type in :atomic_ref<_Tp*>::wait [PR100889]
Tested x86_64-pc-linux-gnu, committed to master, backported to releases/gcc-11. On Tue, Jun 8, 2021 at 8:44 AM Jonathan Wakely wrote: > On Tue, 8 Jun 2021 at 01:29, Thomas Rodgers wrote: > >> This time without the repeated [PR] in the subject line. >> >> Fixes libstdc++/100889 >> > > This should be part of the ChangeLog entry instead, preceded by PR so it > updates bugzilla, i.e. > > > >> libstdc++-v3/ChangeLog: >> > > PR libstdc++/100889 > > >> * include/bits/atomic_base.h (atomic_ref<_Tp*>::wait): >> Change parameter type from _Tp to _Tp*. >> * testsuite/29_atomics/atomic_ref/wait_notify.cc: Extend >> coverage of types tested. >> > > > OK for trunk and gcc-11 with that change, thanks.
[Bug libstdc++/100889] Wrong param type for std::atomic_ref<_Tp*>::wait
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889 --- Comment #3 from CVS Commits --- The releases/gcc-11 branch has been updated by Thomas Rodgers : https://gcc.gnu.org/g:d7462945387b33744f665d1aa33ba1cec79c03b0 commit r11-8528-gd7462945387b33744f665d1aa33ba1cec79c03b0 Author: Thomas Rodgers Date: Tue Jun 8 15:51:53 2021 -0700 libstdc++: Fix Wrong param type in :atomic_ref<_Tp*>::wait [PR100889] libstdc++-v3/ChangeLog: PR libstdc++/100889 * include/bits/atomic_base.h (atomic_ref<_Tp*>::wait): Change parameter type from _Tp to _Tp*. * testsuite/29_atomics/atomic_ref/wait_notify.cc: Extend coverage of types tested. (cherry picked from commit 25e5ecdf82b49977e86bfaded236fb34af2705ed)
[Bug libstdc++/100889] Wrong param type for std::atomic_ref<_Tp*>::wait
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889 --- Comment #2 from CVS Commits --- The master branch has been updated by Thomas Rodgers : https://gcc.gnu.org/g:25e5ecdf82b49977e86bfaded236fb34af2705ed commit r12-1312-g25e5ecdf82b49977e86bfaded236fb34af2705ed Author: Thomas Rodgers Date: Tue Jun 8 15:51:53 2021 -0700 libstdc++: Fix Wrong param type in :atomic_ref<_Tp*>::wait [PR100889] libstdc++-v3/ChangeLog: PR libstdc++/100889 * include/bits/atomic_base.h (atomic_ref<_Tp*>::wait): Change parameter type from _Tp to _Tp*. * testsuite/29_atomics/atomic_ref/wait_notify.cc: Extend coverage of types tested.
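A short usage sketch of what the corrected signature permits: for the pointer specialization, wait() should take the old pointer value (_Tp*), not a value of the pointee type. `wait_with_pointer_argument` is a hypothetical helper written for this example. The feature-test guard is only there so the snippet still builds before C++20; it is not part of the fix.

```cpp
#include <atomic>

// Sketch only: hypothetical helper. Returns true once wait() has
// come back and the referenced pointer holds the new value.
bool wait_with_pointer_argument() {
#if defined(__cpp_lib_atomic_ref) && defined(__cpp_lib_atomic_wait)
    int a = 0, b = 0;
    int* p = &a;
    std::atomic_ref<int*> ref(p);
    ref.store(&b);
    // The argument is the old pointer value. The stored value already
    // differs from &a, so wait() returns without blocking.
    ref.wait(&a);
    return p == &b;
#else
    return true;  // std::atomic_ref and wait() need C++20 library support.
#endif
}
```

With the parameter declared as _Tp (here `int`), the `ref.wait(&a)` call would not compile, which matches the bug title.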
[Bug tree-optimization/25290] PHI-OPT could be rewritten so that is uses match
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25290 --- Comment #21 from Andrew Pinski --- Note this is not fully fixed, there is still some more work to do to deal with the non single_non_singleton_phi_for_edges case which will allow to get rid of value_replacement. Note to get rid of early_p check and abs_replacement, we need to add PROP_gimple_lswitch to say we have lowered switches already.
[Bug c++/100979] New: Nested CTAD fails when the outer object is direct initialized and the inner object is list initialized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100979

            Bug ID: 100979
           Summary: Nested CTAD fails when the outer object is direct initialized and the inner object is list initialized
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: brycelelbach at gmail dot com
  Target Milestone: ---

template <typename T>
struct X
{
  X(T t) {}
};

int main()
{
  auto t00 = X(1);
  auto t01 = X{1};
  X t02{1};
  X t03(1);

  auto t04 = X(X{1});
  auto t05 = X{X{1}};
  auto t06 = X(X(1));
  auto t07 = X{X(1)};
  X t08(X{1}); // GCC 11.x and up rejects this; MSVC and Clang accept it.
  X t09{X{1}};
  X t10(X(1));
  X t11{X(1)};
}

https://godbolt.org/z/Pbx6cjE7q
[Bug c++/100065] Conditional explicit doesn't work for deduction guide
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100065 --- Comment #3 from Marek Polacek --- Fixed on trunk so far, will backport.
[Bug c++/100065] Conditional explicit doesn't work for deduction guide
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100065 --- Comment #2 from CVS Commits --- The master branch has been updated by Marek Polacek : https://gcc.gnu.org/g:1afa4facb9348cac0349ff9c30066aa25a3608f7 commit r12-1310-g1afa4facb9348cac0349ff9c30066aa25a3608f7 Author: Marek Polacek Date: Mon Jun 7 16:06:00 2021 -0400 c++: explicit() ignored on deduction guide [PR100065] When we have explicit() with a value-dependent argument, we can't evaluate it at parsing time, so cp_parser_function_specifier_opt stashes the argument into the decl-specifiers and grokdeclarator then stores it into explicit_specifier_map, which is then used when substituting the function decl. grokdeclarator stores it for constructors and conversion functions, but we also need to do it for deduction guides, otherwise we'll forget that we've seen an explicit-specifier as in the attached test. PR c++/100065 gcc/cp/ChangeLog: * decl.c (grokdeclarator): Store a value-dependent explicit-specifier even for deduction guides. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/explicit18.C: New test.
[Bug tree-optimization/25290] PHI-OPT could be rewritten so that is uses match
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25290 --- Comment #20 from CVS Commits --- The master branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:c4574d23cb07340918793a5a98ae7bb2988b3791 commit r12-1309-gc4574d23cb07340918793a5a98ae7bb2988b3791 Author: Andrew Pinski Date: Tue Jun 1 06:48:05 2021 + Improve match_simplify_replacement in phi-opt This improves match_simplify_replace in phi-opt to handle the case where there is one cheap (non-call) preparation statement in the middle basic block similar to xor_replacement and others. This allows to remove xor_replacement which it does too. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. Thanks, Andrew Pinski Changes since v1: v3 - Just minor changes to using gimple_assign_lhs instead of gimple_lhs and fixing a comment. v2 - change the check on the preparation statement to allow only assignments and no calls and only assignments that feed into the phi. gcc/ChangeLog: PR tree-optimization/25290 * tree-ssa-phiopt.c (xor_replacement): Delete. (tree_ssa_phiopt_worker): Delete use of xor_replacement. (match_simplify_replacement): Allow one cheap preparation statement that can be moved to before the if. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr96928-1.c: Fix testcase for now that ~ happens on the outside of the bit_xor.
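To illustrate the case this commit describes (a middle basic block holding exactly one cheap, non-call statement that feeds the PHI), here is a hedged source-level sketch. The functions of pr96928-1.c are not shown in the commit message, so the example below is my own; it is only modelled on the scan patterns in that test, which look for a right shift of `a` and an xor. The branchless form assumes an arithmetic right shift of negative values, which GCC provides and C++20 guarantees.

```cpp
#include <climits>

// Sketch only: function names are hypothetical.

// The middle block of this diamond contains a single cheap statement
// (the bitwise not), which is the shape the commit now allows.
int select_branchy(int a, int b) {
    int x = b;
    if (a < 0)
        x = ~b;  // the one preparation statement feeding the PHI
    return x;
}

// A branchless equivalent: the shift yields -1 for negative a and 0
// otherwise, and xor with -1 is a bitwise not.
int select_branchless(int a, int b) {
    return (a >> (sizeof(int) * CHAR_BIT - 1)) ^ b;
}
```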
[committed] Improve match_simplify_replacement in phi-opt
From: Andrew Pinski This improves match_simplify_replace in phi-opt to handle the case where there is one cheap (non-call) preparation statement in the middle basic block similar to xor_replacement and others. This allows to remove xor_replacement which it does too. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. Committed as pre-approved. Thanks, Andrew Pinski Changes since v1: v3 - Just minor changes to using gimple_assign_lhs instead of gimple_lhs and fixing a comment. v2 - change the check on the preparation statement to allow only assignments and no calls and only assignments that feed into the phi. gcc/ChangeLog: PR tree-optimization/25290 * tree-ssa-phiopt.c (xor_replacement): Delete. (tree_ssa_phiopt_worker): Delete use of xor_replacement. (match_simplify_replacement): Allow one cheap preparation statement that can be moved to before the if. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr96928-1.c: Fix testcase for now that ~ happens on the outside of the bit_xor. 
--- gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c | 4 +- gcc/tree-ssa-phiopt.c | 164 +++--- 2 files changed, 54 insertions(+), 114 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c index a2770e5e896..2e86620da11 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c @@ -1,9 +1,9 @@ /* PR tree-optimization/96928 */ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-phiopt2" } */ +/* { dg-options "-O2 -fdump-tree-phiopt2 -fdump-tree-optimized" } */ /* { dg-final { scan-tree-dump-times " = a_\[0-9]*\\\(D\\\) >> " 5 "phiopt2" } } */ /* { dg-final { scan-tree-dump-times " = ~c_\[0-9]*\\\(D\\\);" 1 "phiopt2" } } */ -/* { dg-final { scan-tree-dump-times " = ~" 1 "phiopt2" } } */ +/* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times " = \[abc_0-9\\\(\\\)D]* \\\^ " 5 "phiopt2" } } */ /* { dg-final { scan-tree-dump-not "a < 0" "phiopt2" } } */ diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c index 969b868397e..76f4e7ec843 100644 --- a/gcc/tree-ssa-phiopt.c +++ b/gcc/tree-ssa-phiopt.c @@ -28,6 +28,7 @@ along with GCC; see the file COPYING3. 
If not see #include "cfghooks.h" #include "tree-pass.h" #include "ssa.h" +#include "tree-ssa.h" #include "optabs-tree.h" #include "insn-config.h" #include "gimple-pretty-print.h" @@ -63,8 +64,6 @@ static bool minmax_replacement (basic_block, basic_block, edge, edge, gphi *, tree, tree); static bool abs_replacement (basic_block, basic_block, edge, edge, gphi *, tree, tree); -static bool xor_replacement (basic_block, basic_block, -edge, edge, gphi *, tree, tree); static bool spaceship_replacement (basic_block, basic_block, edge, edge, gphi *, tree, tree); static bool cond_removal_in_popcount_clz_ctz_pattern (basic_block, basic_block, @@ -352,9 +351,6 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p) cfgchanged = true; else if (abs_replacement (bb, bb1, e1, e2, phi, arg0, arg1)) cfgchanged = true; - else if (!early_p - && xor_replacement (bb, bb1, e1, e2, phi, arg0, arg1)) - cfgchanged = true; else if (!early_p && cond_removal_in_popcount_clz_ctz_pattern (bb, bb1, e1, e2, phi, arg0, @@ -801,14 +797,51 @@ match_simplify_replacement (basic_block cond_bb, basic_block middle_bb, edge true_edge, false_edge; gimple_seq seq = NULL; tree result; - - if (!empty_block_p (middle_bb)) -return false; + gimple *stmt_to_move = NULL; /* Special case A ? B : B as this will always simplify to B. */ if (operand_equal_for_phi_arg_p (arg0, arg1)) return false; + /* If the basic block only has a cheap preparation statement, + allow it and move it once the transformation is done. */ + if (!empty_block_p (middle_bb)) +{ + stmt_to_move = last_and_only_stmt (middle_bb); + if (!stmt_to_move) + return false; + + if (gimple_vuse (stmt_to_move)) + return false; + + if (gimple_could_trap_p (stmt_to_move) + || gimple_has_side_effects (stmt_to_move)) + return false; + + if (gimple_uses_undefined_value_p (stmt_to_move)) + return false; + + /* Allow assignments and not no calls. 
+As const calls don't match any of the above, yet they could +still have some side-effects - they could contain +gimple_could_trap_p statements, like floating point +exceptions or integer division by zero. See PR70586. +FIXME: perhaps gimple_has_side_effects or gimple_could_trap_p +should handle
[Bug middle-end/54400] recognize vector reductions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54400 --- Comment #8 from Marc Glisse --- (In reply to Richard Biener from comment #7) > (note avoiding hadd in the reduc pattern was intended). Indeed. Except with -Os, or if a processor with a fast hadd appears, vectorising this doesn't bring anything. It doesn't hurt either though.
[Bug rtl-optimization/80770] suboptimal code negating a 1-bit _Bool field
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80770

Andrew Pinski changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Severity|normal                        |enhancement
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski ---
Mine. Though if we lower, we will still need to optimize the following on the gimple level:
  _1 = BIT_FIELD_REF <_6, 1, 0>;
  _2 = ~_1;
  _8 = BIT_INSERT_EXPR <_6, _2, 0 (1 bits)>;
to
  _8 = _6 ^ 1;
Or in general:
  BIT_INSERT_EXPR <_6, bit_not (BIT_FIELD_REF <_6, bits, shift>), shift (bits)>
to
  _6 ^ shiftedmask(bits, shift);
And maybe add:
  BIT_INSERT_EXPR <_6, bit_op (BIT_FIELD_REF <_6, bits, shift>, B), shift (bits)>
to
  _6 bit_op (convert (convert:u B) << shift);
Where u is the unsigned type if B is not an unsigned type.
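The general identity in this comment can be checked with a small scalar model. This is a sketch with hypothetical helper names; it assumes a 32-bit word with 0 < bits and bits + shift <= 32, and it models the GIMPLE operations with plain masks and shifts rather than using anything from GCC.

```cpp
#include <cstdint>

// Mask of `bits` ones starting at bit position `shift`.
std::uint32_t shifted_mask(unsigned bits, unsigned shift) {
    std::uint32_t low = bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return low << shift;
}

// Extract the field, invert it, and insert it back: the three-step
// sequence from the comment.
std::uint32_t flip_by_extract_insert(std::uint32_t word, unsigned bits, unsigned shift) {
    std::uint32_t mask = shifted_mask(bits, shift);
    std::uint32_t field = (word & mask) >> shift;      // BIT_FIELD_REF
    std::uint32_t flipped = ~field & (mask >> shift);  // bit_not, kept to field width
    return (word & ~mask) | (flipped << shift);        // BIT_INSERT_EXPR
}

// The proposed replacement: a single xor with the shifted mask.
std::uint32_t flip_by_xor(std::uint32_t word, unsigned bits, unsigned shift) {
    return word ^ shifted_mask(bits, shift);
}
```

For the 1-bit _Bool case in the bug title, `bits` is 1 and `shift` is 0, so the mask is simply 1.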
[Bug c++/100879] [10/11/12 Regression] gcc is complaining of a signed compare when comparing enums of different types (same underlying type)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100879

Jason Merrill changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
   Last reconfirmed|                              |2021-06-08
           Assignee|unassigned at gcc dot gnu.org |jason at gcc dot gnu.org
             Status|UNCONFIRMED                   |ASSIGNED
                 CC|                              |jason at gcc dot gnu.org
     Ever confirmed|0                             |1
[PATCH 1/2] arm: Fix vcond_mask expander for MVE (PR target/100757)
The problem in this PR is that we call VPSEL with a mask of vector type instead of HImode. This happens because operand 3 in vcond_mask is the pre-computed vector comparison and has vector type. The fix is to transfer this value to VPR.P0 by comparing operand 3 with a vector of constant 1 of the same type as operand 3. The pr100757*.c testcases are derived from gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using different types and return values different from 0 and 1 to avoid commonalization with boolean masks. Reducing the number of iterations in pr100757-3.c from 32 to 8, we generate the code below: float a[32]; float fn1(int d) { int c = 4; for (int b = 0; b < 8; b++) if (a[b] != 2.0f) c = 5; return c; } fn1: ldr r3, .L4+80 vpush.64{d8, d9} vldrw.32q3, [r3]// q3=a[0..3] vldr.64 d8, .L4 // q4=(2.0,2.0,2.0,2.0) vldr.64 d9, .L4+8 addsr3, r3, #16 vcmp.f32eq, q3, q4 // cmp a[0..3] == (2.0,2.0,2.0,2.0) vldr.64 d2, .L4+16 // q1=(1,1,1,1) vldr.64 d3, .L4+24 vldrw.32q3, [r3]// q3=a[4..7] vldr.64 d4, .L4+32 // q2=(0,0,0,0) vldr.64 d5, .L4+40 vpsel q0, q1, q2// q0=select (a[0..3]) vcmp.f32eq, q3, q4 // cmp a[4..7] == (2.0,2.0,2.0,2.0) vldmsp!, {d8-d9} vpsel q2, q1, q2// q2=select (a[4..7]) vandq2, q0, q2 // q2=select (a[0..3]) && select (a[4..7]) vldr.64 d6, .L4+48 // q3=(4.0,4.0,4.0,4.0) vldr.64 d7, .L4+56 vldr.64 d0, .L4+64 // q0=(5.0,5.0,5.0,5.0) vldr.64 d1, .L4+72 vcmp.i32 eq, q2, q1// cmp mask(a[0..7]) == (1,1,1,1) vpsel q3, q3, q0// q3= vcond_mask(4.0,5.0) vmov.32 r3, q3[0] // keep the scalar max vmov.32 r1, q3[1] vmov.32 r0, q3[3] vmov.32 r2, q3[2] vmovs14, r1 vmovs15, r3 vmaxnm.f32 s15, s15, s14 vmovs14, r2 vmaxnm.f32 s15, s15, s14 vmovs14, r0 vmaxnm.f32 s15, s15, s14 vmovr0, s15 bx lr .L5: .align 3 .L4: .word 1073741824 .word 1073741824 .word 1073741824 .word 1073741824 .word 1 .word 1 .word 1 .word 1 .word 0 .word 0 .word 0 .word 0 .word 1082130432 .word 1082130432 .word 1082130432 .word 1082130432 .word 1084227584 .word 1084227584 .word 
1084227584 .word 1084227584 2021-06-09 Christophe Lyon PR target/100757 gcc/ * config/arm/vec-common.md (vcond_mask_): Fix expansion for MVE. gcc/testsuite/ * gcc.target/arm/simd/pr100757.c: New test. * gcc.target/arm/simd/pr100757-2.c: New test. * gcc.target/arm/simd/pr100757-3.c: New test. --- gcc/config/arm/vec-common.md | 24 +-- .../gcc.target/arm/simd/pr100757-2.c | 20 .../gcc.target/arm/simd/pr100757-3.c | 20 gcc/testsuite/gcc.target/arm/simd/pr100757.c | 19 +++ 4 files changed, 81 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md index 0ffc7a9322c..ccdfaa8321f 100644 --- a/gcc/config/arm/vec-common.md +++ b/gcc/config/arm/vec-common.md @@ -478,8 +478,28 @@ (define_expand "vcond_mask_" } else if (TARGET_HAVE_MVE) { - emit_insn (gen_mve_vpselq (VPSELQ_S, mode, operands[0], - operands[1], operands[2], operands[3])); + /* Convert pre-computed vector comparison into VPR.P0 by comparing + operand 3 with a vector of '1', then use VPSEL. */ + machine_mode cmp_mode = GET_MODE (operands[3]); + rtx vpr_p0 = gen_reg_rtx (HImode); + rtx one = gen_reg_rtx (cmp_mode); + emit_move_insn (one, CONST1_RTX (cmp_mode)); + emit_insn (gen_mve_vcmpq (EQ, cmp_mode, vpr_p0, operands[3], one)); + + switch (GET_MODE_CLASS (mode)) +{ + case MODE_VECTOR_INT: +emit_insn (gen_mve_vpselq (VPSELQ_S, mode, operands[0], operands[1], operands[2], vpr_p0)); +break; + case MODE_VECTOR_FLOAT: + if (TARGET_HAVE_MVE_FLOAT) + emit_insn (gen_mve_vpselq_f (mode, operands[0], operands[1], operands[2], vpr_p0)); + else + gcc_unreachable (); +break; + default: +
[PATCH 2/2] arm: Fix arm_expand_vcond for MVE
This patch fixes a problem in arm_expand_vcond() where the result would be a vector of 0 or 1 instead of operand 1 or 2. The mve-vcmp-f32-2.c testcase is an update from mve-vcmp-f32.c using a conditional with 2.0f and 3.0f constants to help scan-assembler-times. 2021-06-09 Christophe Lyon gcc/ * config/arm/arm.c (arm_expand_vcond): Fix select operands. gcc/testsuite/ * gcc.target/arm/simd/mve-vcmp-f32-2.c: New test. --- gcc/config/arm/arm.c | 15 + .../gcc.target/arm/simd/mve-vcmp-f32-2.c | 32 +++ 2 files changed, 40 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 9377aaef342..35e22382650 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -31164,7 +31164,7 @@ arm_expand_vcond (rtx *operands, machine_mode cmp_result_mode) if (TARGET_HAVE_MVE) { - vcond_mve=true; + vcond_mve = true; mask = gen_reg_rtx (HImode); } else @@ -31181,18 +31181,19 @@ arm_expand_vcond (rtx *operands, machine_mode cmp_result_mode) { machine_mode cmp_mode = GET_MODE (operands[4]); rtx vpr_p0 = mask; - rtx zero = gen_reg_rtx (cmp_mode); - rtx one = gen_reg_rtx (cmp_mode); - emit_move_insn (zero, CONST0_RTX (cmp_mode)); - emit_move_insn (one, CONST1_RTX (cmp_mode)); + switch (GET_MODE_CLASS (cmp_mode)) { case MODE_VECTOR_INT: - emit_insn (gen_mve_vpselq (VPSELQ_S, cmp_result_mode, operands[0], one, zero, vpr_p0)); + emit_insn (gen_mve_vpselq (VPSELQ_S, cmp_result_mode, operands[0], +operands[1], operands[2], vpr_p0)); break; case MODE_VECTOR_FLOAT: if (TARGET_HAVE_MVE_FLOAT) - emit_insn (gen_mve_vpselq_f (cmp_mode, operands[0], one, zero, vpr_p0)); + emit_insn (gen_mve_vpselq_f (cmp_mode, operands[0], +operands[1], operands[2], vpr_p0)); + else + gcc_unreachable (); break; default: gcc_unreachable (); diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c new file mode 100644 index 000..917a95bf141 --- 
/dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c @@ -0,0 +1,32 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */ +/* { dg-add-options arm_v8_1m_mve_fp } */ +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */ + +#include + +#define NB 4 + +#define FUNC(OP, NAME) \ + void test_ ## NAME ##_f (float * __restrict__ dest, float *a, float *b) { \ +int i; \ +for (i=0; i, vcmpgt) +FUNC(>=, vcmpge) + +/* { dg-final { scan-assembler-times {\tvcmp.f32\teq, q[0-9]+, q[0-9]+\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tvcmp.f32\tne, q[0-9]+, q[0-9]+\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tvcmp.f32\tlt, q[0-9]+, q[0-9]+\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tvcmp.f32\tle, q[0-9]+, q[0-9]+\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tvcmp.f32\tgt, q[0-9]+, q[0-9]+\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tvcmp.f32\tge, q[0-9]+, q[0-9]+\n} 1 } } */ +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 24 } } */ /* Constant 2.0f. */ +/* { dg-final { scan-assembler-times {\t.word\t1077936128\n} 24 } } */ /* Constant 3.0f. */ -- 2.25.1
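For readers less familiar with MVE, the two arm patches in this series can be summarised with a scalar model. This is a sketch under assumptions, not the expander code: `mask_to_predicate` and `select_lanes` are hypothetical names, four lanes are assumed, and an array of bool stands in for VPR.P0. The first function models comparing the pre-computed vector mask with a vector of 1; the second models VPSEL picking between the real operands, which is what 2/2 restores instead of selecting between 1 and 0.

```cpp
#include <array>
#include <cstdint>

using Vec4 = std::array<float, 4>;
using Mask4 = std::array<std::int32_t, 4>;
using Pred4 = std::array<bool, 4>;

// Model of turning a vector mask into a predicate: a lane is active
// when the mask lane equals 1.
Pred4 mask_to_predicate(const Mask4& mask) {
    Pred4 pred{};
    for (int i = 0; i < 4; ++i)
        pred[i] = (mask[i] == 1);
    return pred;
}

// Model of a predicated select: take `if_true` in active lanes and
// `if_false` elsewhere.
Vec4 select_lanes(const Pred4& pred, const Vec4& if_true, const Vec4& if_false) {
    Vec4 out{};
    for (int i = 0; i < 4; ++i)
        out[i] = pred[i] ? if_true[i] : if_false[i];
    return out;
}
```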
[Bug target/57890] gcc 4.8.1 regression: loops become slower
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57890 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |7.0 Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED Known to fail||4.9.0, 6.1.0 Known to work||4.7.0, 7.1.0 --- Comment #6 from Andrew Pinski --- Fixed in GCC 7.0: f: movdqa xmm0, XMMWORD PTR .LC0[rip] mov DWORD PTR c[rip+96], 808464432 movaps XMMWORD PTR c[rip], xmm0 movaps XMMWORD PTR c[rip+16], xmm0 movaps XMMWORD PTR c[rip+32], xmm0 movaps XMMWORD PTR c[rip+48], xmm0 movaps XMMWORD PTR c[rip+64], xmm0 movaps XMMWORD PTR c[rip+80], xmm0 ret .LC0: .quad 3472328296227680304 .quad 3472328296227680304 Where GCC 4.7.0 had produced (which is just as ok): f: movdqa xmm0, XMMWORD PTR .LC0[rip] mov BYTE PTR c[rip+96], 48 mov BYTE PTR c[rip+97], 48 movdqa XMMWORD PTR c[rip], xmm0 mov BYTE PTR c[rip+98], 48 mov BYTE PTR c[rip+99], 48 movdqa XMMWORD PTR c[rip+16], xmm0 movdqa XMMWORD PTR c[rip+32], xmm0 movdqa XMMWORD PTR c[rip+48], xmm0 movdqa XMMWORD PTR c[rip+64], xmm0 movdqa XMMWORD PTR c[rip+80], xmm0 ret .LC0: .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48 .byte 48
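As a sanity check on why the two outputs are equivalent: the immediate and the .quad values in the GCC 7 code are just the byte 48 ('0') replicated across 4 and 8 bytes, which is what the 4.7.0 code stores one byte at a time. A small sketch, with hypothetical helper names `splat32` and `splat64`:

```cpp
#include <cstdint>

// Replicate one byte across a 32-bit word.
constexpr std::uint32_t splat32(std::uint8_t byte) {
    return byte * 0x01010101u;
}

// Replicate one byte across a 64-bit word.
constexpr std::uint64_t splat64(std::uint8_t byte) {
    return byte * 0x0101010101010101ull;
}
```

So the DWORD store at c[rip+96] covers the same four bytes as the four BYTE stores at offsets 96 to 99 in the older output.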
Re: [PATCH] c++: explicit() ignored on deduction guide [PR100065]
On 6/7/21 8:06 PM, Marek Polacek wrote: When we have explicit() with a value-dependent argument, we can't evaluate it at parsing time, so cp_parser_function_specifier_opt stashes the argument into the decl-specifiers and grokdeclarator then stores it into explicit_specifier_map, which is then used when substituting the function decl. grokdeclarator stores it for constructors and conversion functions, but we also need to do it for deduction guides, otherwise we'll forget that we've seen an explicit-specifier as in the attached test. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/branches? OK. PR c++/100065 gcc/cp/ChangeLog: * decl.c (grokdeclarator): Store a value-dependent explicit-specifier even for deduction guides. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/explicit18.C: New test. --- gcc/cp/decl.c | 2 ++ gcc/testsuite/g++.dg/cpp2a/explicit18.C | 23 +++ 2 files changed, 25 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit18.C diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index a3687dbb0dd..cbf647dd569 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -14043,6 +14043,8 @@ grokdeclarator (const cp_declarator *declarator, storage_class = sc_none; } } + if (declspecs->explicit_specifier) + store_explicit_specifier (decl, declspecs->explicit_specifier); } else { diff --git a/gcc/testsuite/g++.dg/cpp2a/explicit18.C b/gcc/testsuite/g++.dg/cpp2a/explicit18.C new file mode 100644 index 000..c8916fa4743 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/explicit18.C @@ -0,0 +1,23 @@ +// PR c++/100065 +// { dg-do compile { target c++20 } } + +template +struct bool_constant { + static constexpr bool value = B; + constexpr operator bool() const { return value; } +}; + +using true_type = bool_constant; +using false_type = bool_constant; + +template +struct X { +template +X(T); +}; + +template +explicit(b) X(bool_constant) -> X; + +X false_ = false_type{}; // OK +X true_ = true_type{}; // { dg-error "explicit deduction guide" } base-commit: 
e89759fdfc80db223bd852aba937acb2d7c2cd80
[Bug tree-optimization/49203] missed-optimization: useless expressions not moved out of loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49203 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED Known to work||7.0 Target Milestone|--- |7.0 Known to fail||4.8.5 --- Comment #3 from Andrew Pinski --- Fixed in at least in GCC 7.0: .L2: leaq16(%r8), %rsi movq%r8, %rdx xorl%eax, %eax .p2align 4,,10 .p2align 3 .L3: movzbl (%rdx), %ecx sall$2, %eax addq$1, %rdx andl$3, %ecx orl %ecx, %eax cmpq%rsi, %rdx jne .L3 movl%eax, %edx addq$4, %r8 movb%ah, 2(%rdi) shrl$24, %edx movb%al, 3(%rdi) addq$4, %rdi movb%dl, -4(%rdi) movl%eax, %edx shrl$16, %edx movb%dl, -3(%rdi) cmpq%r8, %r9 jne .L2 [94.12%]: # tmp_37 = PHI # ivtmp.17_17 = PHI _1 = tmp_37 << 2; _87 = (void *) ivtmp.17_17; _3 = MEM[base: _87, offset: 0B]; _20 = _3 & 3; _4 = (unsigned int) _20; tmp_23 = _1 | _4; ivtmp.17_15 = ivtmp.17_17 + 1; if (ivtmp.17_15 != _83) goto ; [93.75%] else goto ; [6.25%] [5.88%]: _5 = tmp_23 >> 24; _6 = (unsigned char) _5; _76 = (void *) ivtmp.27_82; MEM[base: _76, offset: 0B] = _6; _7 = tmp_23 >> 16; _9 = (unsigned char) _7; MEM[base: _76, offset: 1B] = _9; _10 = tmp_23 >> 8; _12 = (unsigned char) _10; MEM[base: _76, offset: 2B] = _12; _14 = (unsigned char) tmp_23; MEM[base: _76, offset: 3B] = _14; ivtmp.27_81 = ivtmp.27_82 + 4; ivtmp.28_78 = ivtmp.28_79 + 4; if (_71 != ivtmp.28_78) goto ; [87.51%] else goto ; [12.49%] [5.88%]: # ivtmp.27_82 = PHI # ivtmp.28_79 = PHI _83 = ivtmp.28_79 + 16; goto ; [100.00%]
[Bug c++/91706] [9/10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in equate_type_number_to_die, at dwarf2out.c:5782
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91706

Jason Merrill changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[9/10/11/12 Regression]     |[9/10 Regression] ICE: tree
                   |ICE: tree check: expected   |check: expected class
                   |class 'type', have          |'type', have 'exceptional'
                   |'exceptional' (error_mark)  |(error_mark) in
                   |in                          |equate_type_number_to_die,
                   |equate_type_number_to_die,  |at dwarf2out.c:5782
                   |at dwarf2out.c:5782         |

--- Comment #13 from Jason Merrill ---
Fixed for 11.2/12 so far. Is there interest in fixing this on the 9/10 branches?
[Bug c++/100752] [11/12 Regression] error: cannot call member function ‘void S::f()’ without object
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100752 --- Comment #4 from Jason Merrill --- As I mentioned on IRC, it seems like this may just be a matter of properly passing down flags/member_p in the recursive call to cp_parser_declarator.
Re: [PATCH] For obj-c stage-final re-use the checksum from the previous stage
On Tue, Jun 8, 2021 at 5:05 PM Bernd Edlinger wrote:
> On 6/8/21 3:54 PM, Jason Merrill wrote:
> >
> > This breaks bootstrap2.
> >
> > Jason
> >
>
> Sorry for the breakage,
>
> I've committed the following as obvious after
> confirming that it fixes bootstrap2:
>

Thanks.

Jason
[Bug c++/100752] [11/12 Regression] error: cannot call member function ‘void S::f()’ without object
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100752 --- Comment #3 from Marek Polacek --- Duh, we don't defer parsing of noexcept for any ptr-operator, like struct S { int& f() noexcept(noexcept(i)); int i; }; Awkward, but the fix should be simple.
Re: [PATCH v2] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]
Hi!

On Tue, Jun 08, 2021 at 09:11:33AM +0800, Xionghu Luo wrote:
> On P8LE, extra rot64+rot64 load or store instructions are generated
> in float128 to vector __int128 conversion.
>
> This patch teaches pass swaps to also handle such patterns to remove
> extra swap instructions.

> +/* Return 1 iff PAT is a rotate 64 bit expression; else return 0. */
> +
> +static bool
> +pattern_is_rotate64_p (rtx pat)

You already have a verb in the name, don't use _p please (and preferably just don't use it at all, "pattern_is_rotate64" is much better than "pattern_rotate64_p").

> +{
> + rtx rot = SET_SRC (pat);

So this is assuming PAT is a SINGLE_SET. Please say that in the function comment.

/* Return 1 iff PAT (a SINGLE_SET) is a rotate 64 bit expression; else return 0. */

You can do an assert for that as well, but I wouldn't bother.

> @@ -266,6 +280,9 @@ insn_is_load_p (rtx insn)

(I do realise you just copied existing naming, don't worry :-) )

> @@ -392,7 +411,8 @@ quad_aligned_load_p (swap_web_entry *insn_entry, rtx_insn
> *insn)
> false. */
>rtx body = PATTERN (def_insn);
>if (GET_CODE (body) != SET
> - || GET_CODE (SET_SRC (body)) != VEC_SELECT
> + || !(GET_CODE (SET_SRC (body)) == VEC_SELECT
> + || pattern_is_rotate64_p (body))

Broken indentation: the || should align with "pattern...".

> @@ -2223,9 +2246,9 @@ static void
> recombine_stvx_pattern (rtx_insn *insn, del_info *to_delete)
> {
>rtx body = PATTERN (insn);
> - gcc_assert (GET_CODE (body) == SET
> - && MEM_P (SET_DEST (body))
> - && GET_CODE (SET_SRC (body)) == VEC_SELECT);
> + gcc_assert (GET_CODE (body) == SET && MEM_P (SET_DEST (body))
> + && (GET_CODE (SET_SRC (body)) == VEC_SELECT
> + || pattern_is_rotate64_p (body)));

Please start a new line for every "&&" here. The way it was was more readable. It often is nice to keep things on one line, if it fits on one line. If it does not, make a new line for every phrase. This is more readable because you can then just scan down the line of "&&" and see the start of every phrase without actually having to read it all.

> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-call.c
> b/gcc/testsuite/gcc.target/powerpc/float128-call.c
> index 5895416e985..a1f09df8a57 100644
> --- a/gcc/testsuite/gcc.target/powerpc/float128-call.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-call.c
> @@ -21,5 +21,5 @@
> TYPE one (void) { return ONE; }
> void store (TYPE a, TYPE *p) { *p = a; }
>
> -/* { dg-final { scan-assembler "lxvd2x 34" } } */
> -/* { dg-final { scan-assembler "stxvd2x 34" } } */
> +/* { dg-final { scan-assembler "lvx 2" } } */
> +/* { dg-final { scan-assembler "stvx 2" } } */

Huh. Is that correct? Where did the other 32 loads and stores go? Are there now other insns generated that you should scan for?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100085.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_float128_sw_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */

If you use float128_ok you should use -mfloat128 (or this is very surprising and is worth an explanation itself :-) )

But, you do not need it, since you use -mcpu=power8 already (which implicitly sets this). So just remove that dg-require please.

> +/* { dg-final { scan-assembler-not "xxpermdi" } } */
> +/* { dg-final { scan-assembler-not "stxvd2x" } } */
> +/* { dg-final { scan-assembler-not "lxvd2x" } } */

It is a good habit to use \m and \M in the scans where you can (those are the same as \< and \> are in some other regexp dialects). They aren't absolutely necessary here of course.

Okay for trunk with those fixes. Thanks!

Segher
Re: [PATCH] For obj-c stage-final re-use the checksum from the previous stage
On 6/8/21 3:54 PM, Jason Merrill wrote: > > This breaks bootstrap2. > > Jason > Sorry for the breakage, I've committed the following as obvious after confirming that it fixes bootstrap2: Subject: [PATCH] Fix bootstrap2 breakage due to re-use of obj-c checksum gcc/objc: 2021-06-08 Bernd Edlinger * Make-lang.in (cc1-obj-checksum.c): Check previous stage checksum exists. gcc/objcp: 2021-06-08 Bernd Edlinger * Make-lang.in (cc1objplus-checksum.c): Check previous stage checksum exists. --- gcc/objc/Make-lang.in | 3 ++- gcc/objcp/Make-lang.in | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/gcc/objc/Make-lang.in b/gcc/objc/Make-lang.in index 9011140..25fbd4c 100644 --- a/gcc/objc/Make-lang.in +++ b/gcc/objc/Make-lang.in @@ -63,7 +63,8 @@ objc_OBJS = $(OBJC_OBJS) cc1obj-checksum.o cc1obj-checksum.c : build/genchecksum$(build_exeext) checksum-options \ $(OBJC_OBJS) $(C_AND_OBJC_OBJS) $(BACKEND) $(LIBDEPS) if [ -f ../stage_final ] \ - && cmp -s ../stage_current ../stage_final; then \ + && cmp -s ../stage_current ../stage_final \ + && [ -f ../prev-gcc/$@ ]; then \ cp ../prev-gcc/$@ $@; \ else \ build/genchecksum$(build_exeext) $(OBJC_OBJS) $(C_AND_OBJC_OBJS) \ diff --git a/gcc/objcp/Make-lang.in b/gcc/objcp/Make-lang.in index 3ecc50b..2e27be5 100644 --- a/gcc/objcp/Make-lang.in +++ b/gcc/objcp/Make-lang.in @@ -66,7 +66,8 @@ obj-c++_OBJS = $(OBJCXX_OBJS) cc1objplus-checksum.o cc1objplus-checksum.c : build/genchecksum$(build_exeext) checksum-options \ $(OBJCXX_OBJS) $(BACKEND) $(CODYLIB) $(LIBDEPS) if [ -f ../stage_final ] \ - && cmp -s ../stage_current ../stage_final; then \ + && cmp -s ../stage_current ../stage_final \ + && [ -f ../prev-gcc/$@ ]; then \ cp ../prev-gcc/$@ $@; \ else \ build/genchecksum$(build_exeext) $(OBJCXX_OBJS) $(BACKEND) \ -- 1.9.1 Thanks Bernd.
[Bug c++/100976] [C++23] Make constexpr reference temp constexpr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100976

--- Comment #2 from Jason Merrill ---
Or rather,

int main()
{
  constexpr const int &r = 42;
  static_assert(r == 42); // { dg-bogus "" }
}

[expr.const]/4.7 says that "a temporary object of non-volatile const-qualified literal type whose lifetime is extended to that of a variable that is usable in constant expressions" is usable in a constant expression.
Re: [PATCH 02/57] Support scanning of build-time GC roots in gengtype
On 6/7/21 12:48 PM, Bill Schmidt wrote: On 6/7/21 12:45 PM, Richard Biener wrote: On Mon, Jun 7, 2021 at 5:38 PM Bill Schmidt wrote: On 6/7/21 8:36 AM, Richard Biener wrote: Some maybe obvious issue - what about DOS-style path hosts? You seem to build ../ strings to point to parent dirs... I'm not sure what we do elsewhere - I suppose we arrange for appropriate -I command line arguments? Well, actually it's just using "./" to identify the build directory, though I see what you mean about potential Linux bias. There is precedent for this syntax identifying the build directory in config.gcc for target macro files: # tm_file A list of target macro files, if different from # "$cpu_type/$cpu_type.h". Usually it's constructed # per target in a way like this: # tm_file="${tm_file} dbxelf.h elfos.h ${cpu_type.h}/elf.h" # Note that the preferred order is: # - specific target header "${cpu_type}/${cpu_type.h}" # - generic headers like dbxelf.h elfos.h, etc. # - specializing target headers like ${cpu_type.h}/elf.h # This helps to keep OS specific stuff out of the CPU # defining header ${cpu_type}/${cpu_type.h}. # # It is possible to include automatically-generated # build-directory files by prefixing them with "./". # All other files should relative to $srcdir/config. ...so I thought I would try to be consistent with this change. In patch 0025 I use this as follows: --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -491,6 +491,7 @@ powerpc*-*-*) extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt" target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.c \$(srcdir)/config/rs6000/rs6000-call.c" target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.c" + target_gtfiles="$target_gtfiles ./rs6000-builtins.h" ;; pru-*-*) cpu_type=pru I'm open to trying to do something different if you think that's appropriate. Well, I'm not sure whether/how to resolve this. 
You could try building a cross to powerpc-linux from a x86_64-mingw host ... maybe there's one on the CF? Or some of your fellow RedHat people have access to mingw or the like envs to try whether it just works with your change ... Otherwise it looks OK. I'll see what I can find. Thanks again for reviewing the patch! Hm. Ultimately, I think the cross compiler case is doomed unless mingw already handles converting forward slashes to back slashes. There's no single syntax that works on both Windows and Linux. (There's no mingw server in the compile farm to play with.) I'm inclined to accept both "./" and ".\" for native builds, and kick the can down the road beyond that. What do you think? Bill Bill Richard. Thanks for your help with this! Bill
[Bug target/100973] gcc does not optimise based on knowing that `_mm256_movemask_ps` returns less than 255
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100973

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-linux-gnu
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2021-06-08
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski ---
This is a target/tree-optimization issue. Basically, the tree-level optimizers have no idea what the builtin does, and there is no target hook to query the back end for ranges:

_3 = __builtin_ia32_movmskps256D.2066 (values_2(D));
Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]
Hi! On Fri, Jun 04, 2021 at 09:40:58AM +0800, Xionghu Luo wrote: > >> Combine still fail to merge the two instructions: > >> > >> Trying 6 -> 7: > >> 6: r120:KF#0=r125:KF#0<-<0x40 > >>REG_DEAD r125:KF > >> 7: [sfp:DI+r123:DI]=r120:KF#0<-<0x40 > >>REG_DEAD r120:KF > >> Successfully matched this instruction: > >> (set (mem/c:V1TI (plus:DI (reg/f:DI 110 sfp) > >> (reg:DI 123)) [1 S16 A128]) > >> (subreg:V1TI (reg:KF 125) 0)) > >> rejecting combination of insns 6 and 7 > >> original costs 4 + 4 = 8 > >> replacement cost 12 > > > > So what instructions were these? Why did the store cost 4 but the new > > one costs 12? The *vsx_le_perm_store_ instruction has the *preferred* alternative with cost 12, while the other alternative has cost 8. Why is that? That looks like a bug. (set_attr "length" "12,8") > >> By hacking the vsx_le_perm_store_v1ti INSN_COST from 12 to 8, > > > > It should be the same cost as the other store! > > vsx_le_permute_v1ti's cost is defined to 4 in vsx.md: Yes. Why is alternative 0 of *vsx_le_perm_store_ said to have a length of 3 insns? Segher
[Bug testsuite/100407] New test cases attr-retain-*.c fail after their introduction in r11-7284
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100407

Segher Boessenkool changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #12 from Segher Boessenkool ---
(In reply to H.J. Lu from comment #10)
> > unused_rodata:
> >         .section        .sdata.used_rodata,"awR"

This is symbol *un*used_rodata.

> used_rodata is in a writable section. Is this intentional? -m64 generates

Does -mno-readonly-in-sdata help? Does -msdata=none help?
[Bug fortran/100950] ICE in output_constructor_regular_field, at varasm.c:5514
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100950

--- Comment #8 from anlauf at gcc dot gnu.org ---
Created attachment 50967
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50967&action=edit
Tentative fix

This patch would fix the testcase. It is inspired by code in primary.c, match_string_constant. Not regtested.
[Bug analyzer/99212] [11 Regression] gcc.dg/analyzer/data-model-1.c line 971
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99212

David Malcolm changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12 Regression]          |[11 Regression]
                   |gcc.dg/analyzer/data-model- |gcc.dg/analyzer/data-model-
                   |1.c line 971                |1.c line 971

--- Comment #16 from David Malcolm ---
Should be fixed on trunk (for gcc 12) by the above commit
[Bug libstdc++/100940] views::take and views::drop should not define _S_has_simple_extra_args
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100940

Patrick Palka changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org
             Status|NEW                           |ASSIGNED
[PATCH v2 2/2] rs6000: Add test for _mm_minpos_epu16
Copy the test for _mm_minpos_epu16 from gcc/testsuite/gcc.target/i386/sse4_1-phminposuw.c, with a few adjustments:

- Adjust the dejagnu directives for powerpc platform.
- Make the data not be monotonically increasing, such that some of the returned values are not always the first value (index 0).
- Create a list of input data testing various scenarios including more than one minimum value and different orders and indices of the minimum value.
- Fix a masking issue where the index was being truncated to 2 bits instead of 3 bits, which wasn't found because all of the returned indices were 0 with the original generated data.
- Support big-endian.

2021-06-08 Paul A. Clarke

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/sse4_1-phminposuw.c: Copy from gcc/testsuite/gcc.target/i386, make more robust.

--- .../gcc.target/powerpc/sse4_1-phminposuw.c| 68 +++ 1 file changed, 68 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c b/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c new file mode 100644 index ..3bb5a2dfe4f5 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c @@ -0,0 +1,68 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -mpower8-vector -Wno-psabi" } */ +/* { dg-require-effective-target p8vector_hw } */ + +#define NO_WARN_X86_INTRINSICS 1 +#ifndef CHECK_H +#define CHECK_H "sse4_1-check.h" +#endif + +#ifndef TEST +#define TEST sse4_1_test +#endif + +#include CHECK_H + +#include + +#define DIM(a) (sizeof (a) / sizeof ((a)[0])) + +static void +TEST (void) +{ + union +{ + __m128i x; + unsigned short s[8]; +} src[] = +{ + { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x } }, + { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x } }, + { .s = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x } }, + { .s = { 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008 } }, + { .s = { 0x0008, 0x0007, 0x0006, 0x0005, 0x0004, 0x0003, 0x0002, 0x0001 } }, + { .s = { 0xfff4, 0xfff3, 0xfff2, 
0xfff1, 0xfff3, 0xfff1, 0xfff2, 0xfff3 } } +}; + unsigned short minVal[DIM (src)]; + int minInd[DIM (src)]; + unsigned short minValScalar, minIndScalar; + int i, j; + union +{ + int i; + unsigned short s[2]; +} res; + + for (i = 0; i < DIM (src); i++) +{ + res.i = _mm_cvtsi128_si32 (_mm_minpos_epu16 (src[i].x)); + minVal[i] = res.s[0]; + minInd[i] = res.s[1] & 0b111; +} + + for (i = 0; i < DIM (src); i++) +{ + minValScalar = src[i].s[0]; + minIndScalar = 0; + + for (j = 1; j < 8; j++) + if (minValScalar > src[i].s[j]) + { + minValScalar = src[i].s[j]; + minIndScalar = j; + } + + if (minValScalar != minVal[i] && minIndScalar != minInd[i]) + abort (); +} +} -- 2.27.0
[PATCH v2 1/2] rs6000: Add support for _mm_minpos_epu16
Add a naive implementation of the subject x86 intrinsic to ease porting. 2021-06-08 Paul A. Clarke gcc/ChangeLog: * config/rs6000/smmintrin.h (_mm_minpos_epu16): New. --- gcc/config/rs6000/smmintrin.h | 25 + 1 file changed, 25 insertions(+) diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h index bdf6eb365d88..b7de38763f2b 100644 --- a/gcc/config/rs6000/smmintrin.h +++ b/gcc/config/rs6000/smmintrin.h @@ -116,4 +116,29 @@ _mm_blendv_epi8 (__m128i __A, __m128i __B, __m128i __mask) return (__m128i) vec_sel ((__v16qu) __A, (__v16qu) __B, __lmask); } +/* Return horizontal packed word minimum and its index in bits [15:0] + and bits [18:16] respectively. */ +extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +_mm_minpos_epu16 (__m128i __A) +{ + union __u +{ + __m128i __m; + __v8hu __uh; +}; + union __u __u = { .__m = __A }, __r = { .__m = {0} }; + unsigned short __ridx = 0; + unsigned short __rmin = __u.__uh[__ridx]; + for (unsigned long __i = __ridx + 1; __i < 8; __i++) +{ + if (__u.__uh[__i] < __rmin) +{ + __rmin = __u.__uh[__i]; + __ridx = __i; +} +} + __r.__uh[0] = __rmin; + __r.__uh[1] = __ridx; + return __r.__m; +} #endif -- 2.27.0
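For readers following along outside the GCC tree, the scan in the patch can be sketched in plain C. This is only an illustration under assumptions: `minpos_u16` is a hypothetical name that does not appear in the patch, and the packed 32-bit return value merely stands in for the low lanes of the `__m128i` result (minimum in bits [15:0], index in bits [18:16]).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: scan N 16-bit lanes and
   pack the minimum into bits [15:0] and its index into bits [18:16],
   following the layout described in the patch comment.  */
static uint32_t
minpos_u16 (const uint16_t *lanes, size_t n)
{
  assert (n >= 1 && n <= 8);   /* the index must fit in 3 bits */
  size_t idx = 0;
  uint16_t min = lanes[0];
  for (size_t i = 1; i < n; i++)
    if (lanes[i] < min)        /* strict '<' keeps the lowest index on ties */
      {
        min = lanes[i];
        idx = i;
      }
  return (uint32_t) min | ((uint32_t) idx << 16);
}
```

As in the patch, the comparison is strict, so when the minimum occurs more than once the first occurrence wins.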
[PATCH v2 0/2] rs6000: Add support for _mm_minpos_epu16
Added compatible implementation of _mm_minpos_epu16 for powerpc. Copied, improved, and fixed testcase from i386. Tested on BE, LE (32 and 64bit). Paul A. Clarke (2): rs6000: Add support for _mm_minpos_epu16 rs6000: Add test for _mm_minpos_epu16 gcc/config/rs6000/smmintrin.h | 25 +++ .../gcc.target/powerpc/sse4_1-phminposuw.c| 68 +++ 2 files changed, 93 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-phminposuw.c -- 2.27.0
[PATCH] PR middle-end/53267: Constant fold BUILT_IN_FMOD.
Here's a three line patch to implement constant folding for fmod, fmodf and fmodl, which resolves an enhancement request from 2012. The following patch has been tested on x86_64-pc-linux-gnu with a make bootstrap and make -k check with no new failures. Ok for mainline? 2020-06-08 Roger Sayle gcc/ChangeLog PR middle-end/53267 * fold-const-call.c (fold_const_call_sss) [CASE_CFN_FMOD]: Support evaluation of fmod/fmodf/fmodl at compile-time. gcc/testsuite/ChangeLog * gcc.dg/builtins-70.c: New test. Roger -- Roger Sayle NextMove Software Cambridge, UK diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c index a1d70b6..d6cb9b1 100644 --- a/gcc/fold-const-call.c +++ b/gcc/fold-const-call.c @@ -1375,6 +1375,9 @@ fold_const_call_sss (real_value *result, combined_fn fn, CASE_CFN_FDIM: return do_mpfr_arg2 (result, mpfr_dim, arg0, arg1, format); +CASE_CFN_FMOD: + return do_mpfr_arg2 (result, mpfr_fmod, arg0, arg1, format); + CASE_CFN_HYPOT: return do_mpfr_arg2 (result, mpfr_hypot, arg0, arg1, format); /* Copyright (C) 2021 Free Software Foundation. Check that constant folding of built-in fmod functions doesn't break anything and produces the expected results. 
/* { dg-do link } */ /* { dg-options "-O2 -ffast-math" } */ extern void link_error(void); extern double fmod(double,double); extern float fmodf(float,float); extern long double fmodl(long double,long double); int main() { if (fmod (6.5, 2.3) < 1.8999 || fmod (6.5, 2.3) > 1.9001) link_error (); if (fmod (-6.5, 2.3) < -1.9001 || fmod (-6.5, 2.3) > -1.8999) link_error (); if (fmod (6.5, -2.3) < 1.8999 || fmod (6.5, -2.3) > 1.9001) link_error (); if (fmod (-6.5, -2.3) < -1.9001 || fmod (-6.5, -2.3) > -1.8999) link_error (); if (fmodf (6.5f, 2.3f) < 1.8999f || fmodf (6.5f, 2.3f) > 1.9001f) link_error (); if (fmodf (-6.5f, 2.3f) < -1.9001f || fmodf (-6.5f, 2.3f) > -1.8999f) link_error (); if (fmodf (6.5f, -2.3f) < 1.8999f || fmodf (6.5f, -2.3f) > 1.9001f) link_error (); if (fmodf (-6.5f, -2.3f) < -1.9001f || fmodf (-6.5f, -2.3f) > -1.8999f) link_error (); if (fmodl (6.5l, 2.3l) < 1.8999l || fmod (6.5l, 2.3l) > 1.9001l) link_error (); if (fmodl (-6.5l, 2.3l) < -1.9001l || fmod (-6.5l, 2.3l) > -1.8999l) link_error (); if (fmodl (6.5l, -2.3l) < 1.8999l || fmod (6.5l, -2.3l) > 1.9001l) link_error (); if (fmodl (-6.5l, -2.3l) < -1.9001l || fmod (-6.5l, -2.3l) > -1.8999l) link_error (); return 0; }
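The ranges checked in the test follow from the C definition of fmod: the result is x - n*y with the quotient n truncated toward zero, so the sign follows the first argument. A rough sketch of that arithmetic is below. `fmod_sketch` is a hypothetical helper and is not what the patch calls (the patch folds through mpfr_fmod); it is only meaningful when the quotient fits in a long long and a little rounding error is acceptable.

```c
#include <assert.h>

/* Hypothetical illustration of the fmod definition, not the MPFR-based
   folding used by the patch.  Assumes x / y fits in a long long.  */
static double
fmod_sketch (double x, double y)
{
  assert (y != 0.0);
  long long n = (long long) (x / y);   /* conversion truncates toward zero */
  return x - (double) n * y;
}
```

With x = 6.5 and y = -2.3 the quotient truncates to -2, giving 6.5 - 4.6, which is why the test expects a positive value near 1.9 for that case.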
[Bug libstdc++/100940] views::take and views::drop should not define _S_has_simple_extra_args
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100940 --- Comment #5 from Patrick Palka --- (In reply to TC from comment #4) > (In reply to Patrick Palka from comment #3) > > Good point, confirmed. Though I'm not sure if perfect forwarding here is > > strictly necessary to fix this testcase. Perhaps the > > _S_has_simple_extra_args versions of _Partial should be forwarding the bound > > arguments as prvalues instead of as const lvalues? > > It's pretty easy to come up with counterexamples that don't work (for > example, the type might be move-only). > > It may be better to limit the "simple" case for take/drop to when the > argument type is integer-like; that's like 99% of uses anyway. Contrived > examples gets the perfect forwarding fun but that's fine. > > Similarly, it might be a good idea to restrict the "simple" case for the > other adaptors a bit - perhaps to the case where the predicate is trivially > copyable, which should still give good diagnostic for a lot of uses, but > avoids a performance hit if the function object at issue is > like...std::function. That makes sense to me. Implementation wise I guess this would mean parameterizing the _S_has_simple_extra_args flag by the actual types of the extra arguments. And I suppose we could also use this to declare some partial applications of split to be simple, e.g. when the pattern argument is a scalar or a view, and get good diagnostics for split in these cases.
[committed] analyzer: bitfield fixes [PR99212]
This patch verifies the previous fix for bitfield sizes by implementing enough support for bitfields in the analyzer to get the test cases to pass. The patch implements support in the analyzer for reading from a BIT_FIELD_REF, and support for folding BIT_AND_EXPR of a mask, to handle the cases generated in tests. The existing bitfields tests in data-model-1.c turned out to rely on undefined behavior, in that they were assigning values to a signed bitfield that were outside of the valid range of values. I believe that that's why we were seeing target-specific differences in the test results (PR analyzer/99212). The patch updates the test to remove the undefined behaviors. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Lightly tested with cris-elf. Pushed to trunk as r12-1303-gd3b1ef7a83c0c0cd5b20a1dd1714b868f3d2b442. gcc/analyzer/ChangeLog: PR analyzer/99212 * region-model-manager.cc (region_model_manager::maybe_fold_binop): Add support for folding BIT_AND_EXPR of compound_svalue and a mask constant. * region-model.cc (region_model::get_rvalue_1): Implement BIT_FIELD_REF in terms of... (region_model::get_rvalue_for_bits): New function. * region-model.h (region_model::get_rvalue_for_bits): New decl. * store.cc (bit_range::from_mask): New function. (selftest::test_bit_range_intersects_p): New selftest. (selftest::assert_bit_range_from_mask_eq): New. (ASSERT_BIT_RANGE_FROM_MASK_EQ): New macro. (selftest::assert_no_bit_range_from_mask_eq): New. (ASSERT_NO_BIT_RANGE_FROM_MASK): New macro. (selftest::test_bit_range_from_mask): New selftest. (selftest::analyzer_store_cc_tests): Call the new selftests. * store.h (bit_range::intersects_p): New. (bit_range::from_mask): New decl. (concrete_binding::get_bit_range): New accessor. (store_manager::get_concrete_binding): New overload taking const bit_range &. gcc/testsuite/ChangeLog: PR analyzer/99212 * gcc.dg/analyzer/bitfields-1.c: New test. 
* gcc.dg/analyzer/data-model-1.c (struct sbits): Make bitfields explicitly signed. (test_44): Update test values assigned to the bits to ones that fit in the range of the bitfield type. Remove xfails. (test_45): Remove xfails. Signed-off-by: David Malcolm --- gcc/analyzer/region-model-manager.cc | 46 - gcc/analyzer/region-model.cc | 65 ++- gcc/analyzer/region-model.h | 4 + gcc/analyzer/store.cc| 186 +++ gcc/analyzer/store.h | 18 ++ gcc/testsuite/gcc.dg/analyzer/bitfields-1.c | 144 ++ gcc/testsuite/gcc.dg/analyzer/data-model-1.c | 30 +-- 7 files changed, 469 insertions(+), 24 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/bitfields-1.c diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc index dfd2413e914..0ca0c8ad02e 100644 --- a/gcc/analyzer/region-model-manager.cc +++ b/gcc/analyzer/region-model-manager.cc @@ -480,9 +480,49 @@ region_model_manager::maybe_fold_binop (tree type, enum tree_code op, break; case BIT_AND_EXPR: if (cst1) - if (zerop (cst1) && INTEGRAL_TYPE_P (type)) - /* "(ARG0 & 0)" -> "0". */ - return get_or_create_constant_svalue (build_int_cst (type, 0)); + { + if (zerop (cst1) && INTEGRAL_TYPE_P (type)) + /* "(ARG0 & 0)" -> "0". */ + return get_or_create_constant_svalue (build_int_cst (type, 0)); + + /* Support masking out bits from a compound_svalue, as this +is generated when accessing bitfields. */ + if (const compound_svalue *compound_sval + = arg0->dyn_cast_compound_svalue ()) + { + const binding_map &map = compound_sval->get_map (); + unsigned HOST_WIDE_INT mask = TREE_INT_CST_LOW (cst1); + /* If "mask" is a contiguous range of set bits, see if the +compound_sval has a value for those bits. */ + bit_range bits (0, 0); + if (bit_range::from_mask (mask, &bits)) + { + const concrete_binding *conc + = get_store_manager ()->get_concrete_binding (bits, + BK_direct); + if (const svalue *sval = map.get (conc)) + { + /* We have a value; +shift it by the correct number of bits. 
*/ + const svalue *lhs = get_or_create_cast (type, sval); + HOST_WIDE_INT bit_offset + = bits.get_start_bit_offset ().to_shwi (); + tree shift_amt = build_int_cst (type, bit_offset); + const svalue *shift_sval +
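The comment in the hunk above talks about a mask that is "a contiguous range of set bits". One way to test for that, and to recover the start offset and width, is sketched below in plain C. This is an assumption-laden illustration, not the body of bit_range::from_mask: `mask_to_bit_range` and its out-parameters are hypothetical names, and a 64-bit mask is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: if MASK is one contiguous run of set bits, write
   the run's first bit to *START and its length to *SIZE and return true.
   Return false for a zero mask or for a mask with more than one run.  */
static bool
mask_to_bit_range (uint64_t mask, unsigned *start, unsigned *size)
{
  if (mask == 0)
    return false;

  unsigned s = 0;
  while (!(mask & 1))          /* skip the clear bits below the run */
    {
      mask >>= 1;
      s++;
    }

  unsigned n = 0;
  while (mask & 1)             /* count the run of set bits */
    {
      mask >>= 1;
      n++;
    }

  if (mask != 0)               /* leftover bits mean a second run */
    return false;

  *start = s;
  *size = n;
  return true;
}
```

For example, a mask of 0x38 describes a 3-bit field starting at bit 3, while 0x5 is rejected because its set bits are not adjacent.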
[committed] analyzer: fix region::get_bit_size for bitfields
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as c957d38044d7eb6a45f57a8a9f707c3c0a798e9f. gcc/analyzer/ChangeLog: * analyzer.h (int_size_in_bits): New decl. * region.cc (int_size_in_bits): New function. (region::get_bit_size): Reimplement in terms of the above. Signed-off-by: David Malcolm --- gcc/analyzer/analyzer.h | 2 ++ gcc/analyzer/region.cc | 33 + 2 files changed, 31 insertions(+), 4 deletions(-) diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h index fb568e44d38..525eb06c3b5 100644 --- a/gcc/analyzer/analyzer.h +++ b/gcc/analyzer/analyzer.h @@ -144,6 +144,8 @@ typedef offset_int bit_offset_t; typedef offset_int bit_size_t; typedef offset_int byte_size_t; +extern bool int_size_in_bits (const_tree type, bit_size_t *out); + /* The location of a region expressesd as an offset relative to a base region. */ diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc index 6db1fc91afd..5f246df7dfb 100644 --- a/gcc/analyzer/region.cc +++ b/gcc/analyzer/region.cc @@ -208,6 +208,29 @@ region::get_byte_size (byte_size_t *out) const return true; } +/* If the size of TYPE (in bits) is constant, write it to *OUT + and return true. + Otherwise return false. */ + +bool +int_size_in_bits (const_tree type, bit_size_t *out) +{ + if (INTEGRAL_TYPE_P (type)) +{ + *out = TYPE_PRECISION (type); + return true; +} + + tree sz = TYPE_SIZE (type); + if (sz && tree_fits_uhwi_p (sz)) +{ + *out = TREE_INT_CST_LOW (sz); + return true; +} + else +return false; +} + /* If the size of this region (in bits) is known statically, write it to *OUT and return true. Otherwise return false. */ @@ -215,11 +238,13 @@ region::get_byte_size (byte_size_t *out) const bool region::get_bit_size (bit_size_t *out) const { - byte_size_t byte_size; - if (!get_byte_size (&byte_size)) + tree type = get_type (); + + /* Bail out e.g. for heap-allocated regions. 
*/ + if (!type) return false; - *out = byte_size * BITS_PER_UNIT; - return true; + + return int_size_in_bits (type, out); } /* Get the field within RECORD_TYPE at BIT_OFFSET. */ -- 2.26.3
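To illustrate why the patch above matters for bitfields, here is a minimal standalone sketch. It does not use the real GCC tree API: toy_type, toy_size_in_bits and toy_size_via_bytes are hypothetical stand-ins, and the assumption that a 3-bit bitfield type occupies 8 bits of storage is made up for the example. It contrasts a precision-based bit size with the old "byte size times BITS_PER_UNIT" computation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a GCC type node; not the real tree API.
struct toy_type
{
  bool is_integral;       // plays the role of INTEGRAL_TYPE_P
  unsigned precision;     // plays the role of TYPE_PRECISION (in bits)
  bool has_constant_size; // whether the storage size is a known constant
  uint64_t size_in_bits;  // plays the role of TYPE_SIZE (in bits)
};

// Same shape as int_size_in_bits in the patch: prefer the precision
// for integral types, otherwise fall back to a constant storage size.
bool
toy_size_in_bits (const toy_type &type, uint64_t *out)
{
  assert (out != nullptr);
  if (type.is_integral)
    {
      *out = type.precision;
      return true;
    }
  if (type.has_constant_size)
    {
      *out = type.size_in_bits;
      return true;
    }
  return false;
}

// The pre-patch idea: take the size in whole bytes and scale by 8.
// For a bitfield this reports the storage size, not the field width.
uint64_t
toy_size_via_bytes (const toy_type &type)
{
  const uint64_t bits_per_unit = 8;
  uint64_t bytes = (type.size_in_bits + bits_per_unit - 1) / bits_per_unit;
  return bytes * bits_per_unit;
}
```

With a toy 3-bit bitfield type ({true, 3, true, 8}), toy_size_in_bits reports 3 whereas toy_size_via_bytes reports 8, which is the kind of mismatch the patch addresses.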
[committed] analyzer: split out struct bit_range from class concrete_binding
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as 6b400aef1bdc84bbdf5011caff3fe5f82c68d253.

gcc/analyzer/ChangeLog:
        * store.cc (concrete_binding::dump_to_pp): Move bulk of
        implementation to...
        (bit_range::dump_to_pp): ...this new function.
        (bit_range::cmp): New.
        (concrete_binding::overlaps_p): Update for use of bit_range.
        (concrete_binding::cmp_ptr_ptr): Likewise.
        * store.h (struct bit_range): New.
        (class concrete_binding): Replace fields m_start_bit_offset and
        m_size_in_bits with new field m_bit_range.

Signed-off-by: David Malcolm
---
 gcc/analyzer/store.cc | 38 +++
 gcc/analyzer/store.h  | 61 +++
 2 files changed, 77 insertions(+), 22 deletions(-)

diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index b1874a5a2d3..f4bb7def781 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -236,15 +236,12 @@ binding_key::cmp (const binding_key *k1, const binding_key *k2)
     }
 }

-/* class concrete_binding : public binding_key.  */
-
-/* Implementation of binding_key::dump_to_pp vfunc for concrete_binding.  */
+/* struct struct bit_range.  */

 void
-concrete_binding::dump_to_pp (pretty_printer *pp, bool simple) const
+bit_range::dump_to_pp (pretty_printer *pp) const
 {
-  binding_key::dump_to_pp (pp, simple);
-  pp_string (pp, ", start: ");
+  pp_string (pp, "start: ");
   pp_wide_int (pp, m_start_bit_offset, SIGNED);
   pp_string (pp, ", size: ");
   pp_wide_int (pp, m_size_in_bits, SIGNED);
@@ -252,12 +249,34 @@ concrete_binding::dump_to_pp (pretty_printer *pp, bool simple) const
   pp_wide_int (pp, get_next_bit_offset (), SIGNED);
 }

+int
+bit_range::cmp (const bit_range &br1, const bit_range &br2)
+{
+  if (int start_cmp = wi::cmps (br1.m_start_bit_offset,
+                                br2.m_start_bit_offset))
+    return start_cmp;
+
+  return wi::cmpu (br1.m_size_in_bits, br2.m_size_in_bits);
+}
+
+/* class concrete_binding : public binding_key.  */
+
+/* Implementation of binding_key::dump_to_pp vfunc for concrete_binding.  */
+
+void
+concrete_binding::dump_to_pp (pretty_printer *pp, bool simple) const
+{
+  binding_key::dump_to_pp (pp, simple);
+  pp_string (pp, ", ");
+  m_bit_range.dump_to_pp (pp);
+}
+
 /* Return true if this binding overlaps with OTHER.  */

 bool
 concrete_binding::overlaps_p (const concrete_binding &other) const
 {
-  if (m_start_bit_offset < other.get_next_bit_offset ()
+  if (get_start_bit_offset () < other.get_next_bit_offset ()
       && get_next_bit_offset () > other.get_start_bit_offset ())
     return true;
   return false;
@@ -274,10 +293,7 @@ concrete_binding::cmp_ptr_ptr (const void *p1, const void *p2)
   if (int kind_cmp = b1->get_kind () - b2->get_kind ())
     return kind_cmp;

-  if (int start_cmp = wi::cmps (b1->m_start_bit_offset, b2->m_start_bit_offset))
-    return start_cmp;
-
-  return wi::cmpu (b1->m_size_in_bits, b2->m_size_in_bits);
+  return bit_range::cmp (b1->m_bit_range, b2->m_bit_range);
 }

 /* class symbolic_binding : public binding_key.  */
diff --git a/gcc/analyzer/store.h b/gcc/analyzer/store.h
index d68513ca94c..be09b427366 100644
--- a/gcc/analyzer/store.h
+++ b/gcc/analyzer/store.h
@@ -267,6 +267,42 @@ private:
   enum binding_kind m_kind;
 };

+struct bit_range
+{
+  bit_range (bit_offset_t start_bit_offset, bit_size_t size_in_bits)
+  : m_start_bit_offset (start_bit_offset),
+    m_size_in_bits (size_in_bits)
+  {}
+
+  void dump_to_pp (pretty_printer *pp) const;
+
+  bit_offset_t get_start_bit_offset () const
+  {
+    return m_start_bit_offset;
+  }
+  bit_offset_t get_next_bit_offset () const
+  {
+    return m_start_bit_offset + m_size_in_bits;
+  }
+
+  bool contains_p (bit_offset_t offset) const
+  {
+    return (offset >= get_start_bit_offset ()
+            && offset < get_next_bit_offset ());
+  }
+
+  bool operator== (const bit_range &other) const
+  {
+    return (m_start_bit_offset == other.m_start_bit_offset
+            && m_size_in_bits == other.m_size_in_bits);
+  }
+
+  static int cmp (const bit_range &br1, const bit_range &br2);
+
+  bit_offset_t m_start_bit_offset;
+  bit_size_t m_size_in_bits;
+};
+
 /* Concrete subclass of binding_key, for describing a concrete range of bits
    within the binding_map (e.g. "bits 8-15").  */

@@ -279,24 +315,22 @@ public:
   concrete_binding (bit_offset_t start_bit_offset, bit_size_t size_in_bits,
                     enum binding_kind kind)
   : binding_key (kind),
-    m_start_bit_offset (start_bit_offset),
-    m_size_in_bits (size_in_bits)
+    m_bit_range (start_bit_offset, size_in_bits)
   {}
   bool concrete_p () const FINAL OVERRIDE { return true; }

   hashval_t hash () const
   {
     inchash::hash hstate;
-    hstate.add_wide_int (m_start_bit_offset);
-    hstate.add_wide_int (m_size_in_bits);
+    hstate.add_wide_int (m_bit_range.m_start_bit_offset);
+    hstate.add_wide_int
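For readers who want to experiment with the half-open range logic outside of GCC, the following is a rough standalone sketch of the same idea. It is an assumption-laden simplification: toy_bit_range and toy_overlaps_p are hypothetical names, and plain 64-bit integers stand in for offset_int, bit_offset_t and bit_size_t.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for struct bit_range: the half-open range
// [start, start + size) measured in bits.
struct toy_bit_range
{
  int64_t start;
  uint64_t size;

  // First bit offset past the end of the range.
  int64_t next () const { return start + static_cast<int64_t> (size); }

  bool contains_p (int64_t offset) const
  {
    return offset >= start && offset < next ();
  }

  bool operator== (const toy_bit_range &other) const
  {
    return start == other.start && size == other.size;
  }

  // Three-way comparison: order by start offset, then by size.
  static int cmp (const toy_bit_range &a, const toy_bit_range &b)
  {
    if (a.start != b.start)
      return a.start < b.start ? -1 : 1;
    if (a.size != b.size)
      return a.size < b.size ? -1 : 1;
    return 0;
  }
};

// Two half-open ranges overlap when each starts before the other ends,
// which is the same test concrete_binding::overlaps_p performs.
inline bool
toy_overlaps_p (const toy_bit_range &a, const toy_bit_range &b)
{
  return a.start < b.next () && a.next () > b.start;
}
```

Because the ranges are half-open, "bits 8-15" ({8, 8}) and "bits 16-23" ({16, 8}) are adjacent but do not overlap.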
[committed] analyzer: remove redundant typedef
Delete an overzealous copy

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as 8c5a5404cb68e5e39e296849944019b93a591646.

gcc/analyzer/ChangeLog:
        * svalue.h (conjured_svalue::iterator_t): Delete.

Signed-off-by: David Malcolm
---
 gcc/analyzer/svalue.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/analyzer/svalue.h b/gcc/analyzer/svalue.h
index 7fe0ba3a603..d9e34aa6b89 100644
--- a/gcc/analyzer/svalue.h
+++ b/gcc/analyzer/svalue.h
@@ -1073,8 +1073,6 @@ namespace ana {
 class conjured_svalue : public svalue
 {
 public:
-  typedef binding_map::iterator_t iterator_t;
-
   /* A support class for uniquifying instances of conjured_svalue.  */
   struct key_t
   {
--
2.26.3
[Bug analyzer/99212] [11/12 Regression] gcc.dg/analyzer/data-model-1.c line 971
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99212

--- Comment #15 from CVS Commits ---
The master branch has been updated by David Malcolm:

https://gcc.gnu.org/g:d3b1ef7a83c0c0cd5b20a1dd1714b868f3d2b442

commit r12-1303-gd3b1ef7a83c0c0cd5b20a1dd1714b868f3d2b442
Author: David Malcolm
Date:   Tue Jun 8 14:45:57 2021 -0400

    analyzer: bitfield fixes [PR99212]

    This patch verifies the previous fix for bitfield sizes by implementing
    enough support for bitfields in the analyzer to get the test cases to pass.

    The patch implements support in the analyzer for reading from a
    BIT_FIELD_REF, and support for folding BIT_AND_EXPR of a mask, to handle
    the cases generated in tests.

    The existing bitfields tests in data-model-1.c turned out to rely on
    undefined behavior, in that they were assigning values to a signed
    bitfield that were outside of the valid range of values.  I believe that
    that's why we were seeing target-specific differences in the test
    results (PR analyzer/99212).  The patch updates the test to remove the
    undefined behaviors.

    gcc/analyzer/ChangeLog:
            PR analyzer/99212
            * region-model-manager.cc
            (region_model_manager::maybe_fold_binop): Add support for folding
            BIT_AND_EXPR of compound_svalue and a mask constant.
            * region-model.cc (region_model::get_rvalue_1): Implement
            BIT_FIELD_REF in terms of...
            (region_model::get_rvalue_for_bits): New function.
            * region-model.h (region_model::get_rvalue_for_bits): New decl.
            * store.cc (bit_range::from_mask): New function.
            (selftest::test_bit_range_intersects_p): New selftest.
            (selftest::assert_bit_range_from_mask_eq): New.
            (ASSERT_BIT_RANGE_FROM_MASK_EQ): New macro.
            (selftest::assert_no_bit_range_from_mask_eq): New.
            (ASSERT_NO_BIT_RANGE_FROM_MASK): New macro.
            (selftest::test_bit_range_from_mask): New selftest.
            (selftest::analyzer_store_cc_tests): Call the new selftests.
            * store.h (bit_range::intersects_p): New.
            (bit_range::from_mask): New decl.
            (concrete_binding::get_bit_range): New accessor.
            (store_manager::get_concrete_binding): New overload taking
            const bit_range &.

    gcc/testsuite/ChangeLog:
            PR analyzer/99212
            * gcc.dg/analyzer/bitfields-1.c: New test.
            * gcc.dg/analyzer/data-model-1.c (struct sbits): Make bitfields
            explicitly signed.
            (test_44): Update test values assigned to the bits to ones that
            fit in the range of the bitfield type.  Remove xfails.
            (test_45): Remove xfails.

    Signed-off-by: David Malcolm
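The commit message above mentions turning a mask constant into a bit range so that a BIT_AND_EXPR can be folded into a read of specific bits. The sketch below shows one way such a conversion could work on a plain 64-bit mask. It is only an illustration under assumptions: toy_range_from_mask and toy_extract_bits are hypothetical helpers, and the real bit_range::from_mask in store.cc may differ in signature and in the cases it accepts.

```cpp
#include <cassert>
#include <cstdint>

// If MASK is a single contiguous run of set bits, write the index of its
// lowest set bit to *START and the run length to *SIZE and return true.
// Return false for zero and for masks with gaps (e.g. 0b101).
bool
toy_range_from_mask (uint64_t mask, unsigned *start, unsigned *size)
{
  assert (start != nullptr && size != nullptr);
  if (mask == 0)
    return false;

  // Skip the trailing zero bits.
  unsigned lo = 0;
  while (((mask >> lo) & 1) == 0)
    ++lo;

  // After shifting, a contiguous run of N ones has the value 2^N - 1,
  // so run & (run + 1) is zero exactly when there are no gaps.
  uint64_t run = mask >> lo;
  if ((run & (run + 1)) != 0)
    return false;

  unsigned n = 0;
  for (; run != 0; run >>= 1)
    ++n;

  *start = lo;
  *size = n;
  return true;
}

// Read SIZE bits of VALUE starting at bit START (START must be < 64).
uint64_t
toy_extract_bits (uint64_t value, unsigned start, unsigned size)
{
  uint64_t field = value >> start;
  if (size < 64)
    field &= (uint64_t (1) << size) - 1;
  return field;
}
```

For example, the mask 0x0ff0 corresponds to start 4 and size 8, so (value & 0x0ff0) >> 4 can be treated as a read of bits 4-11.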
[Bug c++/100976] [C++23] Make constexpr reference temp constexpr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100976

--- Comment #1 from Jason Merrill ---
constexpr const int& r = 42;
static_assert(r == 42);
[Bug c++/100975] [C++23] Allow pointer to array of auto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100975

--- Comment #1 from Jason Merrill ---
int a[3];
auto (*p)[3] = &a;
[Bug rtl-optimization/100978] New: [10/11/12 Regression] ICE: qsort checking failed: qsort comparator non-negative on sorted output: 1 with -O3 -frename-registers -fno-sched-critical-path-heuristic -f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100978

Bug ID: 100978
Summary: [10/11/12 Regression] ICE: qsort checking failed: qsort comparator non-negative on sorted output: 1 with -O3 -frename-registers -fno-sched-critical-path-heuristic -fsched2-use-superblocks
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: ice-on-valid-code
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
Target Milestone: ---
Host: x86_64-pc-linux-gnu

Created attachment 50966
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50966&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O3 -frename-registers -fno-sched-critical-path-heuristic -fsched2-use-superblocks testcase.c
testcase.c: In function 'foo':
testcase.c:21:1: error: qsort comparator non-negative on sorted output: 1
   21 | }
      | ^
during RTL pass: sched2
testcase.c:21:1: internal compiler error: qsort checking failed
0xa1bf87 qsort_chk_error
        /repo/gcc-trunk/gcc/vec.c:214
0xa1c093 qsort_chk(void*, unsigned long, unsigned long, int (*)(void const*, void const*, void*), void*)
        /repo/gcc-trunk/gcc/vec.c:256
0x1d79fb5 gcc_qsort(void*, unsigned long, unsigned long, int (*)(void const*, void const*))
        /repo/gcc-trunk/gcc/sort.cc:270
0x1bd83c0 ready_sort_real
        /repo/gcc-trunk/gcc/haifa-sched.c:3095
0x1be09c5 ready_sort
        /repo/gcc-trunk/gcc/haifa-sched.c:3111
0x1be09c5 schedule_block(basic_block_def**, void*)
        /repo/gcc-trunk/gcc/haifa-sched.c:6709
0x1cb32ab schedule_ebb(rtx_insn*, rtx_insn*, bool)
        /repo/gcc-trunk/gcc/sched-ebb.c:536
0x1cb39d2 schedule_ebbs()
        /repo/gcc-trunk/gcc/sched-ebb.c:655
0x1015b2c rest_of_handle_sched2
        /repo/gcc-trunk/gcc/sched-rgn.c:3740
0x1015b2c execute
        /repo/gcc-trunk/gcc/sched-rgn.c:3878
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-1295-20210608150918-g7a56d3d3e99-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r12-1295-20210608150918-g7a56d3d3e99-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20210608 (experimental) (GCC)
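As background for the "qsort comparator non-negative on sorted output" diagnostic in the report above, here is a simplified sketch of the kind of consistency check a sort-checking wrapper can run on already-sorted output. It is not GCC's qsort_chk: toy_check_sorted and the cmp_check enum are hypothetical, and the sketch only inspects adjacent elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class cmp_check { ok, not_antisymmetric, positive_on_sorted };

// Walk adjacent pairs of supposedly sorted data and verify that
//  (1) cmp (a, b) and cmp (b, a) have opposite signs (or are both zero);
//  (2) cmp (earlier, later) is never positive.
template <typename T, typename Cmp>
cmp_check
toy_check_sorted (const std::vector<T> &v, Cmp cmp)
{
  for (std::size_t i = 0; i + 1 < v.size (); ++i)
    {
      int fwd = cmp (v[i], v[i + 1]);
      int rev = cmp (v[i + 1], v[i]);
      if ((fwd < 0) != (rev > 0) || (fwd > 0) != (rev < 0))
        return cmp_check::not_antisymmetric;
      if (fwd > 0)
        return cmp_check::positive_on_sorted;
    }
  return cmp_check::ok;
}
```

A comparator that never returns 0 for equal elements, for instance one written as a < b ? -1 : 1, fails the first check as soon as two equal elements are adjacent; a comparator whose ordering disagrees with the final output fails the second.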
[PATCH 54/55] rs6000: Test case adjustments
2021-03-24 Bill Schmidt gcc/testsuite/ * gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust. * gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Adjust. * gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Adjust. * gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Adjust. * gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Adjust. * gcc.target/powerpc/bfp/scalar-test-neg-2.c: Adjust. * gcc.target/powerpc/bfp/scalar-test-neg-3.c: Adjust. * gcc.target/powerpc/bfp/scalar-test-neg-5.c: Adjust. * gcc.target/powerpc/byte-in-set-2.c: Adjust. * gcc.target/powerpc/cmpb-2.c: Adjust. * gcc.target/powerpc/cmpb32-2.c: Adjust. * gcc.target/powerpc/crypto-builtin-2.c: Adjust. * gcc.target/powerpc/fold-vec-splat-floatdouble.c: Adjust. * gcc.target/powerpc/fold-vec-splat-longlong.c: Adjust. * gcc.target/powerpc/fold-vec-splat-misc-invalid.c: Adjust. * gcc.target/powerpc/p8vector-builtin-8.c: Adjust. * gcc.target/powerpc/pr80315-1.c: Adjust. * gcc.target/powerpc/pr80315-2.c: Adjust. * gcc.target/powerpc/pr80315-3.c: Adjust. * gcc.target/powerpc/pr80315-4.c: Adjust. * gcc.target/powerpc/pr88100.c: Adjust. * gcc.target/powerpc/pragma_misc9.c: Adjust. * gcc.target/powerpc/pragma_power8.c: Adjust. * gcc.target/powerpc/pragma_power9.c: Adjust. * gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust. * gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Adjust. * gcc.target/powerpc/test_mffsl.c: Adjust. * gcc.target/powerpc/vec-gnb-2.c: Adjust. * gcc.target/powerpc/vsu/vec-all-nez-7.c: Adjust. * gcc.target/powerpc/vsu/vec-any-eqz-7.c: Adjust. * gcc.target/powerpc/vsu/vec-cmpnez-7.c: Adjust. * gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c: Adjust. * gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c: Adjust. * gcc.target/powerpc/vsu/vec-xst-len-12.c: Adjust. * gcc.target/powerpc/vsu/vec-xst-len-13.c: Adjust. 
--- .../gcc.target/powerpc/bfp/scalar-extract-exp-2.c | 2 +- .../gcc.target/powerpc/bfp/scalar-extract-sig-2.c | 2 +- .../gcc.target/powerpc/bfp/scalar-insert-exp-2.c | 2 +- .../gcc.target/powerpc/bfp/scalar-insert-exp-5.c | 2 +- .../gcc.target/powerpc/bfp/scalar-insert-exp-8.c | 2 +- .../gcc.target/powerpc/bfp/scalar-test-neg-2.c | 2 +- .../gcc.target/powerpc/bfp/scalar-test-neg-3.c | 2 +- .../gcc.target/powerpc/bfp/scalar-test-neg-5.c | 2 +- gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c | 2 +- gcc/testsuite/gcc.target/powerpc/cmpb-2.c | 2 +- gcc/testsuite/gcc.target/powerpc/cmpb32-2.c| 2 +- .../gcc.target/powerpc/crypto-builtin-2.c | 14 +++--- .../powerpc/fold-vec-splat-floatdouble.c | 4 ++-- .../gcc.target/powerpc/fold-vec-splat-longlong.c | 10 +++--- .../powerpc/fold-vec-splat-misc-invalid.c | 8 .../gcc.target/powerpc/p8vector-builtin-8.c| 1 + gcc/testsuite/gcc.target/powerpc/pr80315-1.c | 2 +- gcc/testsuite/gcc.target/powerpc/pr80315-2.c | 2 +- gcc/testsuite/gcc.target/powerpc/pr80315-3.c | 2 +- gcc/testsuite/gcc.target/powerpc/pr80315-4.c | 2 +- gcc/testsuite/gcc.target/powerpc/pr88100.c | 12 ++-- gcc/testsuite/gcc.target/powerpc/pragma_misc9.c| 2 +- gcc/testsuite/gcc.target/powerpc/pragma_power8.c | 2 ++ gcc/testsuite/gcc.target/powerpc/pragma_power9.c | 3 +++ .../powerpc/test_fpscr_drn_builtin_error.c | 4 ++-- .../powerpc/test_fpscr_rn_builtin_error.c | 12 ++-- gcc/testsuite/gcc.target/powerpc/test_mffsl.c | 3 ++- gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c | 2 +- .../gcc.target/powerpc/vsu/vec-all-nez-7.c | 2 +- .../gcc.target/powerpc/vsu/vec-any-eqz-7.c | 2 +- .../gcc.target/powerpc/vsu/vec-cmpnez-7.c | 2 +- .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c | 2 +- .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c | 2 +- .../gcc.target/powerpc/vsu/vec-xl-len-13.c | 2 +- .../gcc.target/powerpc/vsu/vec-xst-len-12.c| 2 +- 35 files changed, 62 insertions(+), 59 deletions(-) diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c index 922180675fc..53b67c95cf9 100644 --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c @@ -14,7 +14,7 @@ get_exponent (double *p) { double source = *p; - return scalar_extract_exp (source); /* { dg-error "'__builtin_vec_scalar_extract_exp' is not supported in this compiler configuration" } */ + return scalar_extract_exp (source); /* { dg-error "'__builtin_vsx_scalar_extract_exp' requires the" } */ } diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
[PATCH 55/55] rs6000: Enable the new builtin support
2021-03-05  Bill Schmidt

gcc/
        * config/rs6000/rs6000-gen-builtins.c (write_init_file): Initialize
        new_builtins_are_live to 1.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index c3874e85592..8ca9fdc942a 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -2753,7 +2753,7 @@ write_init_file (void)
   fprintf (init_file, "#include \"rs6000-builtins.h\"\n");
   fprintf (init_file, "\n");

-  fprintf (init_file, "int new_builtins_are_live = 0;\n\n");
+  fprintf (init_file, "int new_builtins_are_live = 1;\n\n");

   fprintf (init_file, "tree rs6000_builtin_decls_x[RS6000_OVLD_MAX];\n\n");

--
2.27.0
[PATCH 53/55] rs6000: Update altivec.h for automated interfaces
2021-04-01 Bill Schmidt gcc/ * config/rs6000/altivec.h: Delete a number of #defines that are now superfluous; include rs6000-vecdefines.h; include some synonyms. --- gcc/config/rs6000/altivec.h | 516 +++- 1 file changed, 41 insertions(+), 475 deletions(-) diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h index 961621a0841..8daf933e53e 100644 --- a/gcc/config/rs6000/altivec.h +++ b/gcc/config/rs6000/altivec.h @@ -55,32 +55,36 @@ #define __CR6_LT 2 #define __CR6_LT_REV 3 -/* Synonyms. */ +#include "rs6000-vecdefines.h" + +/* Deprecated interfaces. */ +#define vec_lvx vec_ld +#define vec_lvxl vec_ldl +#define vec_stvx vec_st +#define vec_stvxl vec_stl #define vec_vaddcuw vec_addc #define vec_vand vec_and #define vec_vandc vec_andc -#define vec_vrfip vec_ceil #define vec_vcmpbfp vec_cmpb #define vec_vcmpgefp vec_cmpge #define vec_vctsxs vec_cts #define vec_vctuxs vec_ctu #define vec_vexptefp vec_expte -#define vec_vrfim vec_floor -#define vec_lvx vec_ld -#define vec_lvxl vec_ldl #define vec_vlogefp vec_loge #define vec_vmaddfp vec_madd #define vec_vmhaddshs vec_madds -#define vec_vmladduhm vec_mladd #define vec_vmhraddshs vec_mradds +#define vec_vmladduhm vec_mladd #define vec_vnmsubfp vec_nmsub #define vec_vnor vec_nor #define vec_vor vec_or -#define vec_vpkpx vec_packpx #define vec_vperm vec_perm -#define vec_permxor __builtin_vec_vpermxor +#define vec_vpkpx vec_packpx #define vec_vrefp vec_re +#define vec_vrfim vec_floor #define vec_vrfin vec_round +#define vec_vrfip vec_ceil +#define vec_vrfiz vec_trunc #define vec_vrsqrtefp vec_rsqrte #define vec_vsel vec_sel #define vec_vsldoi vec_sld @@ -91,438 +95,56 @@ #define vec_vspltisw vec_splat_s32 #define vec_vsr vec_srl #define vec_vsro vec_sro -#define vec_stvx vec_st -#define vec_stvxl vec_stl #define vec_vsubcuw vec_subc #define vec_vsum2sws vec_sum2s #define vec_vsumsws vec_sums -#define vec_vrfiz vec_trunc #define vec_vxor vec_xor +#ifdef _ARCH_PWR8 +#define vec_vclz vec_cntlz +#define 
vec_vgbbd vec_gb +#define vec_vmrgew vec_mergee +#define vec_vmrgow vec_mergeo +#define vec_vpopcntu vec_popcnt +#define vec_vrld vec_rl +#define vec_vsld vec_sl +#define vec_vsrd vec_sr +#define vec_vsrad vec_sra +#endif + +#ifdef _ARCH_PWR9 +#define vec_extract_fp_from_shorth vec_extract_fp32_from_shorth +#define vec_extract_fp_from_shortl vec_extract_fp32_from_shortl +#define vec_vctz vec_cnttz +#endif + +/* Synonyms. */ /* Functions that are resolved by the backend to one of the typed builtins. */ -#define vec_vaddfp __builtin_vec_vaddfp -#define vec_addc __builtin_vec_addc -#define vec_adde __builtin_vec_adde -#define vec_addec __builtin_vec_addec -#define vec_vaddsws __builtin_vec_vaddsws -#define vec_vaddshs __builtin_vec_vaddshs -#define vec_vaddsbs __builtin_vec_vaddsbs -#define vec_vavgsw __builtin_vec_vavgsw -#define vec_vavguw __builtin_vec_vavguw -#define vec_vavgsh __builtin_vec_vavgsh -#define vec_vavguh __builtin_vec_vavguh -#define vec_vavgsb __builtin_vec_vavgsb -#define vec_vavgub __builtin_vec_vavgub -#define vec_ceil __builtin_vec_ceil -#define vec_cmpb __builtin_vec_cmpb -#define vec_vcmpeqfp __builtin_vec_vcmpeqfp -#define vec_cmpge __builtin_vec_cmpge -#define vec_vcmpgtfp __builtin_vec_vcmpgtfp -#define vec_vcmpgtsw __builtin_vec_vcmpgtsw -#define vec_vcmpgtuw __builtin_vec_vcmpgtuw -#define vec_vcmpgtsh __builtin_vec_vcmpgtsh -#define vec_vcmpgtuh __builtin_vec_vcmpgtuh -#define vec_vcmpgtsb __builtin_vec_vcmpgtsb -#define vec_vcmpgtub __builtin_vec_vcmpgtub -#define vec_vcfsx __builtin_vec_vcfsx -#define vec_vcfux __builtin_vec_vcfux -#define vec_cts __builtin_vec_cts -#define vec_ctu __builtin_vec_ctu -#define vec_cpsgn __builtin_vec_copysign -#define vec_double __builtin_vec_double -#define vec_doublee __builtin_vec_doublee -#define vec_doubleo __builtin_vec_doubleo -#define vec_doublel __builtin_vec_doublel -#define vec_doubleh __builtin_vec_doubleh -#define vec_expte __builtin_vec_expte -#define vec_float __builtin_vec_float -#define 
vec_float2 __builtin_vec_float2 -#define vec_floate __builtin_vec_floate -#define vec_floato __builtin_vec_floato -#define vec_floor __builtin_vec_floor -#define vec_loge __builtin_vec_loge -#define vec_madd __builtin_vec_madd -#define vec_madds __builtin_vec_madds -#define vec_mtvscr __builtin_vec_mtvscr -#define vec_reve __builtin_vec_vreve -#define vec_vmaxfp __builtin_vec_vmaxfp -#define vec_vmaxsw __builtin_vec_vmaxsw -#define vec_vmaxsh __builtin_vec_vmaxsh -#define vec_vmaxsb __builtin_vec_vmaxsb -#define vec_vminfp __builtin_vec_vminfp -#define vec_vminsw __builtin_vec_vminsw -#define vec_vminsh __builtin_vec_vminsh -#define vec_vminsb __builtin_vec_vminsb -#define vec_mradds __builtin_vec_mradds -#define vec_vmsumshm __builtin_vec_vmsumshm -#define vec_vmsumuhm __builtin_vec_vmsumuhm -#define vec_vmsummbm __builtin_vec_vmsummbm -#define vec_vmsumubm
[PATCH 52/55] rs6000: Debug support
2021-04-01 Bill Schmidt gcc/ * config/rs6000/rs6000-call.c (rs6000_debug_type): New function. (def_builtin): Change debug formatting for easier parsing and include more information. (rs6000_init_builtins): Add dump of autogenerated builtins. (altivec_init_builtins): Dump __builtin_altivec_mask_for_load for completeness. --- gcc/config/rs6000/rs6000-call.c | 193 +++- 1 file changed, 189 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index fc61bbc2af5..3a15479f53c 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -8754,6 +8754,106 @@ rs6000_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p, /* Builtins. */ +/* Debug utility to translate a type node to a single token. */ +static +const char *rs6000_debug_type (tree type) +{ + if (type == void_type_node) +return "void"; + else if (type == long_integer_type_node) +return "long"; + else if (type == long_unsigned_type_node) +return "ulong"; + else if (type == long_long_integer_type_node) +return "longlong"; + else if (type == long_long_unsigned_type_node) +return "ulonglong"; + else if (type == bool_V16QI_type_node) +return "vbc"; + else if (type == bool_V2DI_type_node) +return "vbll"; + else if (type == bool_V4SI_type_node) +return "vbi"; + else if (type == bool_V8HI_type_node) +return "vbs"; + else if (type == bool_int_type_node) +return "bool"; + else if (type == dfloat64_type_node) +return "_Decimal64"; + else if (type == double_type_node) +return "double"; + else if (type == intDI_type_node) +return "sll"; + else if (type == intHI_type_node) +return "ss"; + else if (type == ibm128_float_type_node) +return "__ibm128"; + else if (type == opaque_V4SI_type_node) +return "opaque"; + else if (POINTER_TYPE_P (type)) +return "void*"; + else if (type == intQI_type_node || type == char_type_node) +return "sc"; + else if (type == dfloat32_type_node) +return "_Decimal32"; + else if (type == float_type_node) +return "float"; 
+ else if (type == intSI_type_node || type == integer_type_node) +return "si"; + else if (type == dfloat128_type_node) +return "_Decimal128"; + else if (type == long_double_type_node) +return "longdouble"; + else if (type == intTI_type_node) +return "sq"; + else if (type == unsigned_intDI_type_node) +return "ull"; + else if (type == unsigned_intHI_type_node) +return "us"; + else if (type == unsigned_intQI_type_node) +return "uc"; + else if (type == unsigned_intSI_type_node) +return "ui"; + else if (type == unsigned_intTI_type_node) +return "uq"; + else if (type == unsigned_V16QI_type_node) +return "vuc"; + else if (type == unsigned_V1TI_type_node) +return "vuq"; + else if (type == unsigned_V2DI_type_node) +return "vull"; + else if (type == unsigned_V4SI_type_node) +return "vui"; + else if (type == unsigned_V8HI_type_node) +return "vus"; + else if (type == V16QI_type_node) +return "vsc"; + else if (type == V1TI_type_node) +return "vsq"; + else if (type == V2DF_type_node) +return "vd"; + else if (type == V2DI_type_node) +return "vsll"; + else if (type == V4SF_type_node) +return "vf"; + else if (type == V4SI_type_node) +return "vsi"; + else if (type == V8HI_type_node) +return "vss"; + else if (type == pixel_V8HI_type_node) +return "vp"; + else if (type == pcvoid_type_node) +return "voidc*"; + else if (type == float128_type_node) +return "_Float128"; + else if (type == vector_pair_type_node) +return "__vector_pair"; + else if (type == vector_quad_type_node) +return "__vector_quad"; + else +return "unknown"; +} + static void def_builtin (const char *name, tree type, enum rs6000_builtins code) { @@ -8782,7 +8882,7 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code) /* const function, function only depends on the inputs. 
*/ TREE_READONLY (t) = 1; TREE_NOTHROW (t) = 1; - attr_string = ", const"; + attr_string = "= const"; } else if ((classify & RS6000_BTC_PURE) != 0) { @@ -8790,7 +8890,7 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code) external state. */ DECL_PURE_P (t) = 1; TREE_NOTHROW (t) = 1; - attr_string = ", pure"; + attr_string = "= pure"; } else if ((classify & RS6000_BTC_FP) != 0) { @@ -8804,12 +8904,12 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code) { DECL_PURE_P (t) = 1; DECL_IS_NOVOPS (t) = 1; - attr_string = ", fp, pure"; + attr_string = "= fp, pure"; } else { TREE_READONLY (t) = 1; - attr_string = ", fp, const"; + attr_string = "= fp, const"; } }
[PATCH 51/55] rs6000: Miscellaneous uses of rs6000_builtin_decls_x
2021-03-05  Bill Schmidt

gcc/
        * config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Use
        rs6000_builtin_decls_x when appropriate.
        (add_condition_to_bb): Likewise.
        (rs6000_atomic_assign_expand_fenv): Likewise.
---
 gcc/config/rs6000/rs6000.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9179c73f43c..db6a65a7917 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -22758,12 +22758,16 @@ rs6000_builtin_reciprocal (tree fndecl)
       if (!RS6000_RECIP_AUTO_RSQRTE_P (V2DFmode))
         return NULL_TREE;

+      if (new_builtins_are_live)
+        return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_2DF];
       return rs6000_builtin_decls[VSX_BUILTIN_RSQRT_2DF];

     case VSX_BUILTIN_XVSQRTSP:
       if (!RS6000_RECIP_AUTO_RSQRTE_P (V4SFmode))
         return NULL_TREE;

+      if (new_builtins_are_live)
+        return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_4SF];
       return rs6000_builtin_decls[VSX_BUILTIN_RSQRT_4SF];

     default:
@@ -25352,7 +25356,10 @@ add_condition_to_bb (tree function_decl, tree version_decl,

   tree bool_zero = build_int_cst (bool_int_type_node, 0);
   tree cond_var = create_tmp_var (bool_int_type_node);
-  tree predicate_decl = rs6000_builtin_decls [(int) RS6000_BUILTIN_CPU_SUPPORTS];
+  tree predicate_decl
+    = (new_builtins_are_live
+       ? rs6000_builtin_decls_x[(int) RS6000_BIF_CPU_SUPPORTS]
+       : rs6000_builtin_decls [(int) RS6000_BUILTIN_CPU_SUPPORTS]);
   const char *arg_str = rs6000_clone_map[clone_isa].name;
   tree predicate_arg = build_string_literal (strlen (arg_str) + 1, arg_str);
   gimple *call_cond_stmt = gimple_build_call (predicate_decl, 1, predicate_arg);
@@ -27577,8 +27584,14 @@ rs6000_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
       return;
     }

-  tree mffs = rs6000_builtin_decls[RS6000_BUILTIN_MFFS];
-  tree mtfsf = rs6000_builtin_decls[RS6000_BUILTIN_MTFSF];
+  tree mffs
+    = (new_builtins_are_live
+       ? rs6000_builtin_decls_x[RS6000_BIF_MFFS]
+       : rs6000_builtin_decls[RS6000_BUILTIN_MFFS]);
+  tree mtfsf
+    = (new_builtins_are_live
+       ? rs6000_builtin_decls_x[RS6000_BIF_MTFSF]
+       : rs6000_builtin_decls[RS6000_BUILTIN_MTFSF]);
   tree call_mffs = build_call_expr (mffs, 0);

   /* Generates the equivalent of feholdexcept (&fenv_var)
--
2.27.0
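The pattern repeated in this patch, choosing between two decl tables that are indexed by different enumerations depending on a global flag, can be sketched in isolation as follows. Everything here is hypothetical: the toy_ enums, the tables and toy_pick_decl are invented for illustration and are not part of the rs6000 backend.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins: strings play the role of `tree' decls, and the two
// enums play the role of the old and new builtin namespaces.
typedef const char *toy_decl;

enum toy_old_code { TOY_OLD_MFFS, TOY_OLD_MTFSF, TOY_OLD_MAX };
enum toy_new_code { TOY_NEW_MTFSF, TOY_NEW_MFFS, TOY_NEW_MAX };

static toy_decl toy_old_decls[TOY_OLD_MAX] = { "old mffs", "old mtfsf" };
static toy_decl toy_new_decls[TOY_NEW_MAX] = { "new mtfsf", "new mffs" };

// Select the decl from the new table when NEW_LIVE is set, otherwise
// from the old one.  Each table is indexed with its own enumeration,
// so both codes have to be supplied by the caller.
inline toy_decl
toy_pick_decl (bool new_live, toy_new_code new_code, toy_old_code old_code)
{
  assert (new_code < TOY_NEW_MAX && old_code < TOY_OLD_MAX);
  return new_live ? toy_new_decls[new_code] : toy_old_decls[old_code];
}
```

Keeping both codes at every call site makes the transition explicit: once the old table is removed, each use can be reduced to a plain lookup in the new table.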
[PATCH 42/55] rs6000: Handle gimple folding of target built-ins
This is another patch that looks bigger than it really is.  Because we
have a new namespace for the builtins, allowing us to have both the old
and new builtin infrastructure supported at once, we need versions of
these functions that use the new builtin namespace.  Otherwise the code is
unchanged.

2021-06-07  Bill Schmidt

gcc/
        * config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
        New forward decl.
        (rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
        (rs6000_new_builtin_valid_without_lhs): New function.
        (rs6000_gimple_fold_new_mma_builtin): Likewise.
        (rs6000_gimple_fold_new_builtin): Likewise.
---
 gcc/config/rs6000/rs6000-call.c | 1152 +++
 1 file changed, 1152 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 8f6b6b462f8..1bb9f1c255d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, machine_mode,
 static void rs6000_common_init_builtins (void);
 static void htm_init_builtins (void);
 static void mma_init_builtins (void);
+static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);


 /* Hash table to keep track of the argument types for builtin functions.  */
@@ -11855,6 +11856,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
 bool
 rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 {
+  if (new_builtins_are_live)
+    return rs6000_gimple_fold_new_builtin (gsi);
+
   gimple *stmt = gsi_stmt (*gsi);
   tree fndecl = gimple_call_fndecl (stmt);
   gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
@@ -12794,6 +12798,35 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return false;
 }

+/* Helper function to sort out which built-ins may be valid without having
+   a LHS.  */
+static bool
+rs6000_new_builtin_valid_without_lhs (enum rs6000_gen_builtins fn_code,
+                                      tree fndecl)
+{
+  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
+    return true;
+
+  switch (fn_code)
+    {
+    case RS6000_BIF_STVX_V16QI:
+    case RS6000_BIF_STVX_V8HI:
+    case RS6000_BIF_STVX_V4SI:
+    case RS6000_BIF_STVX_V4SF:
+    case RS6000_BIF_STVX_V2DI:
+    case RS6000_BIF_STVX_V2DF:
+    case RS6000_BIF_STXVW4X_V16QI:
+    case RS6000_BIF_STXVW4X_V8HI:
+    case RS6000_BIF_STXVW4X_V4SF:
+    case RS6000_BIF_STXVW4X_V4SI:
+    case RS6000_BIF_STXVD2X_V2DF:
+    case RS6000_BIF_STXVD2X_V2DI:
+      return true;
+    default:
+      return false;
+    }
+}
+
 /* Check whether a builtin function is supported in this target
    configuration.  */
 bool
@@ -12885,6 +12918,1125 @@ rs6000_new_builtin_is_supported_p (enum rs6000_gen_builtins fncode)
   return true;
 }

+/* Expand the MMA built-ins early, so that we can convert the pass-by-reference
+   __vector_quad arguments into pass-by-value arguments, leading to more
+   efficient code generation.  */
+static bool
+rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
+                                    rs6000_gen_builtins fn_code)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  size_t fncode = (size_t) fn_code;
+
+  if (!bif_is_mma (rs6000_builtin_info_x[fncode]))
+    return false;
+
+  /* Each call that can be gimple-expanded has an associated built-in
+     function that it will expand into.  If this one doesn't, we have
+     already expanded it!  */
+  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
+    return false;
+
+  bifdata *bd = &rs6000_builtin_info_x[fncode];
+  unsigned nopnds = bd->nargs;
+  gimple_seq new_seq = NULL;
+  gimple *new_call;
+  tree new_decl;
+
+  /* Compatibility built-ins; we used to call these
+     __builtin_mma_{dis,}assemble_pair, but now we call them
+     __builtin_vsx_{dis,}assemble_pair.  Handle the old verions.  */
+  if (fncode == RS6000_BIF_ASSEMBLE_PAIR)
+    fncode = RS6000_BIF_ASSEMBLE_PAIR_V;
+  else if (fncode == RS6000_BIF_DISASSEMBLE_PAIR)
+    fncode = RS6000_BIF_DISASSEMBLE_PAIR_V;
+
+  if (fncode == RS6000_BIF_DISASSEMBLE_ACC
+      || fncode == RS6000_BIF_DISASSEMBLE_PAIR_V)
+    {
+      /* This is an MMA disassemble built-in function.  */
+      push_gimplify_context (true);
+      unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
+      tree dst_ptr = gimple_call_arg (stmt, 0);
+      tree src_ptr = gimple_call_arg (stmt, 1);
+      tree src_type = TREE_TYPE (src_ptr);
+      tree src = make_ssa_name (TREE_TYPE (src_type));
+      gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
+
+      /* If we are not disassembling an accumulator/pair or our destination is
+         another accumulator/pair, then just copy the entire thing as is.  */
+      if ((fncode == RS6000_BIF_DISASSEMBLE_ACC
+           && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_quad_type_node)
+          || (fncode == RS6000_BIF_DISASSEMBLE_PAIR_V
[PATCH 49/55] rs6000: Builtin expansion, part 6
2021-03-24 Bill Schmidt gcc/ * config/rs6000/rs6000-call.c (new_htm_spr_num): New function. (new_htm_expand_builtin): Implement. (rs6000_expand_new_builtin): Handle 32-bit and endian cases. --- gcc/config/rs6000/rs6000-call.c | 202 1 file changed, 202 insertions(+) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index f4b0c00aab4..53e51b17ab3 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -14911,11 +14911,171 @@ new_mma_expand_builtin (tree exp, rtx target, insn_code icode) return target; } +/* Return the appropriate SPR number associated with the given builtin. */ +static inline HOST_WIDE_INT +new_htm_spr_num (enum rs6000_gen_builtins code) +{ + if (code == RS6000_BIF_GET_TFHAR + || code == RS6000_BIF_SET_TFHAR) +return TFHAR_SPR; + else if (code == RS6000_BIF_GET_TFIAR + || code == RS6000_BIF_SET_TFIAR) +return TFIAR_SPR; + else if (code == RS6000_BIF_GET_TEXASR + || code == RS6000_BIF_SET_TEXASR) +return TEXASR_SPR; + gcc_assert (code == RS6000_BIF_GET_TEXASRU + || code == RS6000_BIF_SET_TEXASRU); + return TEXASRU_SPR; +} + /* Expand the HTM builtin in EXP and store the result in TARGET. */ static rtx new_htm_expand_builtin (bifdata *bifaddr, rs6000_gen_builtins fcode, tree exp, rtx target) { + tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); + bool nonvoid = TREE_TYPE (TREE_TYPE (fndecl)) != void_type_node; + + if (!TARGET_POWERPC64 + && (fcode == RS6000_BIF_TABORTDC + || fcode == RS6000_BIF_TABORTDCI)) +{ + error ("builtin %qs is only valid in 64-bit mode", bifaddr->bifname); + return const0_rtx; +} + + rtx op[MAX_HTM_OPERANDS], pat; + int nopnds = 0; + tree arg; + call_expr_arg_iterator iter; + insn_code icode = bifaddr->icode; + bool uses_spr = bif_is_htmspr (*bifaddr); + rtx cr = NULL_RTX; + + if (uses_spr) +icode = rs6000_htm_spr_icode (nonvoid); + const insn_operand_data *insn_op = _data[icode].operand[0]; + + if (nonvoid) +{ + machine_mode tmode = (uses_spr) ? 
insn_op->mode : E_SImode; + if (!target + || GET_MODE (target) != tmode + || (uses_spr && !(*insn_op->predicate) (target, tmode))) + target = gen_reg_rtx (tmode); + if (uses_spr) + op[nopnds++] = target; +} + + FOR_EACH_CALL_EXPR_ARG (arg, iter, exp) +{ + if (arg == error_mark_node || nopnds >= MAX_HTM_OPERANDS) + return const0_rtx; + + insn_op = _data[icode].operand[nopnds]; + op[nopnds] = expand_normal (arg); + + if (!(*insn_op->predicate) (op[nopnds], insn_op->mode)) + { + if (!strcmp (insn_op->constraint, "n")) + { + int arg_num = (nonvoid) ? nopnds : nopnds + 1; + if (!CONST_INT_P (op[nopnds])) + error ("argument %d must be an unsigned literal", arg_num); + else + error ("argument %d is an unsigned literal that is " + "out of range", arg_num); + return const0_rtx; + } + op[nopnds] = copy_to_mode_reg (insn_op->mode, op[nopnds]); + } + + nopnds++; +} + + /* Handle the builtins for extended mnemonics. These accept + no arguments, but map to builtins that take arguments. */ + switch (fcode) +{ +case RS6000_BIF_TENDALL: /* Alias for: tend. 1 */ +case RS6000_BIF_TRESUME: /* Alias for: tsr. 1 */ + op[nopnds++] = GEN_INT (1); + break; +case RS6000_BIF_TSUSPEND: /* Alias for: tsr. 0 */ + op[nopnds++] = GEN_INT (0); + break; +default: + break; +} + + /* If this builtin accesses SPRs, then pass in the appropriate + SPR number and SPR regno as the last two operands. */ + if (uses_spr) +{ + machine_mode mode = (TARGET_POWERPC64) ? DImode : SImode; + op[nopnds++] = gen_rtx_CONST_INT (mode, new_htm_spr_num (fcode)); +} + /* If this builtin accesses a CR, then pass in a scratch + CR as the last operand. 
*/ + else if (bif_is_htmcr (*bifaddr)) +{ + cr = gen_reg_rtx (CCmode); + op[nopnds++] = cr; +} + + switch (nopnds) +{ +case 1: + pat = GEN_FCN (icode) (op[0]); + break; +case 2: + pat = GEN_FCN (icode) (op[0], op[1]); + break; +case 3: + pat = GEN_FCN (icode) (op[0], op[1], op[2]); + break; +case 4: + pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3]); + break; +default: + gcc_unreachable (); +} + if (!pat) +return NULL_RTX; + emit_insn (pat); + + if (bif_is_htmcr (*bifaddr)) +{ + if (fcode == RS6000_BIF_TBEGIN) + { + /* Emit code to set TARGET to true or false depending on +whether the tbegin. instruction successfully or failed +to start a transaction. We do this by placing the 1's +complement of CR's
[PATCH 50/55] rs6000: Update rs6000_builtin_decl
2021-03-05  Bill Schmidt

gcc/
	* config/rs6000/rs6000-call.c (rs6000_new_builtin_decl): New function.
	(rs6000_builtin_decl): Call it.
---
 gcc/config/rs6000/rs6000-call.c | 20
 1 file changed, 20 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 53e51b17ab3..fc61bbc2af5 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -16095,11 +16095,31 @@ rs6000_init_builtins (void)
     }
 }
 
+static tree
+rs6000_new_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
+{
+  rs6000_gen_builtins fcode = (rs6000_gen_builtins) code;
+
+  if (fcode >= RS6000_OVLD_MAX)
+    return error_mark_node;
+
+  if (!rs6000_new_builtin_is_supported_p (fcode))
+    {
+      rs6000_invalid_new_builtin (fcode);
+      return error_mark_node;
+    }
+
+  return rs6000_builtin_decls_x[code];
+}
+
 /* Returns the rs6000 builtin decl for CODE.  */
 
 tree
 rs6000_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
 {
+  if (new_builtins_are_live)
+    return rs6000_new_builtin_decl (code, initialize_p);
+
   HOST_WIDE_INT fnmask;
 
   if (code >= RS6000_BUILTIN_COUNT)
--
2.27.0
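For readers following the series, here is a minimal standalone sketch of the lookup order used by rs6000_new_builtin_decl: range check first, then the target-support check, then the table lookup. Everything prefixed demo_ or DEMO_ is hypothetical and is not part of GCC. Strings stand in for tree decls and NULL stands in for error_mark_node. The real function also emits a diagnostic through rs6000_invalid_new_builtin before returning.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the generated builtin enumeration.  */
enum demo_builtin { DEMO_BIF_ADD, DEMO_BIF_MMA, DEMO_BIF_MAX };

/* Hypothetical decl table; strings play the role of tree decls.  */
static const char *const demo_decls[DEMO_BIF_MAX] = { "add_decl", "mma_decl" };

/* Hypothetical target flag, playing the role of an option such as -mmma.  */
bool demo_mma_enabled = false;

/* Stand-in for the support predicate: only the MMA builtin is gated.  */
bool
demo_builtin_is_supported_p (enum demo_builtin code)
{
  return code != DEMO_BIF_MMA || demo_mma_enabled;
}

/* Return the decl for CODE, or NULL (standing in for error_mark_node)
   when CODE is out of range or not supported in this configuration.  */
const char *
demo_builtin_decl (unsigned code)
{
  if (code >= DEMO_BIF_MAX)
    return NULL;

  if (!demo_builtin_is_supported_p ((enum demo_builtin) code))
    return NULL;

  return demo_decls[code];
}
```

Doing the range check before anything else keeps the support predicate and the table lookup from ever seeing an out-of-range index.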
[PATCH 48/55] rs6000: Builtin expansion, part 5
2021-03-25  Bill Schmidt

gcc/
	* config/rs6000/rs6000-call.c (new_mma_expand_builtin): Implement.
---
 gcc/config/rs6000/rs6000-call.c | 92 +
 1 file changed, 92 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 00fd4bb95ab..f4b0c00aab4 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14816,6 +14816,98 @@ stv_expand_builtin (insn_code icode, rtx *op,
 static rtx
 new_mma_expand_builtin (tree exp, rtx target, insn_code icode)
 {
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  tree arg;
+  call_expr_arg_iterator iter;
+  const struct insn_operand_data *insn_op;
+  rtx op[MAX_MMA_OPERANDS];
+  unsigned nopnds = 0;
+  bool void_func = TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node;
+  machine_mode tmode = VOIDmode;
+
+  if (!void_func)
+    {
+      tmode = insn_data[icode].operand[0].mode;
+      if (!target
+          || GET_MODE (target) != tmode
+          || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+        target = gen_reg_rtx (tmode);
+      op[nopnds++] = target;
+    }
+  else
+    target = const0_rtx;
+
+  FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
+    {
+      if (arg == error_mark_node)
+        return const0_rtx;
+
+      rtx opnd;
+      insn_op = &insn_data[icode].operand[nopnds];
+      if (TREE_CODE (arg) == ADDR_EXPR
+          && MEM_P (DECL_RTL (TREE_OPERAND (arg, 0))))
+        opnd = DECL_RTL (TREE_OPERAND (arg, 0));
+      else
+        opnd = expand_normal (arg);
+
+      if (!(*insn_op->predicate) (opnd, insn_op->mode))
+        {
+          if (!strcmp (insn_op->constraint, "n"))
+            {
+              if (!CONST_INT_P (opnd))
+                error ("argument %d must be an unsigned literal", nopnds);
+              else
+                error ("argument %d is an unsigned literal that is "
+                       "out of range", nopnds);
+              return const0_rtx;
+            }
+          opnd = copy_to_mode_reg (insn_op->mode, opnd);
+        }
+
+      /* Some MMA instructions have INOUT accumulator operands, so force
+         their target register to be the same as their input register.  */
+      if (!void_func
+          && nopnds == 1
+          && !strcmp (insn_op->constraint, "0")
+          && insn_op->mode == tmode
+          && REG_P (opnd)
+          && (*insn_data[icode].operand[0].predicate) (opnd, tmode))
+        target = op[0] = opnd;
+
+      op[nopnds++] = opnd;
+    }
+
+  rtx pat;
+  switch (nopnds)
+    {
+    case 1:
+      pat = GEN_FCN (icode) (op[0]);
+      break;
+    case 2:
+      pat = GEN_FCN (icode) (op[0], op[1]);
+      break;
+    case 3:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2]);
+      break;
+    case 4:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3]);
+      break;
+    case 5:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4]);
+      break;
+    case 6:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5]);
+      break;
+    case 7:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5], op[6]);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  if (!pat)
+    return NULL_RTX;
+  emit_insn (pat);
+
   return target;
 }
 
--
2.27.0
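The INOUT handling in new_mma_expand_builtin can be illustrated in isolation. The sketch below is a simplified model, not GCC code. Registers are plain ints, struct demo_operand and demo_collect_operands are hypothetical names, and the mode, REG_P and predicate checks from the patch are left out. It shows only the effect of a "0" constraint on operand 1: the result register is replaced by the input register, so both operand slots name the same register.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified operand description: only the constraint
   string of the real insn_operand_data is modelled.  */
struct demo_operand
{
  const char *constraint;
};

/* Collect operands for a pattern described by SPEC.  Registers are
   modelled as plain ints.  Returns the operand count and stores in
   *RESULT the register that ends up holding the result, or -1 for a
   builtin without a result.  OP must have room for NARGS + 1 entries.  */
int
demo_collect_operands (const struct demo_operand *spec, bool has_result,
                       int fresh_reg, const int *args, int nargs,
                       int *op, int *result)
{
  int nopnds = 0;

  *result = -1;
  if (has_result)
    {
      *result = fresh_reg;
      op[nopnds++] = fresh_reg;
    }

  for (int i = 0; i < nargs; i++)
    {
      /* A "0" constraint ties this input to operand 0, so reuse the
         input register as the result register.  */
      if (has_result && nopnds == 1
          && strcmp (spec[nopnds].constraint, "0") == 0)
        *result = op[0] = args[i];

      op[nopnds++] = args[i];
    }

  return nopnds;
}
```

With a tied specification, a fresh result register of 100 and arguments 7 and 8, the operand array becomes 7, 7, 8 and the result register is 7. Without the tie it stays 100, 7, 8.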
[PATCH 47/55] rs6000: Builtin expansion, part 4
2021-03-05 Bill Schmidt gcc/ * config/rs6000/rs6000-call.c (elemrev_icode): Implement. (ldv_expand_builtin): Likewise. (lxvrse_expand_builtin): Likewise. (lxvrze_expand_builtin): Likewise. (stv_expand_builtin): Likewise. --- gcc/config/rs6000/rs6000-call.c | 217 1 file changed, 217 insertions(+) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index c1c936f62b7..00fd4bb95ab 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -14565,12 +14565,114 @@ new_cpu_expand_builtin (enum rs6000_gen_builtins fcode, static insn_code elemrev_icode (rs6000_gen_builtins fcode) { + switch (fcode) +{ +default: + gcc_unreachable (); +case RS6000_BIF_ST_ELEMREV_V1TI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v1ti + : CODE_FOR_vsx_st_elemrev_v1ti); +case RS6000_BIF_ST_ELEMREV_V2DF: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v2df + : CODE_FOR_vsx_st_elemrev_v2df); +case RS6000_BIF_ST_ELEMREV_V2DI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v2di + : CODE_FOR_vsx_st_elemrev_v2di); +case RS6000_BIF_ST_ELEMREV_V4SF: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v4sf + : CODE_FOR_vsx_st_elemrev_v4sf); +case RS6000_BIF_ST_ELEMREV_V4SI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v4si + : CODE_FOR_vsx_st_elemrev_v4si); +case RS6000_BIF_ST_ELEMREV_V8HI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v8hi + : CODE_FOR_vsx_st_elemrev_v8hi); +case RS6000_BIF_ST_ELEMREV_V16QI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v16qi + : CODE_FOR_vsx_st_elemrev_v16qi); +case RS6000_BIF_LD_ELEMREV_V2DF: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v2df + : CODE_FOR_vsx_ld_elemrev_v2df); +case RS6000_BIF_LD_ELEMREV_V1TI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v1ti + : CODE_FOR_vsx_ld_elemrev_v1ti); +case RS6000_BIF_LD_ELEMREV_V2DI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v2di + : CODE_FOR_vsx_ld_elemrev_v2di); +case RS6000_BIF_LD_ELEMREV_V4SF: + return (BYTES_BIG_ENDIAN ? 
CODE_FOR_vsx_load_v4sf + : CODE_FOR_vsx_ld_elemrev_v4sf); +case RS6000_BIF_LD_ELEMREV_V4SI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v4si + : CODE_FOR_vsx_ld_elemrev_v4si); +case RS6000_BIF_LD_ELEMREV_V8HI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v8hi + : CODE_FOR_vsx_ld_elemrev_v8hi); +case RS6000_BIF_LD_ELEMREV_V16QI: + return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v16qi + : CODE_FOR_vsx_ld_elemrev_v16qi); +} + gcc_unreachable (); return (insn_code) 0; } static rtx ldv_expand_builtin (rtx target, insn_code icode, rtx *op, machine_mode tmode) { + rtx pat, addr; + bool blk = (icode == CODE_FOR_altivec_lvlx + || icode == CODE_FOR_altivec_lvlxl + || icode == CODE_FOR_altivec_lvrx + || icode == CODE_FOR_altivec_lvrxl); + + if (target == 0 + || GET_MODE (target) != tmode + || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) +target = gen_reg_rtx (tmode); + + op[1] = copy_to_mode_reg (Pmode, op[1]); + + /* For LVX, express the RTL accurately by ANDing the address with -16. + LVXL and LVE*X expand to use UNSPECs to hide their special behavior, + so the raw address is fine. */ + if (icode == CODE_FOR_altivec_lvx_v1ti + || icode == CODE_FOR_altivec_lvx_v2df + || icode == CODE_FOR_altivec_lvx_v2di + || icode == CODE_FOR_altivec_lvx_v4sf + || icode == CODE_FOR_altivec_lvx_v4si + || icode == CODE_FOR_altivec_lvx_v8hi + || icode == CODE_FOR_altivec_lvx_v16qi) +{ + rtx rawaddr; + if (op[0] == const0_rtx) + rawaddr = op[1]; + else + { + op[0] = copy_to_mode_reg (Pmode, op[0]); + rawaddr = gen_rtx_PLUS (Pmode, op[1], op[0]); + } + addr = gen_rtx_AND (Pmode, rawaddr, gen_rtx_CONST_INT (Pmode, -16)); + addr = gen_rtx_MEM (blk ? BLKmode : tmode, addr); + + emit_insn (gen_rtx_SET (target, addr)); +} + else +{ + if (op[0] == const0_rtx) + addr = gen_rtx_MEM (blk ? BLKmode : tmode, op[1]); + else + { + op[0] = copy_to_mode_reg (Pmode, op[0]); + addr = gen_rtx_MEM (blk ? 
BLKmode : tmode, + gen_rtx_PLUS (Pmode, op[1], op[0])); + } + + pat = GEN_FCN (icode) (target, addr); + if (! pat) + return 0; + emit_insn (pat); +} + return target; } @@ -14578,6 +14680,42 @@ static rtx lxvrse_expand_builtin (rtx target, insn_code icode, rtx *op, machine_mode tmode, machine_mode smode) { + rtx pat, addr; + op[1] = copy_to_mode_reg (Pmode, op[1]); + + if (op[0] == const0_rtx) +addr =
[PATCH 46/55] rs6000: Builtin expansion, part 3
2021-03-05 Bill Schmidt gcc/ * config/rs6000/rs6000-call.c (new_cpu_expand_builtin): Implement. --- gcc/config/rs6000/rs6000-call.c | 100 1 file changed, 100 insertions(+) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index dd24e808c97..c1c936f62b7 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -14459,6 +14459,106 @@ static rtx new_cpu_expand_builtin (enum rs6000_gen_builtins fcode, tree exp ATTRIBUTE_UNUSED, rtx target) { + /* __builtin_cpu_init () is a nop, so expand to nothing. */ + if (fcode == RS6000_BIF_CPU_INIT) +return const0_rtx; + + if (target == 0 || GET_MODE (target) != SImode) +target = gen_reg_rtx (SImode); + +#ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB + tree arg = TREE_OPERAND (CALL_EXPR_ARG (exp, 0), 0); + /* Target clones creates an ARRAY_REF instead of STRING_CST, convert it back + to a STRING_CST. */ + if (TREE_CODE (arg) == ARRAY_REF + && TREE_CODE (TREE_OPERAND (arg, 0)) == STRING_CST + && TREE_CODE (TREE_OPERAND (arg, 1)) == INTEGER_CST + && compare_tree_int (TREE_OPERAND (arg, 1), 0) == 0) +arg = TREE_OPERAND (arg, 0); + + if (TREE_CODE (arg) != STRING_CST) +{ + error ("builtin %qs only accepts a string argument", +rs6000_builtin_info_x[(size_t) fcode].bifname); + return const0_rtx; +} + + if (fcode == RS6000_BIF_CPU_IS) +{ + const char *cpu = TREE_STRING_POINTER (arg); + rtx cpuid = NULL_RTX; + for (size_t i = 0; i < ARRAY_SIZE (cpu_is_info); i++) + if (strcmp (cpu, cpu_is_info[i].cpu) == 0) + { + /* The CPUID value in the TCB is offset by _DL_FIRST_PLATFORM. */ + cpuid = GEN_INT (cpu_is_info[i].cpuid + _DL_FIRST_PLATFORM); + break; + } + if (cpuid == NULL_RTX) + { + /* Invalid CPU argument. 
*/ + error ("cpu %qs is an invalid argument to builtin %qs", +cpu, rs6000_builtin_info_x[(size_t) fcode].bifname); + return const0_rtx; + } + + rtx platform = gen_reg_rtx (SImode); + rtx tcbmem = gen_const_mem (SImode, + gen_rtx_PLUS (Pmode, + gen_rtx_REG (Pmode, TLS_REGNUM), + GEN_INT (TCB_PLATFORM_OFFSET))); + emit_move_insn (platform, tcbmem); + emit_insn (gen_eqsi3 (target, platform, cpuid)); +} + else if (fcode == RS6000_BIF_CPU_SUPPORTS) +{ + const char *hwcap = TREE_STRING_POINTER (arg); + rtx mask = NULL_RTX; + int hwcap_offset; + for (size_t i = 0; i < ARRAY_SIZE (cpu_supports_info); i++) + if (strcmp (hwcap, cpu_supports_info[i].hwcap) == 0) + { + mask = GEN_INT (cpu_supports_info[i].mask); + hwcap_offset = TCB_HWCAP_OFFSET (cpu_supports_info[i].id); + break; + } + if (mask == NULL_RTX) + { + /* Invalid HWCAP argument. */ + error ("%s %qs is an invalid argument to builtin %qs", +"hwcap", hwcap, +rs6000_builtin_info_x[(size_t) fcode].bifname); + return const0_rtx; + } + + rtx tcb_hwcap = gen_reg_rtx (SImode); + rtx tcbmem = gen_const_mem (SImode, + gen_rtx_PLUS (Pmode, + gen_rtx_REG (Pmode, TLS_REGNUM), + GEN_INT (hwcap_offset))); + emit_move_insn (tcb_hwcap, tcbmem); + rtx scratch1 = gen_reg_rtx (SImode); + emit_insn (gen_rtx_SET (scratch1, gen_rtx_AND (SImode, tcb_hwcap, mask))); + rtx scratch2 = gen_reg_rtx (SImode); + emit_insn (gen_eqsi3 (scratch2, scratch1, const0_rtx)); + emit_insn (gen_rtx_SET (target, gen_rtx_XOR (SImode, scratch2, const1_rtx))); +} + else +gcc_unreachable (); + + /* Record that we have expanded a CPU builtin, so that we can later + emit a reference to the special symbol exported by LIBC to ensure we + do not link against an old LIBC that doesn't support this feature. */ + cpu_builtin_p = true; + +#else + warning (0, "builtin %qs needs GLIBC (2.23 and newer) that exports hardware " + "capability bits", rs6000_builtin_info_x[(size_t) fcode].bifname); + + /* For old LIBCs, always return FALSE. 
*/ + emit_move_insn (target, GEN_INT (0)); +#endif /* TARGET_LIBC_PROVIDES_HWCAP_IN_TCB */ + return target; } -- 2.27.0
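As a reading aid, the value computed by the RTL emitted for the two CPU builtins can be written in plain C. This is a sketch under assumptions: the function names are hypothetical, the TCB words are passed in as arguments rather than loaded from the thread pointer, and first_platform stands in for _DL_FIRST_PLATFORM. The arithmetic follows the patch: AND with the mask, compare with zero, then XOR with 1 to turn "is zero" into "bit is set".

```c
#include <stdint.h>

/* Mirrors the sequence emitted for the "supports" case:
   scratch1 = hwcap & mask; scratch2 = (scratch1 == 0);
   result = scratch2 ^ 1.  */
uint32_t
demo_cpu_supports (uint32_t tcb_hwcap, uint32_t mask)
{
  uint32_t scratch1 = tcb_hwcap & mask;
  uint32_t scratch2 = (scratch1 == 0);
  return scratch2 ^ 1u;
}

/* Mirrors the "is" case: the platform word read from the TCB is
   compared against the table CPUID plus the platform offset.  */
uint32_t
demo_cpu_is (uint32_t tcb_platform, uint32_t cpuid, uint32_t first_platform)
{
  return tcb_platform == cpuid + first_platform;
}
```

Both helpers return exactly 0 or 1, matching the SImode truth value the expander places in TARGET.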
[PATCH 45/55] rs6000: Builtin expansion, part 2
2021-03-05 Bill Schmidt gcc/ * config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin): Implement. (rs6000_expand_ldst_mask): Likewise. (rs6000_init_builtins): Initialize altivec_builtin_mask_for_load. --- gcc/config/rs6000/rs6000-call.c | 101 +++- 1 file changed, 100 insertions(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 9493beca0ae..dd24e808c97 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -11534,6 +11534,75 @@ rs6000_invalid_builtin (enum rs6000_builtins fncode) static void rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode) { + size_t uns_fncode = (size_t) fncode; + const char *name = rs6000_builtin_info_x[uns_fncode].bifname; + + switch (rs6000_builtin_info_x[uns_fncode].enable) +{ +case ENB_P5: + error ("%qs requires the %qs option", name, "-mcpu=power5"); + break; +case ENB_P6: + error ("%qs requires the %qs option", name, "-mcpu=power6"); + break; +case ENB_ALTIVEC: + error ("%qs requires the %qs option", name, "-maltivec"); + break; +case ENB_CELL: + error ("%qs is only valid for the cell processor", name); + break; +case ENB_VSX: + error ("%qs requires the %qs option", name, "-mvsx"); + break; +case ENB_P7: + error ("%qs requires the %qs option", name, "-mcpu=power7"); + break; +case ENB_P7_64: + error ("%qs requires the %qs option and either the %qs or %qs option", +name, "-mcpu=power7", "-m64", "-mpowerpc64"); + break; +case ENB_P8: + error ("%qs requires the %qs option", name, "-mcpu=power8"); + break; +case ENB_P8V: + error ("%qs requires the %qs option", name, "-mpower8-vector"); + break; +case ENB_P9: + error ("%qs requires the %qs option", name, "-mcpu=power9"); + break; +case ENB_P9_64: + error ("%qs requires the %qs option and either the %qs or %qs option", +name, "-mcpu=power9", "-m64", "-mpowerpc64"); + break; +case ENB_P9V: + error ("%qs requires the %qs option", name, "-mpower9-vector"); + break; +case ENB_IEEE128_HW: + error ("%qs 
requires ISA 3.0 IEEE 128-bit floating point", name); + break; +case ENB_DFP: + error ("%qs requires the %qs option", name, "-mhard-dfp"); + break; +case ENB_CRYPTO: + error ("%qs requires the %qs option", name, "-mcrypto"); + break; +case ENB_HTM: + error ("%qs requires the %qs option", name, "-mhtm"); + break; +case ENB_P10: + error ("%qs requires the %qs option", name, "-mcpu=power10"); + break; +case ENB_P10_64: + error ("%qs requires the %qs option and either the %qs or %qs option", +name, "-mcpu=power10", "-m64", "-mpowerpc64"); + break; +case ENB_MMA: + error ("%qs requires the %qs option", name, "-mmma"); + break; +default: +case ENB_ALWAYS: + gcc_unreachable (); +}; } /* Target hook for early folding of built-ins, shamelessly stolen @@ -14356,7 +14425,33 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, rtx rs6000_expand_ldst_mask (rtx target, tree arg0) { - return target; + int icode2 = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr_direct + : (int) CODE_FOR_altivec_lvsl_direct); + machine_mode tmode = insn_data[icode2].operand[0].mode; + machine_mode mode = insn_data[icode2].operand[1].mode; + rtx op, addr, pat; + + gcc_assert (TARGET_ALTIVEC); + + gcc_assert (POINTER_TYPE_P (TREE_TYPE (arg0))); + op = expand_expr (arg0, NULL_RTX, Pmode, EXPAND_NORMAL); + addr = memory_address (mode, op); + /* We need to negate the address. */ + op = gen_reg_rtx (GET_MODE (addr)); + emit_insn (gen_rtx_SET (op, gen_rtx_NEG (GET_MODE (addr), addr))); + op = gen_rtx_MEM (mode, op); + + if (target == 0 + || GET_MODE (target) != tmode + || ! (*insn_data[icode2].operand[0].predicate) (target, tmode)) +target = gen_reg_rtx (tmode); + + pat = GEN_FCN (icode2) (target, op); + if (!pat) +return 0; + emit_insn (pat); + + return target; } /* Expand the CPU builtin in FCODE and store the result in TARGET. */ @@ -15249,6 +15344,10 @@ rs6000_init_builtins (void) /* Execute the autogenerated initialization code for builtins. 
*/ rs6000_autoinit_builtins (); + if (new_builtins_are_live) +altivec_builtin_mask_for_load + = rs6000_builtin_decls_x[RS6000_BIF_MASK_FOR_LOAD]; + if (new_builtins_are_live) { #ifdef SUBTARGET_INIT_BUILTINS -- 2.27.0
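The shape of rs6000_invalid_new_builtin, one diagnostic per enablement class, can be sketched outside the compiler as follows. All names here are hypothetical and only three classes are modelled. snprintf stands in for GCC's error (), and plain single quotes stand in for the %qs quoting. Unlike the patch, which treats the always-enabled class as unreachable, the sketch returns -1 for it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of the enablement classes.  */
enum demo_enable
{
  DEMO_ENB_ALWAYS,
  DEMO_ENB_ALTIVEC,
  DEMO_ENB_VSX,
  DEMO_ENB_P9_64
};

/* Format into BUF the message explaining why builtin NAME is not
   available.  Returns the snprintf result, or -1 when the class has
   no prerequisite and so nothing needs reporting.  */
int
demo_invalid_builtin_msg (char *buf, size_t len, const char *name,
                          enum demo_enable enb)
{
  switch (enb)
    {
    case DEMO_ENB_ALTIVEC:
      return snprintf (buf, len, "'%s' requires the '%s' option",
                       name, "-maltivec");
    case DEMO_ENB_VSX:
      return snprintf (buf, len, "'%s' requires the '%s' option",
                       name, "-mvsx");
    case DEMO_ENB_P9_64:
      return snprintf (buf, len,
                       "'%s' requires the '%s' option and either the "
                       "'%s' or '%s' option",
                       name, "-mcpu=power9", "-m64", "-mpowerpc64");
    case DEMO_ENB_ALWAYS:
    default:
      return -1;
    }
}
```

Keeping the option names as separate arguments rather than embedding them in the format string follows the patch, where the same message text is shared by most cases.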