[Bug sanitizer/61955] libsanitizer fails to compile on RHEL4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61955 --- Comment #8 from Dmitry Vyukov dvyukov at google dot com --- Kostya, Alexey, Eugeniy, please land this fix to llvm tree while I am OOO.
Re: [patch, libgomp] Re-factor GOMP_MAP_POINTER handling
Ping x2. On 15/5/11 7:19 PM, Chung-Lin Tang wrote: Ping. On 2015/4/21 08:21 PM, Chung-Lin Tang wrote: Hi, while investigating some issues in the variable mapping code, I observed that the GOMP_MAP_POINTER handling is essentially duplicated under the PSET case. This patch abstracts and unifies the handling code, basically just a cleanup patch. Ran libgomp tests to ensure no regressions, ok for trunk? Thanks, Chung-Lin 2015-04-21 Chung-Lin Tang clt...@codesourcery.com libgomp/ * target.c (gomp_map_pointer): New function abstracting out GOMP_MAP_POINTER handling. (gomp_map_vars): Remove GOMP_MAP_POINTER handling code and use gomp_map_pointer().
[Bug c++/66234] New: Too much output from pragma message with g++ 4.8 and above
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66234

Bug ID: 66234
Summary: Too much output from pragma message with g++ 4.8 and above
Product: gcc
Version: 4.8.3
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: simon at newtec dot dk
Target Milestone: ---

Created attachment 35584
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35584&action=edit
Quick test, just compile with g++ filename

I'm using pragma message to output a couple of messages during compilation, but after a toolchain upgrade (from OSELAS-2012 to 2014, and thus from gcc-4.7.2 to gcc-4.9.2 - but I've also recreated this on my host with gcc-4.6.4 which is working fine, and gcc-4.8.3 which exhibits the problem), a lot of extra output is shown.

A quick test is attached. Compiling with 4.7.2 / 4.6.4 gives the file location and:

note: #pragma message: Setting builddate to: (May 21 2015 10:33:13)

- which is just what I want.

Compiling with 4.9.2 / 4.8.3 gives (file locations removed):

note: #pragma message: Setting builddate to: (May 21 2015 10:34:51)
 #pragma message "Setting builddate to: " STR(BUILDTAG)
 ^
note: in definition of macro 'STR_HELPER'
 #define STR_HELPER(x) #x
 ^
note: in expansion of macro 'STR'
 #pragma message "Setting builddate to: " STR(BUILDTAG)
 ^

The output that I want is still there, but together with a lot of extra clutter. I can hide the extra output with -ftrack-macro-expansion=0 and -fno-diagnostics-show-caret, but I suspect that that may also in some cases hide compilation messages that I *do* want to see.

As I see it, pragma message should (by default) just output the desired message and nothing else - or is it just that I'm using it the wrong way?

A related post on stackoverflow can be found here: http://stackoverflow.com/questions/30255294/how-to-hide-extra-output-from-pragma-message
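
The attachment itself is not reproduced in this message, so the following is only a minimal sketch of the kind of file that would show the behaviour described. STR and STR_HELPER are the macro names quoted in the diagnostics; the BUILDTAG value and the build_tag_string function are assumptions made up for illustration.

```c
/* Two-level stringification: STR expands its argument first, then
   STR_HELPER turns the expanded tokens into a string literal. */
#define STR_HELPER(x) #x
#define STR(x) STR_HELPER(x)

/* Hypothetical build tag; the definition used in the report is not shown. */
#ifndef BUILDTAG
#define BUILDTAG (demo-build 1)
#endif

/* Emits a note at compile time. Per the report, gcc 4.8 and later also
   print the macro expansion context for STR and STR_HELPER here. */
#pragma message "Setting builddate to: " STR(BUILDTAG)

/* Illustrative helper: returns the text the pragma prints after the prefix. */
const char *build_tag_string(void)
{
    return STR(BUILDTAG);
}
```

Compiling this once as is and once with -ftrack-macro-expansion=0 -fno-diagnostics-show-caret (the two options named in the report) should show the difference in the amount of output.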
[Bug target/66235] New: [SH] Optimize tst reg,const movrt sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66235

Bug ID: 66235
Summary: [SH] Optimize tst reg,const movrt sequence
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: olegendo at gcc dot gnu.org
Target Milestone: ---
Target: sh*-*-*

The following example:

bool test_00 (unsigned int x)
{
  return (x & 0x1042A) != 0;
}

compiled with -O2 -m4:

        mov.l   .L3,r1
        tst     r1,r4
        mov     #-1,r1
        rts
        negc    r1,r0
.L4:
        .align 2
.L3:
        .long   0x1042A

On non-SH2A there is no movrt insn and a !T -> reg move is done via negc. If the preceding tst insn uses a constant, the constant can be complemented to avoid the negc and #-1 constant load:

        mov.l   .L3,r1
        tst     r1,r4
        rts
        movt    r0
.L4:
        .align 2
.L3:
        .long   0xFFFEFBD5

The tstsi_t splitter could be extended to look for a following movrt insn and optimize it accordingly. The downside of doing this is an increased constant pool size, if the original (non-complemented) constant is used for something else. Moreover, it is only beneficial to do this if there are no other negc #-1 movrt insns which share the #-1 constant. On the other hand, sharing the #-1 constant increases the life time of regs and thus increases reg pressure.
Re: [match-and-simplify] reject expanding operator-list to implicit 'for'
On Wed, 20 May 2015, Prathamesh Kulkarni wrote: On 20 May 2015 at 18:18, Richard Biener rguent...@suse.de wrote: On Wed, 20 May 2015, Prathamesh Kulkarni wrote: On 20 May 2015 at 17:01, Richard Biener rguent...@suse.de wrote: On Wed, 20 May 2015, Prathamesh Kulkarni wrote: On 20 May 2015 at 16:17, Prathamesh Kulkarni prathamesh.kulka...@linaro.org wrote: Hi, This patch rejects expanding operator-list to implicit 'for'. On second thoughts, should we reject expansion of operator-list _only_ if it's mixed with 'for' ? At least that, yes. Well I suppose we could extend it to be mixed with 'for' ? Add the operator lists to the inner-most 'for'. eg: (define_operator_list olist ...) (for op (...) (simplify (op (olist ... would be equivalent to: (for op (...) temp (olist) (simplify (op (olist ... operator-list expansion can be said to simply a short-hand for single 'for' with number of iterators = number of operator-lists. If the operator-lists are enclosed within 'for', add them to the innermost 'for'. Yes, but I think this use is confusing as to whether the operator lists form a new for (like currently(?)) or if they append to the enclosing for. What we do currently is consistent (always create a new for) but it is confusing behavior - as you noted initially. Richard. Thanks, Prathamesh We could define multiple operator-lists in simplify to be the same as enclosing the simplify in 'for' with number of iterators equal to number of operator-lists. So we could allow (define_operator_list op1 ...) (define_operator_list op2 ...) (simplify (op1 (op2 ... ))) is equivalent to: (for temp1 (op1) temp2 (op2) (simplify (temp1 (temp2 ... I think we have patterns like these in match-builtin.pd in the match-and-simplify branch And reject mixing of 'for' and operator-lists. Admittedly the implicit 'for' behavior is not obvious from the syntax -;( Hmm, indeed we have for example /* Optimize pow(1.0,y) = 1.0. 
*/ (simplify (POW real_onep@0 @1) @0) and I remember wanting that implicit for to make those less ugly. So can you rework only rejecting it within for? This patch rejects expanding operator-list inside 'for'. OK for trunk after bootstrap+testing ? Ok. Thanks, Richard. Thanks, Prathamesh Thanks, Richard. Thanks, Prathamesh OK for trunk after bootstrap+testing ? Thanks, Prathamesh -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
[Bug target/66215] [4.8/4.9/5/6 Regression] Wrong after label NOP emission for -mhotpatch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66215

--- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org ---
No, IMHO you can have many debug insns after that and before the first real insn. I'd go for something like:

  rtx_insn *insn = get_insns ();
  if (!active_insn_p (insn))
    insn = next_active_insn (insn);

and insert before, rather than after (otherwise you don't handle the hypothetical case of an active insn being the first one). That would require rewriting the nop insertion code after it, because you want to insert the 6 byte nops first. Or just gcc_assert the first insn is not active, or if the first insn is active, emit a NOTE_INSN_DELETED note before that first active insn, emit the nops after that note and perhaps kill the note at the end.

Please test whether

  void foo (void) { __builtin_unreachable (); }

actually generates any active insns.
[Bug rtl-optimization/66207] Switch alpha to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207

--- Comment #4 from Uroš Bizjak ubizjak at gmail dot com ---
Native bootstrap with alphaev68-linux-gnu (a BWX architecture) with the patch from Comment #1 succeeded, the testresults are at [1].

Comparing to non-LRA testsuite run, there is only one new test failure in the entire testsuite:

FAIL: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions (internal compiler error)
FAIL: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions (test for excess errors)
UNRESOLVED: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions compilation failed to produce executable
FAIL: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-loops (internal compiler error)
FAIL: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-loops (test for excess errors)
UNRESOLVED: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-loops compilation failed to produce executable

Executing on host: /space/uros/gcc-build/gcc/xgcc -B/space/uros/gcc-build/gcc/ /space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O3 -fomit-frame-pointer -funroll-loops -w -lm -o ./pr42691.exe (timeout = 300)
spawn /space/uros/gcc-build/gcc/xgcc -B/space/uros/gcc-build/gcc/ /space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O3 -fomit-frame-pointer -funroll-loops -w -lm -o ./pr42691.exe
/space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c: In function 'add':
/space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c:32:1: error: unrecognizable insn:
(insn 87 86 29 5 (set (subreg:DI (reg:V4HI 90) 0)
        (reg:V4HI 94)) /space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c:19 -1
     (expr_list:REG_DEAD (reg:V4HI 94)
        (nil)))
/space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c:32:1: internal compiler error: in extract_insn, at recog.c:2341
0x1207809c7 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../gcc-svn/trunk/gcc/rtl-error.c:110
0x120780a17 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        ../../gcc-svn/trunk/gcc/rtl-error.c:118
0x120747bcf extract_insn(rtx_insn*)
        ../../gcc-svn/trunk/gcc/recog.c:2341
0x120b99d5f union_match_dups
        ../../gcc-svn/trunk/gcc/web.c:118
0x120b99d5f execute
        ../../gcc-svn/trunk/gcc/web.c:395

[1] https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg02573.html
[Bug tree-optimization/66233] [4.8/4.9/5/6 Regression] internal compiler error: in expand_fix, at optabs.c:5358
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66233 --- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org --- Sounds like gimple folding issue. We have: vect__4.9_31 = (vector(4) float) { 0, 1, 2, 3 }; vect__5.10_32 = (vector(4) unsigned int) vect__4.9_31; where the first stmt's rhs_code is FLOAT_EXPR and rhs1 is VECTOR_CST vector(4) int, and the second stmt's rhs_code is FIX_TRUNC_EXPR. So, for this combined together we should use VIEW_CONVERT_EXPR, but we use FIX_TRUNC_EXPR.
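
One way to read the comment above, sketched in scalar C rather than GIMPLE: for lane values such as 0, 1, 2, 3, which are non-negative and exactly representable in float, converting int to float and then to unsigned int gives the same result as converting int to unsigned int directly, and the latter is just a reinterpretation of the bits for same-width integers. The function names below are made up for illustration, and the equivalence is only claimed for such small non-negative values (a negative value converted through float to unsigned would be undefined).

```c
/* Two-step conversion as written in the testcase: int -> float -> unsigned. */
unsigned int convert_via_float(int i)
{
    return (unsigned int)(float)i;
}

/* Direct conversion: for same-width integers this only reinterprets the
   bits, which is what a view conversion does on the vector constant. */
unsigned int convert_direct(int i)
{
    return (unsigned int)i;
}
```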
[Bug rtl-optimization/66207] Switch alpha to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207 --- Comment #5 from Uroš Bizjak ubizjak at gmail dot com --- (In reply to Uroš Bizjak from comment #4) Native bootstrap with alphaev68-linux-gnu (a BWX architecture) with the patch from Comment #1 succeeded, the testresults are at [1]. Comparing to non-LRA testsuite run, here is only one new test failure in the entire testsuite: No, this failure is not RA related.
[Bug bootstrap/66038] [5 regression] (stage 2) build/genmatch issue (gcc/hash-table.h|c) with --disable-checking [ introduced by r218976 ]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66038 --- Comment #21 from Richard Biener rguenth at gcc dot gnu.org --- (In reply to Douglas Mencken from comment #20) I'm lost. “Vanilla” 5.1.0 configured without --disable-checking went thru stage2 w/o any issue... That's interesting - we might run into a miscompilation here. Can you check with --disable-checking again but with just the gcc_checking_assert in hash_table_mod1 removed? Please also attach preprocessed source of genmatch.c for the stage2 build so it's possible to investigate that with a cross compiler. (preprocessed source with --disable-checking and the assert left in place) Btw, thanks for your help in tracking this down. I wonder if anybody tried a powerpc-linux bootstrap with --disable-checking...
[Bug sanitizer/61955] libsanitizer fails to compile on RHEL4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61955 --- Comment #5 from Andreas Schwab sch...@linux-m68k.org --- linux/aio_abi.h was added in 2.5.32. https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=ea5097be4e814a2a9457e60653052306295941e8
[Bug c++/66211] [5/6 Regression] Rvalue conversion in ternary operator causes internal compiler error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66211 --- Comment #6 from rguenther at suse dot de rguenther at suse dot de --- On Wed, 20 May 2015, jakub at gcc dot gnu.org wrote: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66211 --- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org --- Perhaps just guard this particular match.pd pattern with GIMPLE guard for now (until the delayed C++ folding is committed)? Will try.
[Bug sanitizer/61955] libsanitizer fails to compile on RHEL4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61955

Dmitry Vyukov dvyukov at google dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dvyukov at google dot com

--- Comment #6 from Dmitry Vyukov dvyukov at google dot com ---
How does LINUX_VERSION_CODE relate to linux kernel version? What is the LINUX_VERSION_CODE value for 2.5.32? Is the other LINUX_VERSION_CODE value (132627) correct?
[Bug sanitizer/61955] libsanitizer fails to compile on RHEL4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61955 --- Comment #7 from Andreas Schwab sch...@linux-m68k.org --- Format it as a hexadecimal number.
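
The hint above can be spelled out with a small sketch. It assumes the usual KERNEL_VERSION encoding from linux/version.h, (a << 16) + (b << 8) + c, so each byte of the hexadecimal value is one version component. The function names are made up for illustration and are not kernel APIs.

```c
#include <stdio.h>

/* Same arithmetic as the KERNEL_VERSION(a, b, c) macro. */
unsigned int kernel_version_code(unsigned int major, unsigned int minor,
                                 unsigned int patch)
{
    return (major << 16) + (minor << 8) + patch;
}

/* Illustrative helper: split a version code back into its three parts. */
void decode_version_code(unsigned int code, unsigned int *major,
                         unsigned int *minor, unsigned int *patch)
{
    *major = (code >> 16) & 0xff;
    *minor = (code >> 8) & 0xff;
    *patch = code & 0xff;
}

/* Print a code in decimal, in hexadecimal, and as a dotted version. */
void print_version_code(unsigned int code)
{
    unsigned int major, minor, patch;
    decode_version_code(code, &major, &minor, &patch);
    printf("%u = 0x%06x = %u.%u.%u\n", code, code, major, minor, patch);
}
```

Under that assumption, 132627 is 0x020613, i.e. 2.6.19, which matches the PREADV/PWRITEV comment in the proposed header. 132624 is 0x020610, i.e. 2.6.16 rather than the 2.6.10 mentioned in the FIXME, and 2.5.32 would be 0x020520 = 132384.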
Re: OpenACC: initialization with unsupported acc_device_t
On Thu, May 21, 2015 at 08:55:59AM +0200, Thomas Schwinge wrote: Thanks, looks good to me -- Jakub? Ok for trunk. libgomp/ * oacc-init.c (resolve_device): Add FAIL_IS_ERROR argument. Update function comment. Only call gomp_fatal if new argument is true. (acc_dev_num_out_of_range): New function. (acc_init_1, acc_shutdown_1): Update call to resolve_device. Call acc_dev_num_out_of_range as appropriate. (acc_get_num_devices, acc_set_device_type, acc_get_device_type) (acc_get_device_num, acc_set_device_num): Update calls to resolve_device. * testsuite/libgomp.oacc-c-c++-common/lib-4.c: Update expected test output. Jakub
[Bug sanitizer/61955] libsanitizer fails to compile on RHEL4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61955 --- Comment #9 from Pierre Ossman ossman at cendio dot se --- (In reply to Andreas Schwab from comment #5) linux/aio_abi.h was added in 2.5.32. https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=ea5097be4e814a2a9457e60653052306295941e8 How can it be missing on RHEL 4 with 2.6.9 in that case?
Re: [obvious fix] fix off-by-one error when printing the caret character
Manuel López-Ibáñez lopeziba...@gmail.com writes:

Index: ChangeLog
===
--- ChangeLog (revision 223445)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2015-05-20 Manuel López-Ibáñez m...@gcc.gnu.org
+
+ * diagnostic.c (diagnostic_print_caret_line): Fix off-by-one error
+ when printing the caret character.
+

This is OK, thanks!

Cheers,
-- Dodji
[Bug tree-optimization/66233] [4.8/4.9/5/6 Regression] internal compiler error: in expand_fix, at optabs.c:5358
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66233

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-05-21
                 CC|                            |jakub at gcc dot gnu.org
          Component|c                           |tree-optimization
   Target Milestone|---                         |4.8.5
            Summary|internal compiler error: in |[4.8/4.9/5/6 Regression]
                   |expand_fix, at              |internal compiler error: in
                   |optabs.c:5358               |expand_fix, at
                   |                            |optabs.c:5358
     Ever confirmed|0                           |1

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org ---
Started with r193246, don't see anything invalid on that testcase.
[Bug middle-end/66221] [CHKP, 6 regression] lto1: error: type variant has different TYPE_ARG_TYPES
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66221

--- Comment #1 from Ilya Enkovich ienkovich at gcc dot gnu.org ---
Author: ienkovich
Date: Thu May 21 08:32:52 2015
New Revision: 223471

URL: https://gcc.gnu.org/viewcvs?rev=223471&root=gcc&view=rev
Log:
gcc/

 PR middle-end/66221
 * ipa-chkp.c (chkp_copy_function_type_adding_bounds): Use
 build_distinct_type_copy to copy bounds.

gcc/testsuite/

 PR middle-end/66221
 * gcc.dg/lto/pr66221_0.c: New test.
 * gcc.dg/lto/pr66221_1.c: New test.

Added:
 trunk/gcc/testsuite/gcc.dg/lto/pr66221_0.c
 trunk/gcc/testsuite/gcc.dg/lto/pr66221_1.c
Modified:
 trunk/gcc/ChangeLog
 trunk/gcc/ipa-chkp.c
 trunk/gcc/testsuite/ChangeLog
[Bug sanitizer/61955] libsanitizer fails to compile on RHEL4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61955 --- Comment #10 from Andreas Schwab sch...@linux-m68k.org --- Are you sure their user-space kernel headers are at 2.6.9 level? https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=31a3791056e43c6dd7386b8bc0f5fb94848c5a61 https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/tree/include/linux?id=31a3791056e43c6dd7386b8bc0f5fb94848c5a61
[PATCH, CHKP] Fix PR middle-end/66221: lto1: error: type variant has different TYPE_ARG_TYPES
Hi,

This patch fixes PR66221 by using build_distinct_type_copy instead of copy_node to copy a function type for instrumented function. Bootstrapped and regtested for x86_64-unknown-linux-gnu. Applied to trunk. Is it OK for gcc-5?

Thanks,
Ilya
--
gcc/

2015-05-21 Ilya Enkovich enkovich@gmail.com

 PR middle-end/66221
 * ipa-chkp.c (chkp_copy_function_type_adding_bounds): Use
 build_distinct_type_copy to copy bounds.

gcc/testsuite/

2015-05-21 Ilya Enkovich enkovich@gmail.com

 PR middle-end/66221
 * gcc.dg/lto/pr66221_0.c: New test.
 * gcc.dg/lto/pr66221_1.c: New test.

diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c
index ac5eb35..c710291 100644
--- a/gcc/ipa-chkp.c
+++ b/gcc/ipa-chkp.c
@@ -308,7 +308,7 @@ chkp_copy_function_type_adding_bounds (tree orig_type)
   if (!arg_type)
     return orig_type;
 
-  type = copy_node (orig_type);
+  type = build_distinct_type_copy (orig_type);
   TYPE_ARG_TYPES (type) = copy_list (TYPE_ARG_TYPES (type));
 
   for (arg_type = TYPE_ARG_TYPES (type);
diff --git a/gcc/testsuite/gcc.dg/lto/pr66221_0.c b/gcc/testsuite/gcc.dg/lto/pr66221_0.c
new file mode 100644
index 000..dbb9282
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr66221_0.c
@@ -0,0 +1,10 @@
+/* { dg-lto-do link } */
+/* { dg-require-effective-target mpx } */
+/* { dg-lto-options { { -O2 -flto -fcheck-pointer-bounds -mmpx } } } */
+
+int test1 (const char *);
+
+int main (int argc, const char **argv)
+{
+  return test1 (argv[0]);
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr66221_1.c b/gcc/testsuite/gcc.dg/lto/pr66221_1.c
new file mode 100644
index 000..4c94544
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr66221_1.c
@@ -0,0 +1,4 @@
+int test1 (const char *p)
+{
+  return (int)(*p);
+}
[Bug target/66215] [4.8/4.9/5/6 Regression] Wrong after label NOP emission for -mhotpatch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66215 --- Comment #5 from Dominik Vogt vogt at linux dot vnet.ibm.com --- Wouldn't the correct and easy to identify place be right after the first NOTE_INSN_BASIC_BLOCK?
[Bug c/52952] Wformat location info is bad (wrong column number)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52952

--- Comment #38 from Manuel López-Ibáñez manu at gcc dot gnu.org ---
Author: manu
Date: Thu May 21 06:49:38 2015
New Revision: 223470

URL: https://gcc.gnu.org/viewcvs?rev=223470&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2015-05-21 Manuel López-Ibáñez m...@gcc.gnu.org

 PR c/52952
 * gcc.dg/redecl-4.c: Update column numbers.
 * gcc.dg/format/bitfld-1.c: Likewise.
 * gcc.dg/format/attr-2.c: Likewise.
 * gcc.dg/format/attr-6.c: Likewise.
 * gcc.dg/format/attr-7.c (baz): Likewise.
 * gcc.dg/format/asm_fprintf-1.c: Likewise.
 * gcc.dg/format/attr-4.c: Likewise.
 * gcc.dg/format/branch-1.c: Likewise.
 * gcc.dg/format/c90-printf-1.c: Likewise. Add tests for column
 locations within strings with embedded escape sequences.

gcc/c-family/ChangeLog:

2015-05-21 Manuel López-Ibáñez m...@gcc.gnu.org

 PR c/52952
 * c-format.c (location_column_from_byte_offset): New.
 (location_from_offset): New.
 (struct format_wanted_type): Add offset_loc field.
 (check_format_info): Move handling of location for extra arguments
 closer to the point of warning.
 (check_format_info_main): Pass the result of location_from_offset
 to warning_at.
 (format_type_warning): Pass the result of location_from_offset
 to warning_at.

Modified:
 trunk/gcc/c-family/ChangeLog
 trunk/gcc/c-family/c-format.c
 trunk/gcc/testsuite/ChangeLog
 trunk/gcc/testsuite/gcc.dg/format/asm_fprintf-1.c
 trunk/gcc/testsuite/gcc.dg/format/attr-2.c
 trunk/gcc/testsuite/gcc.dg/format/attr-4.c
 trunk/gcc/testsuite/gcc.dg/format/attr-6.c
 trunk/gcc/testsuite/gcc.dg/format/attr-7.c
 trunk/gcc/testsuite/gcc.dg/format/bitfld-1.c
 trunk/gcc/testsuite/gcc.dg/format/branch-1.c
 trunk/gcc/testsuite/gcc.dg/format/c90-printf-1.c
 trunk/gcc/testsuite/gcc.dg/redecl-4.c
Re: [nvptx] Re: Mostly rewrite genrecog
Hi! On Thu, 7 May 2015 11:14:37 +0200, Jakub Jelinek ja...@redhat.com wrote: On Thu, May 07, 2015 at 10:59:01AM +0200, Thomas Schwinge wrote: build/genrecog [...]/source-gcc/gcc/common.md [...]/source-gcc/gcc/config/nvptx/nvptx.md \ insn-conditions.md tmp-recog.c -[...]/source-gcc/gcc/config/nvptx/nvptx.md:1206: warning: operand 0 missing mode? -[...]/source-gcc/gcc/config/nvptx/nvptx.md:1206: warning: operand 1 missing mode? gcc/config/nvptx/nvptx.md: 1206 (define_insn allocate_stack 1207 [(set (match_operand 0 nvptx_register_operand =R) 1208 (unspec [(match_operand 1 nvptx_register_operand R)] 1209 UNSPEC_ALLOCA))] 1210 1211 %.\\tcall (%0), %%alloca, (%1);) Are these two (former) warnings a) something that should still be reported by genrecog, Yes. http://news.gmane.org/find-root.php?message_id=%3C87twvjtrf4.fsf%40e105548-lin.cambridge.arm.com%3E. and b) something that should be addressed (Bernd)? Yes. Supposedly you want :P on both match_operand and unspec too, but as this serves not just as an insn pattern, but also as expander that needs to have this particular name, supposedly you want: (define_expand allocate_stack [(match_operand 0 nvptx_register_operand) (match_operand 1 nvptx_register_operand)] { if (TARGET_ABI64) emit_insn (gen_allocate_stack_di (operands[0], operands[1])); else emit_insn (gen_allocate_stack_si (operands[0], operands[1])); DONE; }) (define_insn allocate_stack_mode [(set (match_operand:P 0 nvptx_register_operand =R) (unspec:P [(match_operand:P 1 nvptx_register_operand R)] UNSPEC_ALLOCA))] %.\\tcall (%0), %%alloca, (%1);) rr so. OK to commit? commit 004e521e8dd1c0236a55e9a69a17ccc2a41d Author: Thomas Schwinge tho...@codesourcery.com Date: Thu May 7 11:30:26 2015 +0200 [nvptx] Address genrecog warnings 2015-05-21 Jakub Jelinek ja...@redhat.com gcc/ * config/nvptx/nvptx.md (allocate_stack): Rename to... (allocate_stack_mode): ... this, and add :P on both match_operand and unspec. (allocate_stack): New expander. 
---
 gcc/config/nvptx/nvptx.md | 20
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git gcc/config/nvptx/nvptx.md gcc/config/nvptx/nvptx.md
index c30de36..a49786c 100644
--- gcc/config/nvptx/nvptx.md
+++ gcc/config/nvptx/nvptx.md
@@ -1203,10 +1203,22 @@
   sorry (target cannot support nonlocal goto.);
 })
 
-(define_insn allocate_stack
-  [(set (match_operand 0 nvptx_register_operand =R)
-        (unspec [(match_operand 1 nvptx_register_operand R)]
-                 UNSPEC_ALLOCA))]
+(define_expand allocate_stack
+  [(match_operand 0 nvptx_register_operand)
+   (match_operand 1 nvptx_register_operand)]
+
+{
+  if (TARGET_ABI64)
+    emit_insn (gen_allocate_stack_di (operands[0], operands[1]));
+  else
+    emit_insn (gen_allocate_stack_si (operands[0], operands[1]));
+  DONE;
+})
+
+(define_insn allocate_stack_mode
+  [(set (match_operand:P 0 nvptx_register_operand =R)
+        (unspec:P [(match_operand:P 1 nvptx_register_operand R)]
+                   UNSPEC_ALLOCA))]
 
   %.\\tcall (%0), %%alloca, (%1);)

Of course, as even latest Cuda drop doesn't support alloca, this is quite dubious, perhaps better would be sorry on it. BTW, with Cuda 7.0, even printf doesn't work anymore, is that known?

I have not yet used that version of CUDA, so don't know about this. :-|

Grüße,
Thomas
[Bug c/66230] Using optimizations causes program to segfault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230

Markus Trippelsdorf trippels at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2015-05-21
                 CC|                            |trippels at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
Please attach a small self-contained testcase. Nobody here has the time to clone and build random projects.

And normally issues like these are caused by invoking undefined behavior. Try to build the project with -fsanitize=undefined and see what runtime errors it reports.
Re: Add statistics to alias.c
On May 21, 2015 12:13:19 AM GMT+02:00, Jan Hubicka hubi...@ucw.cz wrote: Hi, this patch extends statistics from tree-ssa-alias to also cover TBAA oracle. This is useful to keep track of aliasing effectivity. For example the hack in alias.c putting globbing all pointers to one costs about 20% of all answers on firefox. I.e. from 15500978 disambiguations/23744267 querries (with the hack removed) to 12932078 disambiguations/27256455 querries. Bootstrapped x86_64-linux, OK? s/quaries/queries/ Thanks,
[Bug c/52952] Wformat location info is bad (wrong column number)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52952

--- Comment #39 from Manuel López-Ibáñez manu at gcc dot gnu.org ---
A summary of what is still pending:

1. Handle macros
   #define c%d
   __builtin_printf(c, 0.5);

2. Handle non-contiguous strings:
   __builtin_printf( % d , 0.5);

3. Handle const arrays:
   const char a[] = %d ;
   __builtin_printf(a, 0.5);

I have an idea on how to fix 1 and 2 but no idea how to fix 3.
[Bug c/66233] New: internal compiler error: in expand_fix, at optabs.c:5358
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66233

Bug ID: 66233
Summary: internal compiler error: in expand_fix, at optabs.c:5358
Product: gcc
Version: 4.9.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: usignao at gmail dot com
Target Milestone: ---

Hello!

The error is produced by the following (invalid) code

/* oops.c */
unsigned int pData[5];
void f()
{
    int i;
    for(i=0; i<5; i++) {
        pData[i] = (float) i;
    }
}

$ gcc -O3 -Wall -Wextra -o oops.o -c oops.c
oops.c: In function ‘f’:
oops.c:6:12: internal compiler error: in expand_fix, at optabs.c:5358
 pData[i] = (float) i;
 ^

No warnings are given.

I'm on Linux x64, gcc version is 4.9.2, but according to godbolt.org all the versions from 4.8 up to 5.1.0 are also affected.
[Bug sanitizer/61955] libsanitizer fails to compile on RHEL4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61955

Pierre Ossman ossman at cendio dot se changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ossman at cendio dot se

--- Comment #4 from Pierre Ossman ossman at cendio dot se ---
Unfortunately the git history doesn't go further back than 2.6.12 so I don't know when aio_abi.h got added. But the code only needs the enum, so something like this should work:

#include <linux/version.h>

/* aio_abi.h was added in 2.6.10 (FIXME: check this) */
#if LINUX_VERSION_CODE < 132624
enum {
  IOCB_CMD_PREAD = 0,
  IOCB_CMD_PWRITE = 1,
  IOCB_CMD_FSYNC = 2,
  IOCB_CMD_FDSYNC = 3,
  IOCB_CMD_NOOP = 6,
};
#else
#include_next <linux/aio_abi.h>
#endif

/* IOCB_CMD_PREADV/PWRITEV has been added in 2.6.19 */
#if LINUX_VERSION_CODE < 132627
#define IOCB_CMD_PREADV 7
#define IOCB_CMD_PWRITEV 8
#endif
Re: PING: Re: [patch 6/10] debug-early merge: Java front-end
On 20/05/15 23:32, Aldy Hernandez wrote: Perhaps I should've sent this to the java-patches list. PING. OK, I believe it. Andrew.
Re: [PATCH][DRIVER] Wrong C++ include paths when configuring with --with-sysroot=/
Hi, On 8 May 2015 at 00:07, Joseph Myers jos...@codesourcery.com wrote: On Mon, 20 Apr 2015, Pavel Kopyl wrote: Hi all, To build a GCC-4.9.2 ARM cross-compiler for my setting I need to configure it with --with-sysroot=/ --with-gxx-include-dir=/usr/include/c++/4.9.2. But I found that gcc driver removes the leading slash from resulting paths: `gcc -print-prog-name=cc1plus` -v ... ignoring nonexistent directory usr/include/c++/4.9.2 - HERE ignoring nonexistent directory usr/include/c++/4.9.2/armv7l-tizen-linux-gnueabi - AND HERE ignoring nonexistent directory usr/include/c++/4.9.2/backward - AND HERE #include ... search starts here: #include ... search starts here: /usr/lib/gcc/armv7l-tizen-linux-gnueabi/4.9.2/include /usr/local/include /usr/lib/gcc/armv7l-tizen-linux-gnueabi/4.9.2/include-fixed /usr/include It's also reproducible on trunk. Attached patch fixes this bug. You don't explain the rationale for this patch, in terms of the logical semantics of the various variables involved, and why, in terms of those logical semantics, this patch is the correct approach for fixing the observed problems. As I read the code, it's not the driver that removes the leading slash. Rather, it's the code in configure.ac: gcc_gxx_include_dir_add_sysroot=0 if test ${with_sysroot+set} = set; then gcc_gxx_without_sysroot=`expr ${gcc_gxx_include_dir} : ${with_sysroot}'\(.*\)'` if test ${gcc_gxx_without_sysroot}; then gcc_gxx_include_dir=${gcc_gxx_without_sysroot} gcc_gxx_include_dir_add_sysroot=1 fi fi What I'd say is that this code is mishandling the case of a --with-sysroot path that ends with '/' (or, I suppose, '\', on hosts where that's a directory separator). That is, it's producing a gcc_gxx_include_dir setting with no leading '/'. 
I think it would be more appropriate for this configure.ac code to remove (sysroot minus trailing directory separator), so that the gcc_gxx_include_dir setting after this code still has a leading directory separator whether or not the --with-sysroot setting ended with such a separator. Given such a configure.ac change, I wouldn't then expect changes elsewhere in the compiler to be needed. But if that doesn't work to fix the bug, I think you need to elaborate further on the semantics of the various variables involved (in configure.ac and in the compiler). There is this old patch submitted by Matthias on that same issue, if its logic is the right one for you Joseph I can rebase/validate it Joseph. https://gcc.gnu.org/ml/gcc-patches/2012-02/msg00320.html Cheers, Yvan -- Joseph S. Myers jos...@codesourcery.com
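
The difference between the two behaviours discussed above can be sketched as plain string handling. This is not the configure.ac code: it is a C illustration under simplifying assumptions (only '/' is treated as a directory separator, and a plain prefix comparison stands in for the expr pattern match), and both function names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Behaviour described in the message: strip the sysroot text verbatim.
   Returns NULL when dir does not start with sysroot. */
const char *strip_sysroot_verbatim(const char *dir, const char *sysroot)
{
    size_t n = strlen(sysroot);
    if (strncmp(dir, sysroot, n) != 0)
        return NULL;
    return dir + n;
}

/* Suggested behaviour: strip the sysroot minus any trailing '/', so the
   remainder keeps its leading directory separator. */
const char *strip_sysroot_keep_separator(const char *dir, const char *sysroot)
{
    size_t n = strlen(sysroot);
    while (n > 0 && sysroot[n - 1] == '/')
        n--;
    if (strncmp(dir, sysroot, n) != 0)
        return NULL;
    return dir + n;
}
```

With sysroot "/" and include dir "/usr/include/c++/4.9.2", the first function yields "usr/include/c++/4.9.2" (the nonexistent directory in the report), while the second keeps "/usr/include/c++/4.9.2". For a sysroot without a trailing separator the two give the same result.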
Re: OpenACC: initialization with unsupported acc_device_t
Hi Julian! On Thu, 7 May 2015 16:56:11 +0100, Julian Brown jul...@codesourcery.com wrote: On Tue, 5 May 2015 16:09:18 +0200 Thomas Schwinge tho...@codesourcery.com wrote: On Tue, 5 May 2015 08:43:48 -0400, John David Anglin dave.ang...@bell.net wrote: On 2015-05-05 5:43 AM, Thomas Schwinge wrote: FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-62.c -DACC_DEVICE_TYPE_hos t=1 -DACC_MEM_SHARED=1 output pattern test, is , should match invalid size With this one I'll need your help: please cite from libgomp.log (or, from a manual run) the actual output message that you're getting. There's no output message: # ./lib-62.exe Segmentation fault (core dumped) As this is a PA-RISC HP-UX system, I feel certain that you don't actually have nvptx offloading available (so, the nvptx libgomp plugin is not being built). However, this test case, contains an unconditional acc_init call for acc_device_nvidia, and I would then guess that this situation is not (not anymore?) correctly handled (abort with »offloading to [...] not possible«, or similar; see libgomp.oacc-c-c++-common/lib-4.c) in libgomp -- Julian, could this be due to your recent libgomp OpenACC initialization changes? (When working on this in a build that does have nvptx offloading configured, I think you should be able to simulate the situation by hiding (temporarily deleting, or similar) the nvptx libgomp plugin?) The attached patch contains (what I hope should be) a fix for this, tested by running the libgomp testsuite (with nvptx offloading), and by deleting the nvptx plugin, with the patch applied, and ensuring that lib-62.c no longer segfaults in that case. The patch also tidies up a few other error paths around resolve_device, and de-duplicates some error message reporting code. Then, I don't know why libgomp.oacc-c-c++-common/lib-62.c contains this explicit acc_init call with acc_device_nvidia -- generally, the test cases should not contain such unconditional statements. So, let's then please remove this. 
See libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c for a very similar test case, which does this differently. I've not touched this test though -- but I have tweaked libgomp.oacc-c-c++-common/lib-4.c that should now expect a slightly different error output. OK for trunk? Thanks, looks good to me -- Jakub? Grüße, Thomas libgomp/ * oacc-init.c (resolve_device): Add FAIL_IS_ERROR argument. Update function comment. Only call gomp_fatal if new argument is true. (acc_dev_num_out_of_range): New function. (acc_init_1, acc_shutdown_1): Update call to resolve_device. Call acc_dev_num_out_of_range as appropriate. (acc_get_num_devices, acc_set_device_type, acc_get_device_type) (acc_get_device_num, acc_set_device_num): Update calls to resolve_device. * testsuite/libgomp.oacc-c-c++-common/lib-4.c: Update expected test output. commit 221b5dea47cdb7611456ca3cf28d180d3ff1156a Author: Julian Brown jul...@codesourcery.com Date: Thu May 7 08:39:16 2015 -0700 Clean up initialisation when no devices of a particular type are available. diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c index f2c60ec..cd50521 100644 --- a/libgomp/oacc-init.c +++ b/libgomp/oacc-init.c @@ -109,10 +109,12 @@ name_of_acc_device_t (enum acc_device_t type) } } -/* ACC_DEVICE_LOCK should be held before calling this function. */ +/* ACC_DEVICE_LOCK must be held before calling this function. If FAIL_IS_ERROR + is true, this function raises an error if there are no devices of type D, + otherwise it returns NULL in that case. 
*/ static struct gomp_device_descr * -resolve_device (acc_device_t d) +resolve_device (acc_device_t d, bool fail_is_error) { acc_device_t d_arg = d; @@ -130,7 +132,13 @@ resolve_device (acc_device_t d) dispatchers[d]-get_num_devices_func () 0) goto found; - gomp_fatal (device type %s not supported, goacc_device_type); + if (fail_is_error) + { + gomp_mutex_unlock (acc_device_lock); + gomp_fatal (device type %s not supported, goacc_device_type); + } + else + return NULL; } /* No default device specified, so start scanning for any non-host @@ -149,7 +157,13 @@ resolve_device (acc_device_t d) d = acc_device_host; goto found; } - gomp_fatal (no device found); + if (fail_is_error) +{ + gomp_mutex_unlock (acc_device_lock); + gomp_fatal (no device found); + } + else +return NULL; break; case acc_device_host: @@ -157,7 +171,12 @@ resolve_device (acc_device_t d) default: if (d _ACC_device_hwm) - gomp_fatal (device %u out of range, (unsigned)d); + { + if (fail_is_error) +
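The FAIL_IS_ERROR idea in the patch above can be sketched in isolation roughly as follows. This is only an illustration under stated assumptions: the struct, the device table and the function names are hypothetical stand-ins, not the libgomp types or API, and locking is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device descriptor; not the libgomp type.  */
struct device_descr
{
  const char *name;
  int num_devices;
};

/* Toy table: one usable "host" type and one type with no devices.  */
static struct device_descr devices[] = {
  { "host", 1 },
  { "accel", 0 },
};

/* Look up a device type.  If none is available, either report a fatal
   error (FAIL_IS_ERROR true) or return NULL so the caller can decide.  */
struct device_descr *
lookup_device (size_t type, bool fail_is_error)
{
  if (type < sizeof devices / sizeof devices[0]
      && devices[type].num_devices > 0)
    return &devices[type];

  if (fail_is_error)
    {
      fprintf (stderr, "device type %zu not supported\n", type);
      abort ();
    }
  return NULL;
}

/* A query-style caller treats "no such device" as zero devices
   instead of aborting.  */
int
count_devices (size_t type)
{
  struct device_descr *dev = lookup_device (type, false);
  return dev ? dev->num_devices : 0;
}
```

With this split, initialization-style callers can pass true and keep the hard failure, while query-style callers pass false and handle NULL themselves.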
[Bug sanitizer/61955] libsanitizer fails to compile on RHEL4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61955 --- Comment #11 from Pierre Ossman ossman at cendio dot se --- Not really. :) I stumbled upon this trying to use 2.4 headers, so I honestly haven't tried 2.6.9, RHEL variant or otherwise.
Re: [Patch AArch64] PR target/66200 - gcc / libstdc++ TLC for weak memory models.
And here's an additional patch for the testsuite which was missed in the original posting. This is a testism that's testing code generation as per TARGET_RELAXED_ORDERING being false and therefore needs to be adjusted as attached.

Ramana

PR target/66200
* g++.dg/abi/aarch64_guard1.C: Adjust testcase.

diff --git a/gcc/testsuite/g++.dg/abi/aarch64_guard1.C b/gcc/testsuite/g++.dg/abi/aarch64_guard1.C
index ca1778b..e78f93c 100644
--- a/gcc/testsuite/g++.dg/abi/aarch64_guard1.C
+++ b/gcc/testsuite/g++.dg/abi/aarch64_guard1.C
@@ -13,5 +13,4 @@ int *foo ()
 }
 
 // { dg-final { scan-assembler _ZGVZ3foovE1x,8,8 } }
-// { dg-final { scan-tree-dump _ZGVZ3foovE1x 1 original } }
 // { dg-final { cleanup-tree-dump original } }
[Bug middle-end/66221] [CHKP, 6 regression] lto1: error: type variant has different TYPE_ARG_TYPES
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66221

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug target/66215] [4.8/4.9/5/6 Regression] Wrong after label NOP emission for -mhotpatch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66215

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.8.5
Re: [PATCH] [PATCH][ARM] Fix sibcall testcases.
On Wed, May 20, 2015 at 9:11 PM, Joseph Myers jos...@codesourcery.com wrote: On Wed, 20 May 2015, Alex Velenko wrote: Hi, This patch prevents arm_thumb1_ok XPASS in sibcall-3.c and sibcall-4.c testcases. Sibcalls are not ok for Thumb1 and testcases need to be fixed. arm_thumb1_ok means this is an ARM target where -mthumb causes Thumb-1 to be used. It only ever makes sense to use it in tests that use an explicit -mthumb, which these tests don't. If you want to check is this test being built for Thumb-1 by the multilib options, use arm_thumb1. Alex, so while you are here - why don't you improve the documentation in sourcebuild.texi by 1. documenting arm_thumb1 2. distinguishing that from arm_thumb1_ok which just says `ARM target generates Thumb-1 code for @code{-mthumb}.' and that is just meaningless. regards Ramana -- Joseph S. Myers jos...@codesourcery.com
[Bug c++/66223] Diagnostic of pure virtual function call broken, including __cxa_pure_virtual
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66223

Jonathan Wakely redi at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING

--- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org ---
I still get the same output for 5.1:

pure virtual method called
terminate called without an active exception
Aborted (core dumped)

How did you configure GCC? (Please provide the output of 'gcc -v' as requested in the bug reporting instructions)
Re: [patch, testsuite] don't specify dg-do run explicitly for vect test cases
On Thu, May 21, 2015 at 7:12 AM, Sandra Loosemore san...@codesourcery.com wrote: On targets such as ARM, some arches are compatible with options needed to enable compilation with vectorization, but the specific hardware (or simulator or BSP) available for execution tests may not implement or enable those features. The vect.exp test harness already includes some magic to determine whether the target hw can execute vectorized code and sets dg-do-what-default to compile the tests only if they can't be executed. It's a mistake for individual tests to explicitly say dg-do run because this overrides the harness's magic default and forces the test to be executed, even if doing so just ends up wedging the target. I already committed two patches last fall (r215627 and r218427) to address this, but people keep adding new vect test cases with the same problem, so here is yet another installment to clean them up. I tested this on arm-none-eabi with a fairly large collection of multilibs. OK to commit? Huh... I thought we have the check_vect () stuff for that...? -Sandra
[Patch ARM] Fix PR target/65937
Testism introduced by last commit to fix PR26702 on arm-*-linux* targets. The fix is to restore target selector to arm*-*-eabi* as the target macro changes only affect arm*-*-eabi*.

Applied to trunk as obvious.

Ramana

* gcc.target/arm/pr26702.c: Adjust target selector.

Index: gcc.target/arm/pr26702.c
===================================================================
--- gcc.target/arm/pr26702.c	(revision 223444)
+++ gcc.target/arm/pr26702.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile { target arm_eabi } } */
+/* { dg-do compile { target arm*-*-eabi* } } */
 /* { dg-final { scan-assembler \\.size\[\\t \]+static_foo, 4 } } */
 int foo;
 static int static_foo;
[Bug c/66230] Using optimizations causes program to segfault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230

--- Comment #2 from gpnuma at centaurean dot com ---
I understand you're short of time, but this problem is very difficult to reproduce!

I did try to compile and link with -fsanitize=undefined this morning, and here's the interesting part:
* no warning was generated by ubsan
* everything works fine

As soon as I remove -fsanitize=undefined, I get the segfault again, so I suspect the problem happens during the optimization stages. The fact that adding a useless line of code like printf(...) at the start of the called function makes the problem disappear makes me wonder whether the function pointer is not properly captured by gcc, or whether it changes after optimizations.

To be more precise, here is what I'm doing:
1) I have a set of functions at the top of a file (functionA, functionB, ...)
2) At the bottom of that file I have another function which stores the function pointers of these functions (using functionA, functionB etc.) in an array.
3) Later on, I access the functions using an index into that array, and with gcc 4.8 / -O3 *only*, this fails and segfaults.

So my thinking is that maybe the function pointers are stored correctly, but then the optimizer changes the function's address or the function itself, making the initial pointer wrong, which leads to a segfault... just a wild guess. I think that adding the printf or a void function maybe adds some sort of unoptimizable code at the start (like IO) and therefore the initially stored pointer is unchanged after optimizations.

Oh yeah, it's worth mentioning that otherwise (if I don't put in a bogus printf) the first line of code of the function is a __builtin_memcpy, which is probably highly optimizable.

I'll try to come up with a short code example if I get the time later on.

Thank you
Guillaume
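For readers following along, the layout described in steps 1 to 3 might look roughly like the sketch below. The reporter's code was not posted, so this is a guess at the general shape only; every name here is hypothetical. Written this way the pattern is well-defined C, which is why a reduced test case would be needed to tell where the real program goes wrong.

```c
#include <stddef.h>

typedef int (*op_fn) (int);

/* Step 1: a set of functions defined at the top of the file.  */
int function_a (int x) { return x + 1; }
int function_b (int x) { return x * 2; }
int function_c (int x) { return x - 3; }

/* Step 2: a function that stores their addresses in an array.  */
static op_fn table[3];

void
init_table (void)
{
  table[0] = function_a;
  table[1] = function_b;
  table[2] = function_c;
}

/* Step 3: later, call through the array using an index.  */
int
call_by_index (size_t index, int arg)
{
  if (index >= sizeof table / sizeof table[0] || table[index] == NULL)
    return -1;  /* Hypothetical error value for this sketch.  */
  return table[index] (arg);
}
```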
[Bug lto/66228] Compiling simple program with -flto -O1 causes mad behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66228

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-05-21
                 CC|                            |hubicka at gcc dot gnu.org
     Ever confirmed|0                           |1
      Known to fail|                            |4.8.4, 4.9.2

--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org ---
I think this is an effective duplicate of PR61886. With GCC 5 I get

rguenther@murzim:/tmp gcc-5 t.i -O -flto
/usr/include/bits/error.h: In function ‘error’:
/usr/include/bits/error.h:37:1: error: inlining failed in call to always_inline ‘error’: recursive inlining
 error (int __status, int __errnum, const char *__format, ...)
 ^
/usr/include/bits/error.h:40:5: error: called from here
     __error_noreturn (__status, __errnum, __format, __va_arg_pack ());
     ^
/usr/include/bits/error.h:37:1: error: inlining failed in call to always_inline ‘error’: recursive inlining
 error (int __status, int __errnum, const char *__format, ...)
 ^
/usr/include/bits/error.h:40:5: error: called from here
     __error_noreturn (__status, __errnum, __format, __va_arg_pack ());
     ^
...

I can reproduce the odd code generation with GCC 4.9 and 4.8. I suspect the issue is related to the above.
[Bug c/66219] The gcc generated section start/stop pointers become undefined when option -flto is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66219 Richard Biener rguenth at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Richard Biener rguenth at gcc dot gnu.org --- Well, likely flags[] is optimized away. If you need the symbol to prevail (for whatever reason?) you need to add __attribute__((used)) to it.
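A minimal sketch of the suggestion, assuming an ELF target linked with GNU ld (which provides __start_NAME and __stop_NAME symbols for sections whose names are valid C identifiers). The section name "myflags" and the variable names are hypothetical and are not taken from the reporter's test case.

```c
#include <stddef.h>

/* Hypothetical section name; "used" asks the compiler to keep the
   objects even though nothing refers to them by name.  */
#define FLAG_ENTRY __attribute__ ((used, section ("myflags")))

static const int flag_one FLAG_ENTRY = 1;
static const int flag_two FLAG_ENTRY = 2;
static const int flag_three FLAG_ENTRY = 3;

/* Provided by the linker, not defined anywhere in the source.  */
extern const int __start_myflags[];
extern const int __stop_myflags[];

/* Number of int entries placed in the section.  */
size_t
count_flags (void)
{
  return (size_t) (__stop_myflags - __start_myflags);
}

/* Walk the section from its start symbol to its stop symbol.  */
int
sum_flags (void)
{
  int sum = 0;
  for (const int *p = __start_myflags; p < __stop_myflags; p++)
    sum += *p;
  return sum;
}
```

Without the used attribute the three objects are unreferenced statics, so the compiler is free to drop them, and the linker then has no section to define the start/stop symbols for.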
[Bug tree-optimization/66233] [4.8/4.9/5/6 Regression] internal compiler error: in expand_fix, at optabs.c:5358
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66233

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org ---
Seems it is the /* Handle cases of two conversions in a row. */ patterns in match.pd that are causing this. I'd say the bug is that those simplifications handle {inside,inter,final}_vec the same way, no matter whether they are vectors of floats, ints, unsigned ints etc. Presumably the bug was elsewhere before match.pd was added, but that code similarly didn't take care about precisely what kind of vectors it was optimizing. FLOAT_EXPR is used for conversion of vector {int,unsigned} to vector float, FIX_TRUNC_EXPR for vector float to vector {int,unsigned} and convert (NOP_EXPR/VIEW_CONVERT_EXPR?) for other conversions.
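As a scalar analogue (not the vector case from the bug), the sketch below shows why "two conversions in a row" cannot be simplified without looking at the kinds of types involved. It assumes a 32-bit int and an IEEE 754 single-precision float; the function names are made up for illustration.

```c
/* int -> float -> int: not an identity once the value needs more than
   the 24 significant bits a single-precision float can hold.  The
   volatile keeps the intermediate value at float precision.  */
int
round_trip_via_float (int x)
{
  volatile float f = (float) x;   /* analogous to FLOAT_EXPR */
  return (int) f;                 /* analogous to FIX_TRUNC_EXPR */
}

/* int -> unsigned -> int: a pair of same-width integer conversions,
   which GCC defines to give back the original value.  */
int
round_trip_via_unsigned (int x)
{
  unsigned int u = (unsigned int) x;
  return (int) u;
}
```

Dropping the intermediate conversion is fine for the second function but changes the result of the first for inputs such as 16777217.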
[Bug tree-optimization/66233] [4.8/4.9/5/6 Regression] internal compiler error: in expand_fix, at optabs.c:5358
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66233 --- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org --- Indeed, in 4.9 this is in tree-ssa-forwprop.c (combine_conversions) and in fold-const.c (fold_unary_loc). Perhaps we need {inter,inside,final}_vec_{int,float,unsignedp} variables too and use them?
Re: [patch, testsuite, ARM] don't try to execute simd.exp tests on targets without NEON
Hi Sandra, On 21/05/15 06:43, Sandra Loosemore wrote: This is another patch aimed at fixing bugs relating to trying to execute NEON code on a target that doesn't support it revealed by my arm-none-eabi testing on a gazillion different multilibs. Inspired by what vect.exp does and my other patch in this group to fix advsimd-intrinsics.exp, I've hacked simd.exp to test for NEON compilation and execution support and use set dg-do-what-default to either compile or run as appropriate, or skip the whole set of tests if neither is present. And, I've removed the explicit dg-do run and arm_neon_ok test (which only tests for compilation support, not execution support) from all the individual test cases. OK to commit? This is ok and there is one less headache with NEON testing :) Thanks, Kyrill -Sandra
[Bug rtl-optimization/66236] New: [6 Regression] FAIL: gcc.c-torture/execute/pr42691.c on alpha-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66236

            Bug ID: 66236
           Summary: [6 Regression] FAIL: gcc.c-torture/execute/pr42691.c on alpha-linux-gnu
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---
            Target: alpha-linux-gnu

FAIL: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions (internal compiler error)
FAIL: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions (test for excess errors)
UNRESOLVED: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions compilation failed to produce executable
FAIL: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-loops (internal compiler error)
FAIL: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-loops (test for excess errors)
UNRESOLVED: gcc.c-torture/execute/pr42691.c -O3 -fomit-frame-pointer -funroll-loops compilation failed to produce executable

Executing on host: /space/uros/gcc-build/gcc/xgcc -B/space/uros/gcc-build/gcc/ /space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O3 -fomit-frame-pointer -funroll-loops -w -lm -o ./pr42691.exe (timeout = 300)
spawn /space/uros/gcc-build/gcc/xgcc -B/space/uros/gcc-build/gcc/ /space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O3 -fomit-frame-pointer -funroll-loops -w -lm -o ./pr42691.exe
/space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c: In function 'add':
/space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c:32:1: error: unrecognizable insn:
(insn 87 86 29 5 (set (subreg:DI (reg:V4HI 90) 0)
        (reg:V4HI 94)) /space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c:19 -1
     (expr_list:REG_DEAD (reg:V4HI 94)
        (nil)))
/space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/gcc.c-torture/execute/pr42691.c:32:1: internal compiler error: in extract_insn, at recog.c:2341
0x1207809c7 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../gcc-svn/trunk/gcc/rtl-error.c:110
0x120780a17 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        ../../gcc-svn/trunk/gcc/rtl-error.c:118
0x120747bcf extract_insn(rtx_insn*)
        ../../gcc-svn/trunk/gcc/recog.c:2341
0x120b99d5f union_match_dups
        ../../gcc-svn/trunk/gcc/web.c:118
0x120b99d5f execute
        ../../gcc-svn/trunk/gcc/web.c:395
Re: [PATCH, CHKP] Fix PR middle-end/66221: lto1: error: type variant has different TYPE_ARG_TYPES
On Thu, May 21, 2015 at 10:38 AM, Ilya Enkovich enkovich@gmail.com wrote:

Hi,

This patch fixes PR66221 by using build_distinct_type_copy instead of copy_node to copy a function type for instrumented function. Bootstrapped and regtested for x86_64-unknown-linux-gnu. Applied to trunk. Is it OK for gcc-5?

Ok.

Thanks,
Richard.

Thanks,
Ilya
--
gcc/

2015-05-21 Ilya Enkovich enkovich@gmail.com

PR middle-end/66221
* ipa-chkp.c (chkp_copy_function_type_adding_bounds): Use build_distinct_type_copy to copy bounds.

gcc/testsuite/

2015-05-21 Ilya Enkovich enkovich@gmail.com

PR middle-end/66221
* gcc.dg/lto/pr66221_0.c: New test.
* gcc.dg/lto/pr66221_1.c: New test.

diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c
index ac5eb35..c710291 100644
--- a/gcc/ipa-chkp.c
+++ b/gcc/ipa-chkp.c
@@ -308,7 +308,7 @@ chkp_copy_function_type_adding_bounds (tree orig_type)
   if (!arg_type)
     return orig_type;
 
-  type = copy_node (orig_type);
+  type = build_distinct_type_copy (orig_type);
   TYPE_ARG_TYPES (type) = copy_list (TYPE_ARG_TYPES (type));
 
   for (arg_type = TYPE_ARG_TYPES (type);
diff --git a/gcc/testsuite/gcc.dg/lto/pr66221_0.c b/gcc/testsuite/gcc.dg/lto/pr66221_0.c
new file mode 100644
index 000..dbb9282
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr66221_0.c
@@ -0,0 +1,10 @@
+/* { dg-lto-do link } */
+/* { dg-require-effective-target mpx } */
+/* { dg-lto-options { { -O2 -flto -fcheck-pointer-bounds -mmpx } } } */
+
+int test1 (const char *);
+
+int main (int argc, const char **argv)
+{
+  return test1 (argv[0]);
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr66221_1.c b/gcc/testsuite/gcc.dg/lto/pr66221_1.c
new file mode 100644
index 000..4c94544
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr66221_1.c
@@ -0,0 +1,4 @@
+int test1 (const char *p)
+{
+  return (int)(*p);
+}
Re: [RFA] Restore combine.c split point for multiply-accumulate instructions
On Thu, May 21, 2015 at 7:38 AM, Jeff Law l...@redhat.com wrote:

find_split_point will tend to favor splitting complex insns in such a way as to encourage multiply-add insns. It does this by splitting an unrecognizable insn at the (plus (mult)).

Now that many MULTs are canonicalized as ASHIFT, that code to prefer the multiply-add is no longer triggering when it could/should. This ultimately results in splitting at the ASHIFT rather than the containing PLUS and thus we generate distinct shift and add insns rather than a single shadd insn on the PA (and probably other architectures).

This patch will treat (plus (ashift)) just like (plus (mult)) which encourages creation of shift-add insns.

This has been bootstrapped and regression tested on x86-unknown-linux-gnu and with an hppa2.0w-hp-hpux11.00 cross compiler on the hppa.exp testsuite (full disclosure -- hppa.exp only has two tests, so it's far from extensive). I've also verified this is one of the changes ultimately necessary to resolve the code generation regressions caused by Venkat's combine.c change on the PA across my 300+ testfiles for a PA cross compiler.

OK for the trunk?

Sounds reasonable.

Thanks,
Richard.

Jeff

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 490386e..250fa0a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2015-05-20 Jeff Law l...@redhat.com
 
+	* combine.c (find_split_point): Handle ASHIFT like MULT to encourage
+	multiply-accumulate/shift-add insn generation.
+
 	* config/pa/pa.c (pa_print_operand): New 'o' output modifier.
 	(pa_mem_shadd_constant_p): Renamed from pa_shadd_constant_p.
 	(pa_shadd_constant_p): Allow constants for shadd insns rather
diff --git a/gcc/combine.c b/gcc/combine.c
index a90849e..ab6de3a 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5145,7 +5163,9 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
 	  /* Split at a multiply-accumulate instruction.  However if this is
 	     the SET_SRC, we likely do not have such an instruction and it's
 	     worthless to try this split.  */
-	  if (!set_src && GET_CODE (XEXP (x, 0)) == MULT)
+	  if (!set_src
+	      && (GET_CODE (XEXP (x, 0)) == MULT
+		  || GET_CODE (XEXP (x, 0)) == ASHIFT))
 	    return loc;
 
 	default:
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index f20a131..bac0973 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,7 @@
 2015-05-20 Jeff Law l...@redhat.com
 
+	* gcc.target/hppa/shadd-2.c: New test.
+
 	* gcc.target/hppa/hppa.exp: New target test driver.
 	* gcc.target/hppa/shadd-1.c: New test.
 
diff --git a/gcc/testsuite/gcc.target/hppa/shadd-2.c b/gcc/testsuite/gcc.target/hppa/shadd-2.c
new file mode 100644
index 000..34708e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/hppa/shadd-2.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "sh.add" 2 } } */
+
+typedef struct rtx_def *rtx;
+typedef const struct rtx_def *const_rtx;
+enum machine_mode
+{
+  VOIDmode, BLKmode, CCmode, CCGCmode, CCGOCmode, CCNOmode, CCAmode, CCCmode,
+    CCOmode, CCSmode, CCZmode, CCFPmode, CCFPUmode, BImode, QImode, HImode,
+    SImode, DImode, TImode, OImode, QQmode, HQmode, SQmode, DQmode, TQmode,
+    UQQmode, UHQmode, USQmode, UDQmode, UTQmode, HAmode, SAmode, DAmode,
+    TAmode, UHAmode, USAmode, UDAmode, UTAmode, SFmode, DFmode, XFmode,
+    TFmode, SDmode, DDmode, TDmode, CQImode, CHImode, CSImode, CDImode,
+    CTImode, COImode, SCmode, DCmode, XCmode, TCmode, V2QImode, V4QImode,
+    V2HImode, V1SImode, V8QImode, V4HImode, V2SImode, V1DImode, V16QImode,
+    V8HImode, V4SImode, V2DImode, V1TImode, V32QImode, V16HImode, V8SImode,
+    V4DImode, V2TImode, V64QImode, V32HImode, V16SImode, V8DImode, V4TImode,
+    V2SFmode, V4SFmode, V2DFmode, V8SFmode, V4DFmode, V2TFmode, V16SFmode,
+    V8DFmode, V4TFmode, MAX_MACHINE_MODE, NUM_MACHINE_MODES = MAX_MACHINE_MODE
+};
+struct rtx_def
+{
+  __extension__ enum machine_mode mode:8;
+};
+struct target_regs
+{
+  unsigned char x_hard_regno_nregs[53][MAX_MACHINE_MODE];
+};
+extern void oof (void);
+extern int rhs_regno (rtx);
+
+extern struct target_regs default_target_regs;
+__inline__ unsigned int
+end_hard_regno (enum machine_mode mode, unsigned int regno)
+{
+  return regno +
+    ((&default_target_regs)->x_hard_regno_nregs)[regno][(int) mode];
+}
+
+void
+note_btr_set (rtx dest, const_rtx set
+	      __attribute__ ((__unused__)), void *data)
+{
+  int regno, end_regno;
+  end_regno = end_hard_regno (((dest)->mode), (rhs_regno (dest)));
+  for (; regno < end_regno; regno++)
+    oof ();
+}
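As a rough source-level illustration of the shape under discussion, the two hypothetical functions below compute a base plus a scaled index. In RTL terms the first corresponds to a (plus (ashift ...) ...) and the second to a (plus (mult ...) ...); per the description above, the multiply form is often canonicalized to the shift form. Whether a particular target then emits a single shift-add instruction depends on the back end, so nothing here is a claim about generated code.

```c
/* base + (index << 2): the shift-and-add form.  Unsigned operands
   keep the shift well-defined for every input.  */
unsigned long
scaled_add_shift (unsigned long base, unsigned long index)
{
  return base + (index << 2);
}

/* The same arithmetic written as a multiply by a power of two.  */
unsigned long
scaled_add_mul (unsigned long base, unsigned long index)
{
  return base + index * 4;
}
```

Array indexing such as p[i] on a 4-byte element type produces the same kind of address arithmetic, which is one reason this split point can matter for ordinary code.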
[Bug target/64208] [4.9 Regression][iwmmxt] ICE: internal compiler error: Max. number of generated reload insns per insn is achieved (90)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64208

Ramana Radhakrishnan ramana at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
                 CC|                            |ramana at gcc dot gnu.org
         Resolution|---                         |FIXED
   Target Milestone|4.9.3                       |6.0

--- Comment #6 from Ramana Radhakrishnan ramana at gcc dot gnu.org ---
Fixed on trunk.
[Bug libgcc/58660] ARM/Thumb non-interworking code broken in libgcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58660

Ramana Radhakrishnan ramana at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-05-21
                 CC|                            |ramana at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Ramana Radhakrishnan ramana at gcc dot gnu.org ---
Confirmed.
[gomp4.1] Taskloop support
Hi! This patch finishes the C #pragma omp taskloop support on the gomp 4.1 branch, including library support. 2015-05-21 Jakub Jelinek ja...@redhat.com * tree.h (OMP_STANDALONE_CLAUSES): Adjust to cover OMP_TARGET_{ENTER,EXIT}_DATA. (OMP_CLAUSE_SHARED_FIRSTPRIVATE): Define. * gimplify.c (gimplify_scan_omp_clauses): Add lastprivate clause to outer taskloop if needed. (gimplify_omp_for): Fix a typo. Fixup OMP_TASKLOOP gimplification. * omp-low.c (omp_copy_decl_2): If var is TREE_ADDRESSABLE listed in task_shared_vars, clear TREE_ADDRESSABLE on the copy. (build_outer_var_ref): Add lastprivate argument, pass it through recursively. Handle lastprivate on taskloop construct. (install_var_field): Allow multiple fields for a single decl - one for firstprivate, another for shared clauses on task. (scan_sharing_clauses): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE. (add_taskreg_looptemp_clauses): Add one more _looptemp_ clause for taskloop GIMPLE_OMP_TASK, if it is collapse 1 with non-constant iteration count and there is lastprivate clause on the inner GIMPLE_OMP_FOR. (finish_taskreg_scan): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE. (lower_rec_input_clauses): Likewise. Ignore all OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE clauses on taskloop construct. (lower_lastprivate_clauses): For OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE on taskloop lookup decl in outer context. Pass true to build_outer_var_ref lastprivate argument. (lower_send_clauses): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE. (lower_send_shared_vars): Ignore fields with NULL or FIELD_DECL abstract origin. (expand_task_call): Use GOMP_TASK_* defines instead of hardcoded integers. (expand_omp_simd): Handle addressable fd-loop.v. (expand_omp_taskloop_for_outer): Initialize the last _looptemp_ with total iteration count if needed. (expand_omp_taskloop_for_inner): Handle bias and broken_loop. (lower_omp_for_lastprivate): Use last _looptemp_ clause on taskloop for comparison. (create_task_copyfn): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE. 
gcc/c-family/ * c-omp.c (c_finish_omp_for): Clear DECL_INITIAL. gcc/testsuite/ * gcc.dg/gomp/taskloop-1.c: New test. include/ * gomp-constants.h (GOMP_TASK_FLAG_UNTIED, GOMP_TASK_FLAG_FINAL, GOMP_TASK_FLAG_MERGEABLE, GOMP_TASK_FLAG_DEPEND, GOMP_TASK_FLAG_UP, GOMP_TASK_FLAG_GRAINSIZE, GOMP_TASK_FLAG_IF, GOMP_TASK_FLAG_NOGROUP): Define. libgomp/ * libgomp.map (GOMP_4.1): Export GOMP_taskloop and GOMP_taskloop_ull. * task.c: Include gomp-constants.h. Include taskloop.c twice with appropriate macros. (GOMP_task): Use GOMP_TASK_FLAG_* defines instead of hardcoded constants. * taskloop.c: New file. * testsuite/libgomp.c/for-4.c: New test. * testsuite/libgomp.c/taskloop-1.c: New test. * testsuite/libgomp.c/taskloop-2.c: New test. * testsuite/libgomp.c/taskloop-3.c: New test. --- gcc/tree.h.jj 2015-05-19 18:56:50.982256719 +0200 +++ gcc/tree.h 2015-05-19 19:04:52.496759752 +0200 @@ -1206,7 +1206,7 @@ extern void protected_set_expr_location /* Generic accessors for OMP nodes that keep clauses as operand 0. */ #define OMP_STANDALONE_CLAUSES(NODE) \ - TREE_OPERAND (TREE_RANGE_CHECK (NODE, OACC_CACHE, OMP_TARGET_UPDATE), 0) + TREE_OPERAND (TREE_RANGE_CHECK (NODE, OACC_CACHE, OMP_TARGET_EXIT_DATA), 0) #define OACC_PARALLEL_BODY(NODE) \ TREE_OPERAND (OACC_PARALLEL_CHECK (NODE), 0) @@ -1366,6 +1366,12 @@ extern void protected_set_expr_location #define OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ(NODE) \ (OMP_CLAUSE_CHECK (NODE))-omp_clause.gimple_reduction_init +/* True on a SHARED clause if a FIRSTPRIVATE clause for the same + decl is present in the chain (this can happen only for taskloop + with FIRSTPRIVATE/LASTPRIVATE on it originally. 
*/ +#define OMP_CLAUSE_SHARED_FIRSTPRIVATE(NODE) \ + (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_SHARED)-base.public_flag) + #define OMP_CLAUSE_FINAL_EXPR(NODE) \ OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_FINAL), 0) #define OMP_CLAUSE_IF_EXPR(NODE) \ --- gcc/gimplify.c.jj 2015-05-19 19:02:52.230632257 +0200 +++ gcc/gimplify.c 2015-05-20 19:07:01.317440243 +0200 @@ -6167,6 +6167,12 @@ gimplify_scan_omp_clauses (tree *list_p, (splay_tree_key) decl) == NULL) omp_add_variable (outer_ctx, decl, GOVD_SHARED | GOVD_SEEN); else if (outer_ctx + (outer_ctx-region_type ORT_TASK) != 0 + outer_ctx-combined_loop + splay_tree_lookup (outer_ctx-variables, +(splay_tree_key) decl) == NULL) +
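For context, a user-level use of the construct this patch implements could look like the sketch below. It uses the taskloop syntax as later published in OpenMP 4.5; the function name, the array size and the grainsize value are arbitrary choices for illustration. Built without -fopenmp the pragmas are ignored and the loop simply runs serially, giving the same result.

```c
#define MAX_N 64

/* Fill an array with squares using a taskloop, then add them up.
   Returns -1 if N does not fit the fixed-size buffer.  */
long
sum_of_squares (int n)
{
  int squares[MAX_N];
  long total = 0;

  if (n < 0 || n > MAX_N)
    return -1;

  /* One thread creates the tasks; the team executes them.  Each
     iteration writes a distinct element, so no synchronization is
     needed inside the loop.  */
#pragma omp parallel
#pragma omp single
#pragma omp taskloop grainsize(8)
  for (int i = 0; i < n; i++)
    squares[i] = i * i;

  /* All tasks have completed by the end of the parallel region.  */
  for (int i = 0; i < n; i++)
    total += squares[i];
  return total;
}
```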
[Bug target/26702] .size is not emitted for BSS variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26702

--- Comment #13 from Ramana Radhakrishnan ramana at gcc dot gnu.org ---
Author: ramana
Date: Thu May 21 09:23:14 2015
New Revision: 223473

URL: https://gcc.gnu.org/viewcvs?rev=223473&root=gcc&view=rev
Log:
Fix PR target/26702

For Kwok Cheung Yeung.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/arm/pr26702.c
[Bug rtl-optimization/66237] New: [6.0 regression] FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66237

            Bug ID: 66237
           Summary: [6.0 regression] FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE (internal compiler error)
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sch...@linux-m68k.org
                CC: miyuki at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64-*-*

$ gcc/xgcc -Bgcc/ ../gcc/testsuite/gcc.dg/tree-prof/pr34999.c -O2 -freorder-blocks-and-partition -fprofile-generate -D_PROFILE_GENERATE -lm -o pr34999.x01
$ ./pr34999.x01
$ gcc/xgcc -Bgcc/ ../gcc/testsuite/gcc.dg/tree-prof/pr34999.c -O2 -freorder-blocks-and-partition -fprofile-use -D_PROFILE_USE -lm -o pr34999.x02
../gcc/testsuite/gcc.dg/tree-prof/pr34999.c: In function ‘main’:
../gcc/testsuite/gcc.dg/tree-prof/pr34999.c:44:1: internal compiler error: in as_a, at is-a.h:192
 }
 ^
0xe72e5f rtx_jump_insn* as_a<rtx_jump_insn*, rtx_insn>(rtx_insn*)
        ../../gcc/is-a.h:192
0xe72e5f fix_crossing_conditional_branches
        ../../gcc/bb-reorder.c:2047
0xe72e5f execute
        ../../gcc/bb-reorder.c:2742

f9a00e9e5f0f056b558f8615e3c030d37923ee72 is the first bad commit
commit f9a00e9e5f0f056b558f8615e3c030d37923ee72
Author: miyuki miyuki@138bc75d-0d04-0410-961f-82ee72b054a4
Date: Wed May 20 19:39:42 2015 +

    Promote types of RTL expressions to more derived ones.
[Bug rtl-optimization/66236] [6 Regression] FAIL: gcc.c-torture/execute/pr42691.c on alpha-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66236

Uroš Bizjak ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org,
                   |                            |thopre01 at gcc dot gnu.org

--- Comment #1 from Uroš Bizjak ubizjak at gmail dot com ---
Caused by r223113. This problem can be triggered by a crosscompiler to alpha-linux-gnu.
[Bug target/66235] [SH] Optimize tst reg,const movrt sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66235 --- Comment #1 from Oleg Endo olegendo at gcc dot gnu.org --- This is actually a special case of PR 65250.
[Bug target/65979] Multiple issues in conftest.c prevent build on sh4-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65979 --- Comment #19 from Oleg Endo olegendo at gcc dot gnu.org --- (In reply to Oleg Endo from comment #18) Yes, that is true. However, because op0, op1, op2 are all arith_reg_dest the peephole will only match if those are GP regs. Each captured insn will only reference a single GP reg, because DImode moves should have been smashed into SImode moves before the peephole2 pass. Thus, I think it should be safe to just force the mode of op0 to SImode. I'll try it out. The following seems to work OK and I'd propose this as a fix for the problem: Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 223416) +++ gcc/config/sh/sh.md (working copy) @@ -14721,7 +14721,11 @@ || REGNO (operands[2]) == REGNO (operands[5])) [(const_int 0)] { - sh_check_add_incdec_notes (emit_move_insn (operands[2], operands[3])); + if (REGNO (operands[1]) == REGNO (operands[2])) + operands[2] = gen_rtx_REG (SImode, REGNO (operands[0])); + + sh_check_add_incdec_notes (emit_insn (gen_rtx_SET (operands[2], +operands[3]))); emit_insn (gen_tstsi_t (operands[2], gen_rtx_REG (SImode, (REGNO (operands[1]); }) @@ -14748,7 +14752,8 @@ || REGNO (operands[2]) == REGNO (operands[5])) [(const_int 0)] { - sh_check_add_incdec_notes (emit_move_insn (operands[2], operands[3])); + sh_check_add_incdec_notes (emit_insn (gen_rtx_SET (operands[2], +operands[3]))); emit_insn (gen_tstsi_t (operands[2], gen_rtx_REG (SImode, (REGNO (operands[1]); }) Could you guys please test this patch? Actually, now it looks quite obvious I think.
[Bug target/65937] FAIL: gcc.target/arm/pr26702.c scan-assembler \\.size[\\t ]+static_foo, 4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65937 Ramana Radhakrishnan ramana at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||ramana at gcc dot gnu.org Resolution|--- |FIXED Target Milestone|--- |6.0 --- Comment #1 from Ramana Radhakrishnan ramana at gcc dot gnu.org --- fixed.
RE: [PATCH, ping 1] Move insns without introducing new temporaries in loop2_invariant
Hello! From: Jeff Law [mailto:l...@redhat.com] Sent: Wednesday, May 13, 2015 4:05 AM OK for the trunk. Thanks for your patience, Thanks. Committed with the added PR rtl-optimization/64616 to both ChangeLog entries. This patch caused PR66236 [1]. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66236 Uros.
[Bug rtl-optimization/66207] Switch alpha to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66207 --- Comment #6 from Uroš Bizjak ubizjak at gmail dot com --- (In reply to Uroš Bizjak from comment #5) (In reply to Uroš Bizjak from comment #4) Native bootstrap with alphaev68-linux-gnu (a BWX architecture) with the patch from Comment #1 succeeded, the testresults are at [1]. Compared to the non-LRA testsuite run, there is only one new test failure in the entire testsuite: No, this failure is not RA related. - PR66236. So, LRA testresults are clean on alphaev68-linux-gnu.
[Bug c/66230] Using optimizations causes program to segfault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230 --- Comment #3 from Markus Trippelsdorf trippels at gcc dot gnu.org --- Another thing you might try is to use: -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations (as per http://gcc.gnu.org/bugs/) and see if the issue goes away, too.
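For readers wondering what kind of code these flags paper over, here is a minimal C sketch. It is not the reporter's program (which is not shown in this thread); the helper names float_bits and add_would_overflow are made up for illustration, and the bit pattern mentioned in the comment assumes an IEEE 754 single-precision float. The first helper avoids a strict-aliasing violation, the second avoids relying on signed wrap-around, which is what -fno-strict-aliasing and -fwrapv would otherwise tolerate.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float is 32 bits wide on the target.  */
static_assert (sizeof (float) == sizeof (uint32_t),
               "sketch assumes a 32-bit float");

/* Hypothetical helper: return the object representation of F.
   Writing *(uint32_t *) &f instead would break the strict-aliasing
   rules and may misbehave at -O2; memcpy is well defined.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Hypothetical helper: report whether A + B would overflow an int,
   without ever performing the overflowing addition.  Code that
   instead tests (a + b < a) relies on wrapping and needs -fwrapv.  */
static int
add_would_overflow (int a, int b)
{
  return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```

If a program stops crashing only when built with the flags above, patterns like the pointer cast or the wrapping test mentioned in the comments are a plausible place to start looking, though that is a heuristic rather than a diagnosis.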
[Bug tree-optimization/66163] [6 Regression] Not working Firefox built with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66163 --- Comment #10 from Martin Liška marxin at gcc dot gnu.org --- Firefox developers just fixed the first half of the problem seen by the null sanitizer: https://bugzilla.mozilla.org/show_bug.cgi?id=1167119. It looks like the issues fixed so far are not sufficient to run Firefox successfully with LTO, so let's wait for the rest to be fixed. Martin
[Bug libstdc++/63345] Multiple undefined behaviors (static_cast) in libstdc++-v3/include/bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63345 --- Comment #7 from Jonathan Wakely redi at gcc dot gnu.org --- Why does your patch need to touch operator* or operator- for any of the iterators? For any dereferenceable iterator the cast should be valid, so if you're seeing invalid casts it suggests that you are dereferencing invalid iterators.
[Bug rtl-optimization/66236] [6 Regression] FAIL: gcc.c-torture/execute/pr42691.c on alpha-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66236 Richard Biener rguenth at gcc dot gnu.org changed: What|Removed |Added Target Milestone|--- |6.0
Re: Add statistics to alias.c
On Thu, 21 May 2015, Jan Hubicka wrote:

Hi, this patch extends statistics from tree-ssa-alias to also cover the TBAA oracle. This is useful to keep track of aliasing effectiveness. For example, the hack in alias.c globbing all pointers into one costs about 20% of all answers on Firefox, i.e. from 15500978 disambiguations/23744267 queries (with the hack removed) to 12932078 disambiguations/27256455 queries. Bootstrapped x86_64-linux, OK?

Ok with the spelling fix and the same_type_for_tbaa hunk. Thanks, Richard.

Honza

* alias.c (alias_stats): New static var.
(alias_sets_conflict_p, alias_sets_must_conflict_p): Update stats.
(dump_alias_stats_in_alias_c): New function.
* alias.h (dump_alias_stats_in_alias_c): Declare.
* tree-ssa-alias.c (dump_alias_stats): Call it.

Index: alias.c
===
--- alias.c (revision 223444)
+++ alias.c (working copy)
@@ -213,6 +213,19 @@ static int write_dependence_p (const_rtx
 static void memory_modified_1 (rtx, const_rtx, void *);
 
+/* Query statistics for the different low-level disambiguators.
+   A high-level query may trigger multiple of them.  */
+
+static struct {
+  unsigned long long num_alias_zero;
+  unsigned long long num_same_alias_set;
+  unsigned long long num_same_objects;
+  unsigned long long num_volatile;
+  unsigned long long num_dag;
+  unsigned long long num_disambiguated;
+} alias_stats;
+
+
 /* Set up all info needed to perform alias analysis on memory references.  */
 
 /* Returns the size in bytes of the mode of X.  */
@@ -471,13 +484,20 @@ alias_sets_conflict_p (alias_set_type se
   ase = get_alias_set_entry (set1);
   if (ase != 0 && ase->children->get (set2))
-    return 1;
+    {
+      ++alias_stats.num_dag;
+      return 1;
+    }
 
   /* Now do the same, but with the alias sets reversed.  */
   ase = get_alias_set_entry (set2);
   if (ase != 0 && ase->children->get (set1))
-    return 1;
+    {
+      ++alias_stats.num_dag;
+      return 1;
+    }
+  ++alias_stats.num_disambiguated;
 
   /* The two alias sets are distinct and neither one is the
      child of the other.  Therefore, they cannot conflict.  */
@@ -489,8 +509,16 @@ alias_sets_conflict_p (alias_set_type se
 int
 alias_sets_must_conflict_p (alias_set_type set1, alias_set_type set2)
 {
-  if (set1 == 0 || set2 == 0 || set1 == set2)
-    return 1;
+  if (set1 == 0 || set2 == 0)
+    {
+      ++alias_stats.num_alias_zero;
+      return 1;
+    }
+  if (set1 == set2)
+    {
+      ++alias_stats.num_same_alias_set;
+      return 1;
+    }
 
   return 0;
 }
@@ -512,10 +540,17 @@ objects_must_conflict_p (tree t1, tree t
     return 0;
 
   /* If they are the same type, they must conflict.  */
-  if (t1 == t2
-      /* Likewise if both are volatile.  */
-      || (t1 != 0 && TYPE_VOLATILE (t1) && t2 != 0 && TYPE_VOLATILE (t2)))
-    return 1;
+  if (t1 == t2)
+    {
+      ++alias_stats.num_same_objects;
+      return 1;
+    }
+  /* Likewise if both are volatile.  */
+  if (t1 != 0 && TYPE_VOLATILE (t1) && t2 != 0 && TYPE_VOLATILE (t2))
+    {
+      ++alias_stats.num_volatile;
+      return 1;
+    }
 
   set1 = t1 ? get_alias_set (t1) : 0;
   set2 = t2 ? get_alias_set (t2) : 0;
@@ -3043,4 +3051,21 @@ end_alias_analysis (void)
   sbitmap_free (reg_known_equiv_p);
 }
 
+void
+dump_alias_stats_in_alias_c (FILE *s)
+{
+  fprintf (s, "TBAA oracle: %llu disambiguations %llu queries\n"
+           "%llu are in alias set 0\n"
+           "%llu queries asked about the same object\n"
+           "%llu quaries asked about the same alias set\n"
+           "%llu access volatile\n"
+           "%llu are dependent in the DAG\n",
+           alias_stats.num_disambiguated,
+           alias_stats.num_alias_zero + alias_stats.num_same_alias_set
+           + alias_stats.num_same_objects + alias_stats.num_volatile
+           + alias_stats.num_dag,
+           alias_stats.num_alias_zero, alias_stats.num_same_alias_set,
+           + alias_stats.num_same_objects, alias_stats.num_volatile,
+           + alias_stats.num_dag);
+}
 #include "gt-alias.h"

Index: alias.h
===
--- alias.h (revision 223444)
+++ alias.h (working copy)
@@ -41,6 +41,7 @@ extern int alias_sets_conflict_p (alias_
 extern int alias_sets_must_conflict_p (alias_set_type, alias_set_type);
 extern int objects_must_conflict_p (tree, tree);
 extern int nonoverlapping_memrefs_p (const_rtx, const_rtx, bool);
+extern void dump_alias_stats_in_alias_c (FILE *s);
 tree reference_alias_ptr_type (tree);
 bool alias_ptr_types_compatible_p (tree, tree);

Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c (revision 223444)
+++ tree-ssa-alias.c
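The counting scheme in the patch above can be tried out in isolation. The following is a rough stand-alone C sketch of the same idea, not GCC code: the names query_stats, sets_must_conflict, total_queries and dump_stats are invented here, alias sets are modelled as plain ints, and, unlike the patch, a single function bumps both the "conflict" and the "disambiguated" counters to keep the example short.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the alias_stats counters in the patch.  */
struct query_stats
{
  unsigned long long num_alias_zero;
  unsigned long long num_same_alias_set;
  unsigned long long num_disambiguated;
};

static struct query_stats stats;

/* Simplified model of a conflict query: alias set 0 conflicts with
   everything, equal sets conflict, anything else is disambiguated.
   Each outcome bumps exactly one counter.  */
static int
sets_must_conflict (int set1, int set2)
{
  if (set1 == 0 || set2 == 0)
    {
      ++stats.num_alias_zero;
      return 1;
    }
  if (set1 == set2)
    {
      ++stats.num_same_alias_set;
      return 1;
    }
  ++stats.num_disambiguated;
  return 0;
}

/* Total number of queries answered so far (a choice made for this
   sketch; the patch prints a differently composed sum).  */
static unsigned long long
total_queries (void)
{
  return stats.num_alias_zero + stats.num_same_alias_set
         + stats.num_disambiguated;
}

/* Print the counters to S, one per line.  */
static void
dump_stats (FILE *s)
{
  assert (s != NULL);
  fprintf (s, "%llu disambiguations, %llu queries\n"
           "%llu are in alias set 0\n"
           "%llu asked about the same alias set\n",
           stats.num_disambiguated, total_queries (),
           stats.num_alias_zero, stats.num_same_alias_set);
}
```

Calling dump_stats (stderr) after a batch of queries gives the ratio of disambiguated to total queries, which is the kind of figure quoted for Firefox in the message.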
Re: Ping ** 0.5 patch, fortran] Inline matmul with conjugate complex numbers
Le 21/05/2015 19:51, Thomas Koenig a écrit : Am 18.05.2015 um 00:05 schrieb Thomas Koenig: this patch extends the inline matmul functionality to conjugate complex numbers. Regression-tested. OK for trunk? OK (with the trivial change in the follow-up e-mail)? I'd like to start extending this to TRANSPOSE(CONJG(A)) :-) Thomas There is little that is specific to conjg (any elemental function would work roughly the same), but anyway, the patch is OK. Mikael
Re: [c++std-parallel-1632] Re: Compilers and RCU readers: Once more unto the breach!
On Thu, May 21, 2015 at 06:17:43PM +0200, Michael Matz wrote:

Hi, On Thu, 21 May 2015, Paul E. McKenney wrote:

The point is -exactly- to codify the current state of affairs.

Ah, I see, so it's not yet about creating a more useful (for compilers, that is) model.

There are several approaches being considered for that as well, but we do need to codify current usage.

char * fancy_assign (char *in) { return in; }
...
char *x, *y;
x = atomic_load_explicit(p, memory_order_consume);
y = fancy_assign (x);
atomic_store_explicit(q, y, memory_order_relaxed);

So, is there, or is there not a dependency carried from x to y in your proposed model (and which rule in your document states so)? Clearly, without any other language the compiler would have to assume that there is (because the equivalent 'y = x' assignment would carry the dependency).

The dependency is not carried, though this is due to the current set of rules not covering atomic loads and stores, which I need to fix.

Okay, so with the current regime(s), the dependency carries ...

Yes, that is the intent.

o Rule 14 says that if a value is part of a dependency chain and is used as the actual parameter of a function call, then the dependency chain extends to the corresponding formal parameter, namely in of fancy_assign().

o Rule 15 says that if a value is part of a dependency chain and is returned from a function, then the dependency chain extends to the returned value in the calling function.

o And you are right. I need to make the first and second rules cover the relaxed atomic operations, or at least atomic loads and stores. Not that this is an issue for existing Linux-kernel code. But given such a change, the new version of rule 2 would extend the dependency chain to cover the atomic_store_explicit().

... (if this detail would be fixed). Okay, that's quite awful ...

If it has to assume this, then the whole model is not going to work very well, as usual with models that assume a certain less-optimal fact (carries-dep is less optimal for code generation purposes than not-carries-dep) unless very specific circumstances say it can be ignored.

Although that is a good general rule of thumb, I do not believe that it applies to this situation, with the exception that I do indeed assume that no one is insane enough to do value-speculation optimizations for non-NULL values on loads from pointers. So what am I missing here?

... because you are then missing that if carries-dep can flow through function calls from arguments to return values by default, the compiler has to assume this in fact always happens when it can't see the function body, or can't analyze it. In effect that's making the whole carries-dep stops at these and those uses a useless exercise because a malicious user (malicious in the sense of abusing the model to show that it's hindering optimizations), i.e. me, can hide all such carries-dep stopping effects inside a function, et voila, the dependency carries through. So for a slightly more simple example:

extern void *foo (void *); // body not available
x = load
y = foo (x);
store (y)

the compiler has to assume that there's a dep-chain from x to y; always.

Yes, the compiler does have to make this assumption. And the intent behind the rules is to ensure that this assumption does not get in the way of reasonable optimizations. So although I am sure that you are as busy as the rest of us, I really do need you to go through the rules in detail before you get too much more excited about this.

What's worse, it also has to assume a carries-dep for this:

extern void foo (void *in, void **out1, void **out2);
x = load
foo (x, o1, o2);
store (o1);
store (o2);

Now the compiler has to assume that the body of 'foo' is just mean enough to make the dep-chain carry from in to *out1 or *out2 (i.e. it has to assume that for both). This extends to _all_ memory accessible from foo's body, i.e. generally all global and all local address-taken variables, so as soon as you have a function call into which a dep-chain value flows you're creating a dep-chain extension from that value to each and every global piece of memory, because the compiler cannot assume that the black box called foo is not mean. This could conceivably be stopped by making normal stores not to carry the dependency; then only the return value might be infected; but I don't see that in your rules, as a normal store is just an assignment in your model and hence rules 1 and 2 apply (that is, carries-dep flows through all assignments, incl. loads and stores). Basically whenever you can construct black boxes for the compiler, you have to limit their effects on such transitive relations like carries-dep by default,
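The fancy_assign example from this exchange can be written out as a compilable C11 fragment. This is only a sketch of the pattern being debated, under the following assumptions: the names node, shared, pass_through, publish and read_value are invented here, the publisher uses a release store (the thread does not show one), and the sketch takes no position on whether a given compiler tracks the dependency or simply strengthens memory_order_consume to acquire.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node
{
  int value;
};

/* Hypothetical published pointer; starts out NULL.  */
static _Atomic (struct node *) shared;

/* Plays the role of fancy_assign: the argument flows straight to the
   return value.  Under the rules discussed above, a dependency chain
   entering through IN would extend to the returned pointer.  */
static struct node *
pass_through (struct node *in)
{
  return in;
}

/* Publisher side: make N visible to readers.  */
static void
publish (struct node *n)
{
  assert (n != NULL);
  atomic_store_explicit (&shared, n, memory_order_release);
}

/* Reader side: the load heads the dependency chain, and the
   dereference of Q is the use that is meant to be ordered after it.
   Returns -1 if nothing has been published yet.  */
static int
read_value (void)
{
  struct node *p = atomic_load_explicit (&shared, memory_order_consume);
  struct node *q = pass_through (p);
  return q != NULL ? q->value : -1;
}
```

If pass_through lived in another translation unit, the compiler would see only its prototype, which is exactly the black-box situation the objection above is about.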
Re: [AArch64][TLSLE][4/N] Recognize -mtls-size
Jiong Wang writes: This patch adds the -mtls-size option for AArch64. This option gives the user finer control over code generation for the various TLS models on AArch64. For example, for TLS LE, the user can specify a smaller tls-size, such as 4K, which is quite usual, to let the AArch64 backend generate more efficient instruction sequences. Currently, -mtls-size accepts any integer and then translates it into 12 (4K), 24 (16M), 32 (4G) or 48 (256TB) based on the value. No functional change. OK for trunk? 2015-05-20 Jiong Wang jiong.w...@arm.com gcc/ * config/aarch64/aarch64.opt (mtls-size): New entry. * config/aarch64/aarch64.c (initialize_aarch64_tls_size): New function. * doc/invoke.texi (AArch64 Options): Document -mtls-size. Rename summary from 5/N to 4/N. The fourth patch was a binutils patch at: https://sourceware.org/ml/binutils/2015-05/msg00181.html -- Regards, Jiong
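As a rough illustration of the translation described above, the C sketch below rounds a requested size up to one of the four supported values. It is not the code from the patch (initialize_aarch64_tls_size is not shown in this message); the function names and the exact rounding thresholds are assumptions made for the example, as is the choice to clamp anything above 32 to 48.

```c
#include <assert.h>

/* Hypothetical helper: map a requested TLS size in bits onto one of
   the values named in the message (12, 24, 32 or 48).  The thresholds
   are an assumption of this sketch.  */
static int
round_tls_size_bits (int requested)
{
  assert (requested > 0);
  if (requested <= 12)
    return 12;
  if (requested <= 24)
    return 24;
  if (requested <= 32)
    return 32;
  return 48;
}

/* Number of bytes addressable with an offset of BITS bits:
   12 -> 4K, 24 -> 16M, 32 -> 4G, 48 -> 256T.  */
static unsigned long long
tls_addressable_bytes (int bits)
{
  return 1ULL << bits;
}
```

The smaller the offset, the fewer instructions are needed to materialise it, which is the motivation given for letting the user choose.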
Re: [RFC] Combine related fail of gcc.target/powerpc/ti_math1.c
On 05/21/2015 11:44 AM, Segher Boessenkool wrote: On Thu, May 21, 2015 at 11:34:14AM -0700, Richard Henderson wrote: Actually, I believe that the way CA is modeled at the moment is dangerous. It's not a 64-bit value, but a 1-bit value. It's a fixed register and it is only ever set to 0 or 1. There are more targets that do such things, and it is safe. Old Cygnus proverb: Lie to the compiler and it will always bite you in the end. I've tried with BImode before, with two effects: 1) the patterns become much more unmanageable; and 2) the optimisers do a lousy job on it. BImode isn't so well supported. Really? Zero-extending from BImode should be no different than from SImode, and we handle that all the time. Let's wait for Alan's patch that makes combine not reorder things unnecessarily, that should take care of it all as far as I see. I remain skeptical, but I'm also willing to let someone else worry about it. ;-) r~
Ping ** 0.5 patch, fortran] Inline matmul with conjugate complex numbers
Am 18.05.2015 um 00:05 schrieb Thomas Koenig: this patch extends the inline matmul functionality to conjugate complex numbers. Regression-tested. OK for trunk? OK (with the trivial change in the follow-up e-mail)? I'd like to start extending this to TRANSPOSE(CONJG(A)) :-) Thomas
Re: C/C++ PATCH to allow deprecating enum values (PR c/47043)
On 05/07/2015 12:22 PM, Marek Polacek wrote: - mark_used (decl); + mark_used (decl, 0); This should use tf_none rather than 0. + build_enumerator (DECL_NAME (decl), value, newtag, + DECL_ATTRIBUTES (decl), DECL_SOURCE_LOCATION (decl)); This is assuming that enumerators can't have dependent attributes. I guess that's currently true, but please add a comment about it. OK with those changes. Jason
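For context, the feature under review lets an individual enumerator carry the deprecated attribute. The C sketch below shows what that looks like from the user's side; the enum and function names are invented for the example, and it assumes a compiler that accepts attributes on enumerators (GCC with this patch applied, or a similarly capable compiler).

```c
#include <assert.h>

enum log_level
{
  LOG_QUIET,
  LOG_CHATTY __attribute__ ((deprecated)), /* use LOG_VERBOSE instead */
  LOG_VERBOSE
};

/* Marking an enumerator deprecated does not renumber its siblings.  */
static_assert (LOG_VERBOSE == 2, "enumerator values are unchanged");

/* Hypothetical helper that only names the non-deprecated values, so
   it compiles without a deprecation warning.  Writing
   'case LOG_CHATTY:' here is the kind of use that is expected to
   trigger a -Wdeprecated-declarations warning.  */
static const char *
level_name (enum log_level level)
{
  switch (level)
    {
    case LOG_QUIET:
      return "quiet";
    case LOG_VERBOSE:
      return "verbose";
    default:
      return "other";
    }
}
```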
Re: [PATCH] PR target/66232: -fPIC -fno-plt -mx32 fails to generate indirect branch via GOT
On Thu, May 21, 2015 at 8:36 PM, H.J. Lu hjl.to...@gmail.com wrote: I am testing this now. I will check it in if there is no regression. OK. Maybe just : ;; Return true if OP is a GOT memory operand. (define_predicate GOT_memory_operand (match_operand 0 memory_operand) { return CONSTANT_P ... GET_CODE (...) == UNSPEC XINT (...) } No point in using match_test in the above pattern. Uros.
Re: [PATCH] Fix memory orders description in atomic ops built-ins docs.
On Thu, 2015-05-21 at 16:45 +0100, Matthew Wahab wrote: On 19/05/15 20:20, Torvald Riegel wrote: On Mon, 2015-05-18 at 17:36 +0100, Matthew Wahab wrote: Hello, On 15/05/15 17:22, Torvald Riegel wrote:

This patch improves the documentation of the built-ins for atomic operations.

The "memory model" to "memory order" change does improve things but I think that the patch has some problems. As it is now, it makes some of the descriptions quite difficult to understand and seems to assume more familiarity with details of the C++11 specification than might be expected.

I'd say that's a side effect of the C++11 memory model being the reference specification of the built-ins.

Generally, the memory order descriptions seem to be targeted towards language designers but don't provide for anybody trying to understand how to implement or to use the built-ins.

I agree that the current descriptions aren't a tutorial on the C++11 memory model. However, given that the model is not GCC-specific, we aren't really in a need to provide a tutorial, in the same way that we don't provide a C++ tutorial. Users can pick the C++11 memory model educational material of their choice, and we need to document what's missing to apply the C++11 knowledge to the built-ins we provide.

We seem to have different views about the purpose of the manual page. I'm treating it as a description of the built-in functions provided by gcc to generate the code needed to implement the C++11 model. That is, the built-ins are distinct from C++11 and their descriptions should be, as far as possible, independent of the methods used in the C++11 specification to describe the C++11 memory model.

OK. But we'd need a *precise* specification of what they do if we'd want to make them separate from the C++11 memory model. And we don't have that, would you agree? It's also not a trivial task, so I wouldn't be optimistic that someone would offer to write such a specification, and have it cross-checked.

I understand of course that the __atomics were added in order to support C++11 but that doesn't make them part of C++11 and, since __atomic functions can be made available when C11/C++11 may not be, it seems to make sense to try for stand-alone descriptions.

The compiler can very well provide the C++11 *memory model* without creating any dependency on the other language/library pieces of C++11 or C11. Prior to C++11, multi-threaded executions were not defined by the standard, so we're not conflicting with anything in prior language standards, right? Another way to see this is to say that we just *copy* the C++11 memory model and use it as the memory model that specifies the behavior of the atomic built-ins. That additionally frees us from having to come up with and maintain our GCC-specific specification of atomics and a memory model.

I'm also concerned that the patch, by describing things in terms of formal C++11 concepts, makes it more difficult for people to know what the built-ins can be expected to do and so makes the built-ins more difficult to use. There is a danger that rather than take a risk with uncertainty about the behaviour of the __atomics, people will fall back to the __sync functions simply because their expected behaviour is easier to work out.

I hadn't thought about that possible danger, but that would be right. The way I would prefer to counter that is that we add a big fat warning to the __sync built-ins that we don't have a precise specification for them and that there are several corners of hand-waving and potentially further issues, and that this is another reason to prefer the __atomic built-ins. PR 65697 etc. are enough indication for me that we indeed lack a proper specification.

I don't think that linking to external sites will help either, unless people already want to know C++11. Anybody who just wants to (e.g.) add a memory barrier will take one look at the __sync manual page and use the closest match from there instead.

Well, "just wants to add a memory barrier" is the start of the problem. In the same way that one needs to understand a hardware memory model to pick the right HW instruction(s), one needs to understand a programming language memory model to pick a fence and understand its semantics.

Note that none of this requires a tutorial of any kind. I'm just suggesting that the manual should describe what behaviour should be expected of the code generated for the functions. For the memory orders, that would mean describing what constraints need to be met by the generated code.

I'd bet that if one describes these constraints correctly, you'll get a large document -- even if one removes any introductory or explanatory parts that could make it a tutorial. It's fairly straight-forward to describe several simple usage patterns of the atomics (e.g., seq-cst ones, simple acquire/release
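One of the simple acquire/release usage patterns alluded to at the end of this message can be sketched in C with the __atomic built-ins. This is an illustration only, not text proposed for the manual: the variable and function names are invented, and the example is exercised from a single thread here, so it shows the calling convention of __atomic_store_n and __atomic_load_n rather than demonstrating any ordering guarantee.

```c
#include <assert.h>

static int payload; /* ordinary data, written before the flag */
static int ready;   /* flag, accessed only through __atomic built-ins */

/* Producer: write the data, then set the flag with release ordering
   so that the write to PAYLOAD cannot be reordered after it.  */
static void
producer_publish (int value)
{
  payload = value;
  __atomic_store_n (&ready, 1, __ATOMIC_RELEASE);
}

/* Consumer: if the flag is observed set with acquire ordering, the
   read of PAYLOAD sees the value written before the release store.
   Returns 1 and fills *OUT on success, 0 if nothing is published.  */
static int
consumer_try_read (int *out)
{
  assert (out != 0);
  if (!__atomic_load_n (&ready, __ATOMIC_ACQUIRE))
    return 0;
  *out = payload;
  return 1;
}
```

In a real program the two functions would run in different threads; the pairing of the release store with the acquire load is what makes the plain access to payload safe.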
Re: [PATCH] PR target/66232: -fPIC -fno-plt -mx32 fails to generate indirect branch via GOT
On Thu, May 21, 2015 at 11:41 AM, Richard Henderson r...@redhat.com wrote: On 05/21/2015 05:59 AM, H.J. Lu wrote:

+(define_predicate "x32_sibcall_memory_operand"
+  (and (match_operand 0 "memory_operand")
+       (match_test "CONSTANT_P (XEXP (op, 0))")
+       (match_test "GET_CODE (XEXP (XEXP (op, 0), 0)) == UNSPEC")
+       (match_test "XINT (XEXP (XEXP (op, 0), 0), 1) == UNSPEC_GOTPCREL")))

CONSTANT_P doesn't do what you think it does. That accepts all constants, not the CONST rtx code, which is the only thing you want to be looking into its XEXP.

Hope this is the final one :-). Thanks.

-- H.J.
---
From 66ca08e9fb7456d7be7d48825f2a1c40b777f657 Mon Sep 17 00:00:00 2001
From: H.J. Lu hjl.to...@gmail.com
Date: Thu, 21 May 2015 05:50:14 -0700
Subject: [PATCH] Allow indirect branch via GOT slot for x32

X32 doesn't support indirect branch via 32-bit memory slot since indirect branch will load 64-bit address from 64-bit memory slot. Since x32 GOT slot is 64-bit, we should allow indirect branch via GOT slot for x32.

gcc/
PR target/66232
* config/i386/constraints.md (Bg): New constraint for GOT memory operand.
* config/i386/i386.md (*call_got_x32): New pattern.
(*call_value_got_x32): Likewise.
* config/i386/predicates.md (GOT_memory_operand): New predicate.

gcc/testsuite/
PR target/66232
* gcc.target/i386/pr66232-1.c: New test.
* gcc.target/i386/pr66232-2.c: Likewise.
* gcc.target/i386/pr66232-3.c: Likewise.
* gcc.target/i386/pr66232-4.c: Likewise.
* gcc.target/i386/pr66232-5.c: Likewise.
---
 gcc/config/i386/constraints.md            |  5 +
 gcc/config/i386/i386.md                   | 20
 gcc/config/i386/predicates.md             | 10 ++
 gcc/testsuite/gcc.target/i386/pr66232-1.c | 13 +
 gcc/testsuite/gcc.target/i386/pr66232-2.c | 14 ++
 gcc/testsuite/gcc.target/i386/pr66232-3.c | 13 +
 gcc/testsuite/gcc.target/i386/pr66232-4.c | 13 +
 gcc/testsuite/gcc.target/i386/pr66232-5.c | 16
 8 files changed, 104 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-5.c

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 2271bd1..c718bc1 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -146,10 +146,15 @@
  "@internal Lower SSE register when avoiding REX prefix and all SSE registers otherwise.")
 
 ;; We use the B prefix to denote any number of internal operands:
+;;  g  GOT memory operand.
 ;;  s  Sibcall memory operand, not valid for TARGET_X32
 ;;  w  Call memory operand, not valid for TARGET_X32
 ;;  z  Constant call address operand.
 
+(define_constraint "Bg"
+  "@internal GOT memory operand."
+  (match_operand 0 "GOT_memory_operand"))
+
 (define_constraint "Bs"
   "@internal Sibcall memory operand."
   (and (not (match_test "TARGET_X32"))
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index aefca43..3819dfd 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -11659,6 +11659,15 @@
   "* return ix86_output_call_insn (insn, operands[0]);"
   [(set_attr "type" "call")])
 
+;; This covers both call and sibcall since only GOT slot is allowed.
+(define_insn "*call_got_x32"
+  [(call (mem:QI (zero_extend:DI
+                   (match_operand:SI 0 "GOT_memory_operand" "Bg")))
+         (match_operand 1))]
+  "TARGET_X32"
+  "* return ix86_output_call_insn (insn, operands[0]);"
+  [(set_attr "type" "call")])
+
 (define_insn "*sibcall"
   [(call (mem:QI (match_operand:W 0 "sibcall_insn_operand" "UBsBz"))
          (match_operand 1))]
@@ -11825,6 +11834,17 @@
   "* return ix86_output_call_insn (insn, operands[1]);"
   [(set_attr "type" "callv")])
 
+;; This covers both call and sibcall since only GOT slot is allowed.
+(define_insn "*call_value_got_x32"
+  [(set (match_operand 0)
+        (call (mem:QI
+                (zero_extend:DI
+                  (match_operand:SI 1 "GOT_memory_operand" "Bg")))
+              (match_operand 2)))]
+  "TARGET_X32"
+  "* return ix86_output_call_insn (insn, operands[1]);"
+  [(set_attr "type" "callv")])
+
 (define_insn "*sibcall_value"
   [(set (match_operand 0)
         (call (mem:QI (match_operand:W 1 "sibcall_insn_operand" "UBsBz"))
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 26dd3e1..6d6c6c4 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -606,6 +606,16 @@
        (and (not (match_test "TARGET_X32"))
             (match_operand 0 "sibcall_memory_operand"))))
 
+;; Return true if OP is a GOT memory operand.
+(define_predicate "GOT_memory_operand"
+  (match_operand 0 "memory_operand")
+{
+  op = XEXP (op, 0);
+  return (GET_CODE (op) == CONST
+          && GET_CODE (XEXP (op, 0)) == UNSPEC
+          && XINT (XEXP (op, 0), 1) == UNSPEC_GOTPCREL);
+})
+
 ;; Match exactly zero.
 (define_predicate "const0_operand"
[Bug fortran/66176] Handle conjg() in inline matmul
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66176 --- Comment #1 from Thomas Koenig tkoenig at gcc dot gnu.org --- Author: tkoenig Date: Thu May 21 19:00:45 2015 New Revision: 223499 URL: https://gcc.gnu.org/viewcvs?rev=223499&root=gcc&view=rev Log: 2015-05-21 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/66176 * frontend-passes.c (check_conjg_variable): New function. (inline_matmul_assign): Use it to keep track of conjugated variables. 2015-05-21 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/66176 * gfortran.dg/inline_matmul_11.f90: New test Added: trunk/gcc/testsuite/gfortran.dg/inline_matmul_11.f90 Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/frontend-passes.c trunk/gcc/testsuite/ChangeLog
[Bug middle-end/66241] [6 regression] [ARM] ICE: verify_type failed while building libstdc++ (dwarfout.c: gen_type_die_with_usage())
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66241 --- Comment #3 from Vidya Praveen vp at gcc dot gnu.org --- And this change seems to be the cause: Author: hubicka Date: Sat May 16 20:51:50 2015 New Revision: 223252 URL: https://gcc.gnu.org/viewcvs?rev=223252&root=gcc&view=rev Log: * tree.c (verify_type_variant): Verify tree_base and type_common flags. (verify_type): Verify STRING_FLAG. Modified: trunk/gcc/ChangeLog trunk/gcc/tree.c
Calculate TYPE_CANONICAL only for types that can be accessed in memory
Hi, this is the next part of the series. It disables canonical type calculation for incomplete types, with the exception of arrays, based on the claim that we do not have a good notion of those. I can bootstrap this with additional checks in alias.c that canonical types are always present with LTO, but I need a fix to ICF, which compares alias sets of types it does not need to and otherwise trips on incomplete types. I will push out these fixes separately and incrementally add the fix. The purpose of those checks is to avoid alias.c degenerating to the structural equality path for no good reason.

I tried the alternative of disabling it on ARRAY_TYPEs too and avoiding recursion into those for fields. This does not fly because we can have ARRAY_REFs of incomplete types:

array_ref 0x769c2968 type pointer_type 0x76942150 type integer_type 0x76cd50a8 char readonly string-flag QI size integer_cst 0x76ad7ca8 constant 8 unit size integer_cst 0x76ad7cc0 constant 1 align 8 symtab -158253232 alias set 0 canonical type 0x76adb498 precision 8 min integer_cst 0x76cd2090 -128 max integer_cst 0x76cd2078 127 pointer_to_this pointer_type 0x76cd5150 readonly unsigned DI size integer_cst 0x76ad7bb8 constant 64 unit size integer_cst 0x76ad7bd0 constant 8 align 64 symtab 0 alias set -1 canonical type 0x76af37e0 pointer_to_this pointer_type 0x769479d8 readonly arg 0 mem_ref 0x769d1028 type array_type 0x7694b348 type pointer_type 0x76942150 BLK align 64 symtab 0 alias set -1 structural equality pointer_to_this pointer_type 0x7694b3f0 arg 0 addr_expr 0x769ce380 type pointer_type 0x7694b3f0 constant arg 0 var_decl 0x76941480 reg_note_name arg 1 integer_cst 0x769b6678 constant 0 arg 1 ssa_name 0x769c5630 type integer_type 0x76adb690 int asm_written public SI size integer_cst 0x76ad7df8 constant 32 unit size integer_cst 0x76ad7e10 constant 4 align 32 symtab -158421968 alias set 3 canonical type 0x76adb690 precision 32 min integer_cst 0x76ad7db0 -2147483648 max integer_cst 0x76ad7dc8 2147483647 pointer_to_this pointer_type 0x76af37e0 reference_to_this reference_type 0x76942b28 visited def_stmt _103 = (int) _101; version 103 ptr-info 0x769f04a0 ../../gcc/print-rtl.c:173:4

and we compute the alias set for it via:

#0 internal_error (gmsgid=0x1b86c8f in %s, at %s:%d) at ../../gcc/diagnostic.c:1271
#1 0x015e2416 in fancy_abort (file=0x167ea2a ../../gcc/alias.c, line=823, function=0x167f7d6 get_alias_set(tree_node*)::__FUNCTION__ get_alias_set) at ../../gcc/diagnostic.c:1341
#2 0x007109b9 in get_alias_set (t=0x7694b2a0) at ../../gcc/alias.c:823
#3 0x0070fecf in component_uses_parent_alias_set_from (t=0x769c2968) at ../../gcc/alias.c:607
#4 0x00710497 in reference_alias_ptr_type_1 (t=0x7fffe068) at ../../gcc/alias.c:719
#5 0x007107e8 in get_alias_set (t=0x769c2968) at ../../gcc/alias.c:799
#6 0x00ebca97 in vn_reference_lookup (op=0x769c2968, vuse=0x769ca798, kind=VN_WALKREWRITE, vnresult=0x0) at ../../gcc/tree-ssa-sccvn.c:2217
#7 0x00ebea99 in visit_reference_op_load (lhs=0x769c5678, op=0x769c2968, stmt=0x769cf730) at ../../gcc/tree-ssa-sccvn.c:3030
#8 0x00ec05ec in visit_use (use=0x769c5678) at ../../gcc/tree-ssa-sccvn.c:3685
#9 0x00ec1047 in process_scc (scc=...) at ../../gcc/tree-ssa-sccvn.c:3927
#10 0x00ec1679 in extract_and_process_scc_for_name (name=0x769c5678) at ../../gcc/tree-ssa-sccvn.c:4013
#11 0x00ec1848 in DFS (name=0x769c5678) at ../../gcc/tree-ssa-sccvn.c:4065
#12 0x00ec26d1 in cond_dom_walker::before_dom_children (this=0x7fffe5a0, bb=0x769b9888) at ../../gcc/tree-ssa-sccvn.c:4345
#13 0x014c05c0 in dom_walker::walk (this=0x7fffe5a0, bb=0x769b9888) at ../../gcc/domwalk.c:188
#14 0x00ec2b0e in run_scc_vn (default_vn_walk_kind_=VN_WALKREWRITE) at ../../gcc/tree-ssa-sccvn.c:4436
#15 0x00e98d59 in (anonymous namespace)::pass_fre::execute (this=0x1f621b0, fun=0x7698db28) at ../../gcc/tree-ssa-pre.c:4972
#16 0x00bb6c8f in execute_one_pass (pass=0x1f621b0) at ../../gcc/passes.c:2317
#17 0x00bb6ede in execute_pass_list_1 (pass=0x1f621b0) at ../../gcc/passes.c:2370
#18 0x00bb6f0f in execute_pass_list_1 (pass=0x1f61d90) at ../../gcc/passes.c:2371
#19 0x00bb6f51 in execute_pass_list (fn=0x7698db28, pass=0x1f61cd0) at ../../gcc/passes.c:2381
#20 0x007bb3f6 in cgraph_node::expand (this=0x7695b000) at ../../gcc/cgraphunit.c:1895
#21 0x007bba15 in expand_all_functions () at
Re: [patch] testsuite enable PIE tests on FreeBSD
On 20.05.15 22:30, Jeff Law wrote: On 05/20/2015 11:04 AM, Andreas Tobler wrote:

Hi, the attached patch enables some PIE tests on FreeBSD. Ok for trunk? Thanks, Andreas

2015-05-20 Andreas Tobler andre...@gcc.gnu.org
* gcc.target/i386/pr32219-1.c: Enable test on FreeBSD.
* gcc.target/i386/pr32219-2.c: Likewise.
* gcc.target/i386/pr32219-3.c: Likewise.
* gcc.target/i386/pr32219-4.c: Likewise.
* gcc.target/i386/pr32219-5.c: Likewise.
* gcc.target/i386/pr32219-6.c: Likewise
* gcc.target/i386/pr32219-7.c: Likewise.
* gcc.target/i386/pr32219-8.c: Likewise.
* gcc.target/i386/pr39013-1.c: Likewise.
* gcc.target/i386/pr39013-2.c: Likewise.
* gcc.target/i386/pr64317.c: Likewise.

Wouldn't it be better to remove the target selector and instead add: /* { dg-require-effective-target pie } */ In each of those tests? While the net effect is the same today, it means there's only one place to change if another x86 target gains PIE support in the future. Pre-approved using that style.

Thanks! Tested on amd64-freebsd and CentOS. Andreas

This is what I committed:

2015-05-21 Andreas Tobler andre...@gcc.gnu.org
* gcc.target/i386/pr32219-1.c: Use 'dg-require-effective-target pie' instead of listing several targets on its own.
* gcc.target/i386/pr32219-2.c: Likewise.
* gcc.target/i386/pr32219-3.c: Likewise.
* gcc.target/i386/pr32219-4.c: Likewise.
* gcc.target/i386/pr32219-5.c: Likewise.
* gcc.target/i386/pr32219-6.c: Likewise
* gcc.target/i386/pr32219-7.c: Likewise.
* gcc.target/i386/pr32219-8.c: Likewise.
* gcc.target/i386/pr39013-1.c: Likewise.
* gcc.target/i386/pr39013-2.c: Likewise.
* gcc.target/i386/pr64317.c: Likewise.

Index: pr32219-1.c
===
--- pr32219-1.c (revision 223448)
+++ pr32219-1.c (working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target *-*-linux* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /* { dg-options "-O2 -fpie" } */
 
 /* Initialized common symbol with -fpie.  */
Index: pr32219-2.c
===
--- pr32219-2.c (revision 223448)
+++ pr32219-2.c (working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target *-*-linux* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /* { dg-options "-O2 -fpic" } */
 
 /* Common symbol with -fpic.  */
Index: pr32219-3.c
===
--- pr32219-3.c (revision 223448)
+++ pr32219-3.c (working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target *-*-linux* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /* { dg-options "-O2 -fpie" } */
 
 /* Weak common symbol with -fpie.  */
Index: pr32219-4.c
===
--- pr32219-4.c (revision 223448)
+++ pr32219-4.c (working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target *-*-linux* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /* { dg-options "-O2 -fpic" } */
 
 /* Weak common symbol with -fpic.  */
Index: pr32219-5.c
===
--- pr32219-5.c (revision 223448)
+++ pr32219-5.c (working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target *-*-linux* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /* { dg-options "-O2 -fpie" } */
 
 /* Initialized symbol with -fpie.  */
Index: pr32219-6.c
===
--- pr32219-6.c (revision 223448)
+++ pr32219-6.c (working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target *-*-linux* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /* { dg-options "-O2 -fpic" } */
 
 /* Initialized symbol with -fpic.  */
Index: pr32219-7.c
===
--- pr32219-7.c (revision 223448)
+++ pr32219-7.c (working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target *-*-linux* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /* { dg-options "-O2 -fpie" } */
 
 /* Weak initialized symbol with -fpie.  */
Index: pr32219-8.c
===
--- pr32219-8.c (revision 223448)
+++ pr32219-8.c (working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target *-*-linux* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /* { dg-options "-O2 -fpic" } */
 
 /* Weak initialized symbol with -fpic.  */
Index: pr39013-1.c
===
--- pr39013-1.c (revision 223448)
+++ pr39013-1.c (working copy)
@@ -1,5 +1,6 @@
 /* PR target/39013 */
-/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
+/* { dg-do compile } */
+/* { dg-require-effective-target pie } */
 /*
Re: [RFC] Combine related fail of gcc.target/powerpc/ti_math1.c
On 05/21/2015 05:39 AM, Segher Boessenkool wrote: Trying 18, 9 - 24: Failed to match this instruction: (set (reg:DI 4 4 [+8 ]) (plus:DI (plus:DI (reg:DI 5 5 [ val+8 ]) (reg:DI 76 ca)) (reg:DI 169 [+8 ]))) For some reason it has the CA reg not last. I think we should add to the canonicalisation rules so that fixed regs sort after other regs. That requires a lot of testing. Actually, I believe that the way CA is modeled at the moment is dangerous. It's not a 64-bit value, but a 1-bit value. If we rearrange the expanded rtl to be (zero_extend:DI (reg:BI CA)), then normal canonicalization rules will apply and it'll always appear first in the chain of PLUS. r~
Re: [PATCH] PR target/66232: -fPIC -fno-plt -mx32 fails to generate indirect branch via GOT
On 05/21/2015 05:59 AM, H.J. Lu wrote:
+(define_predicate "x32_sibcall_memory_operand"
+  (and (match_operand 0 "memory_operand")
+       (match_test "CONSTANT_P (XEXP (op, 0))")
+       (match_test "GET_CODE (XEXP (XEXP (op, 0), 0)) == UNSPEC")
+       (match_test "XINT (XEXP (XEXP (op, 0), 0), 1) == UNSPEC_GOTPCREL")))
CONSTANT_P doesn't do what you think it does. It accepts all constants, not just the CONST rtx code, and CONST is the only code whose XEXP you want to look into. r~
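To make the review comment above concrete, here is a minimal sketch in plain C. It does not use GCC's real rtx types: the toy_* names, the reduced set of codes, and the value of TOY_UNSPEC_GOTPCREL are all hypothetical stand-ins invented for illustration. It only shows why a "is any kind of constant" test is too weak a guard before looking inside the address for an UNSPEC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for rtx codes.  */
enum toy_code
{
  TOY_CONST_INT, TOY_SYMBOL_REF, TOY_LABEL_REF, TOY_CONST,
  TOY_UNSPEC, TOY_REG
};

/* Arbitrary number chosen for this sketch only.  */
#define TOY_UNSPEC_GOTPCREL 1

struct toy_rtx
{
  enum toy_code code;
  const struct toy_rtx *op0; /* first operand, or NULL if none */
  int unspec_num;            /* meaningful only for TOY_UNSPEC */
};

/* In the spirit of CONSTANT_P: true for every kind of constant,
   including ones that have no operand to inspect.  */
bool
toy_constant_p (const struct toy_rtx *x)
{
  assert (x != NULL);
  switch (x->code)
    {
    case TOY_CONST_INT:
    case TOY_SYMBOL_REF:
    case TOY_LABEL_REF:
    case TOY_CONST:
      return true;
    default:
      return false;
    }
}

/* Stricter test: the address must be a CONST wrapper, and only then
   is its first operand examined for the GOTPCREL unspec.  */
bool
toy_gotpcrel_address_p (const struct toy_rtx *addr)
{
  assert (addr != NULL);
  return addr->code == TOY_CONST
         && addr->op0 != NULL
         && addr->op0->code == TOY_UNSPEC
         && addr->op0->unspec_num == TOY_UNSPEC_GOTPCREL;
}
```

In this toy model a bare symbol reference passes toy_constant_p but has no operand to look into, which is the situation the stricter check rules out before dereferencing anything.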
Re: [nvptx] Re: Mostly rewrite genrecog
On 05/21/2015 09:12 AM, Thomas Schwinge wrote: OK to commit? gcc/ * config/nvptx/nvptx.md (allocate_stack): Rename to... (allocate_stack_<mode>): ... this, and add :P on both match_operand and unspec. (allocate_stack): New expander. If you really want to. It doesn't work yet in ptxas so it's a little pointless to spend effort on it. Bernd
[gomp4] Vector-single predication
This uses the patch I committed yesterday which introduces warp broadcasts to implement the vector-single predication needed for OpenACC. Outside a loop with vector parallelism, only one of the threads representing a vector must execute, the others follow along. So we skip the real work in each basic block for the inactive threads, then broadcast the direction to take in the control flow graph from the active one, and jump as a group. This will get extended with similar functionality for worker-single. Julian is working on some patches on top of that to ensure the later optimizers don't destroy the control flow - we really need the threads to reconverge and perform the broadcast/jump in lockstep. Committed on gomp-4_0-branch. Bernd Index: gcc/ChangeLog.gomp === --- gcc/ChangeLog.gomp (revision 223444) +++ gcc/ChangeLog.gomp (working copy) @@ -1,5 +1,15 @@ 2015-05-20 Bernd Schmidt ber...@codesourcery.com + * omp-low.c (struct omp_region): Add a gwv_this field. + (bb_region_map): New variable. + (find_omp_for_region_data, find_omp_target_region_data): New static + functions. + (build_omp_regions_1): Call them. Build the bb_region_map. + (enclosing_target_region, requires_vector_predicate, + generate_vector_broadcast, predicate_bb, find_predicatable_bbs, + predicate_omp_regions): New static functions. + (execute_expand_omp): Allocate and free bb_region_map. + * config/nvptx/nvptx.c: Include dumpfile,h. (condition_unidirectional_p): New static function. (nvptx_print_operand): Use it for new 'U' handling. Index: gcc/omp-low.c === --- gcc/omp-low.c (revision 223442) +++ gcc/omp-low.c (working copy) @@ -159,6 +159,9 @@ struct omp_region /* True if this is a combined parallel+workshare region. */ bool is_combined_parallel; + + /* For an OpenACC loop, the level of parallelism requested. */ + int gwv_this; }; /* Levels of parallelism as defined by OpenACC. 
Increasing numbers @@ -9961,7 +9964,6 @@ expand_omp_target (struct omp_region *re update_ssa (TODO_update_ssa_only_virtuals); } - /* Expand the parallel region tree rooted at REGION. Expansion proceeds in depth-first order. Innermost regions are expanded first. This way, parallel regions that require a new function to @@ -9984,7 +9986,7 @@ expand_omp (struct omp_region *region) if (region-type == GIMPLE_OMP_FOR gimple_omp_for_combined_p (last_stmt (region-entry))) inner_stmt = last_stmt (region-inner-entry); - + if (region-inner) expand_omp (region-inner); @@ -10041,6 +10043,44 @@ expand_omp (struct omp_region *region) } } +/* Map each basic block to an omp_region. */ +static hash_mapbasic_block, omp_region * *bb_region_map; + +/* Fill in additional data for a region REGION associated with an + OMP_FOR STMT. */ + +static void +find_omp_for_region_data (struct omp_region *region, gimple stmt) +{ + if (!is_gimple_omp_oacc (stmt)) +return; + + tree clauses = gimple_omp_for_clauses (stmt); + if (find_omp_clause (clauses, OMP_CLAUSE_GANG)) +region-gwv_this |= MASK_GANG; + if (find_omp_clause (clauses, OMP_CLAUSE_WORKER)) +region-gwv_this |= MASK_WORKER; + if (find_omp_clause (clauses, OMP_CLAUSE_VECTOR)) +region-gwv_this |= MASK_VECTOR; +} + +/* Fill in additional data for a region REGION associated with an + OMP_TARGET STMT. */ + +static void +find_omp_target_region_data (struct omp_region *region, gimple stmt) +{ + if (!is_gimple_omp_oacc (stmt)) +return; + + tree clauses = gimple_omp_target_clauses (stmt); + if (find_omp_clause (clauses, OMP_CLAUSE_NUM_GANGS)) +region-gwv_this |= MASK_GANG; + if (find_omp_clause (clauses, OMP_CLAUSE_NUM_WORKERS)) +region-gwv_this |= MASK_WORKER; + if (find_omp_clause (clauses, OMP_CLAUSE_VECTOR_LENGTH)) +region-gwv_this |= MASK_VECTOR; +} /* Helper for build_omp_regions. Scan the dominator tree starting at block BB. PARENT is the region that contains BB. 
If SINGLE_TREE is @@ -10055,6 +10095,8 @@ build_omp_regions_1 (basic_block bb, str gimple stmt; basic_block son; + bb_region_map-put (bb, parent); + gsi = gsi_last_bb (bb); if (!gsi_end_p (gsi) is_gimple_omp (gsi_stmt (gsi))) { @@ -10107,6 +10149,7 @@ build_omp_regions_1 (basic_block bb, str case GF_OMP_TARGET_KIND_OACC_PARALLEL: case GF_OMP_TARGET_KIND_OACC_KERNELS: case GF_OMP_TARGET_KIND_OACC_DATA: + find_omp_target_region_data (region, stmt); break; case GF_OMP_TARGET_KIND_UPDATE: case GF_OMP_TARGET_KIND_OACC_UPDATE: @@ -10118,6 +10161,8 @@ build_omp_regions_1 (basic_block bb, str gcc_unreachable (); } } + else if (code == GIMPLE_OMP_FOR) + find_omp_for_region_data (region, stmt); /* ..., this directive becomes the parent for a new region. */ if (region) parent = region; @@ -10156,7 +10201,7 @@ omp_expand_local
[Bug c/65892] gcc fails to implement N685 aliasing of union members
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Marek Polacek mpolacek at gcc dot gnu.org changed:

What            |Removed     |Added
Status          |UNCONFIRMED |SUSPENDED
Last reconfirmed|            |2015-05-21
CC              |            |mpolacek at gcc dot gnu.org
Ever confirmed  |0           |1

--- Comment #11 from Marek Polacek mpolacek at gcc dot gnu.org --- Suspending until then.
[Bug target/65979] [4.9/5/6 Regression] [SH] Wrong code is generated with stage1 compiler
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65979

Kazumoto Kojima kkojima at gcc dot gnu.org changed:

What   |Removed                                                     |Added
Summary|Multiple issues in conftest.c prevent build on sh4-linux-gnu|[4.9/5/6 Regression] [SH] Wrong code is generated with stage1 compiler

--- Comment #22 from Kazumoto Kojima kkojima at gcc dot gnu.org --- (In reply to Oleg Endo from comment #21) OK, I'll commit your patch when all tests are done. BTW, I'd like to change the summary of the PR to clarify that this is a 4.9/5/6 regression.
Re: [patch, testsuite] don't specify dg-do run explicitly for vect test cases
On 05/20/2015 11:12 PM, Sandra Loosemore wrote: On targets such as ARM, some arches are compatible with options needed to enable compilation with vectorization, but the specific hardware (or simulator or BSP) available for execution tests may not implement or enable those features. The vect.exp test harness already includes some magic to determine whether the target hw can execute vectorized code and sets dg-do-what-default to compile the tests only if they can't be executed. It's a mistake for individual tests to explicitly say dg-do run because this overrides the harness's magic default and forces the test to be executed, even if doing so just ends up wedging the target. I already committed two patches last fall (r215627 and r218427) to address this, but people keep adding new vect test cases with the same problem, so here is yet another installment to clean them up. I tested this on arm-none-eabi with a fairly large collection of multilibs. OK to commit? -Sandra vect.log 2015-05-20 Sandra Loosemore san...@codesourcery.com gcc/testsuite/ * gcc.dg/vect/bb-slp-pr65935.c: Remove explicit dg-do run. * gcc.dg/vect/pr59354.c: Likewise. * gcc.dg/vect/pr64252.c: Likewise. * gcc.dg/vect/pr64404.c: Likewise. * gcc.dg/vect/pr64493.c: Likewise. * gcc.dg/vect/pr64495.c: Likewise. * gcc.dg/vect/pr64844.c: Likewise. * gcc.dg/vect/pr65518.c: Likewise. * gcc.dg/vect/vect-aggressive-1.c: Likewise. OK. jeff
Re: [patch, libgomp] Re-factor GOMP_MAP_POINTER handling
Hi! Jakub, for avoidance of doubt, the proposed refactoring makes sense to me, but does need your approval: On Thu, 21 May 2015 16:30:40 +0800, Chung-Lin Tang clt...@codesourcery.com wrote: Ping x2. On 15/5/11 7:19 PM, Chung-Lin Tang wrote: Ping. On 2015/4/21 08:21 PM, Chung-Lin Tang wrote: Hi, while investigating some issues in the variable mapping code, I observed that the GOMP_MAP_POINTER handling is essentially duplicated under the PSET case. This patch abstracts and unifies the handling code, basically just a cleanup patch. Ran libgomp tests to ensure no regressions, ok for trunk? Thanks, Chung-Lin 2015-04-21 Chung-Lin Tang clt...@codesourcery.com libgomp/ * target.c (gomp_map_pointer): New function abstracting out GOMP_MAP_POINTER handling. (gomp_map_vars): Remove GOMP_MAP_POINTER handling code and use gomp_map_pointer(). Grüße, Thomas
Re: [PATCH] PR target/66232: -fPIC -fno-plt -mx32 fails to generate indirect branch via GOT
On Thu, May 21, 2015 at 2:59 PM, H.J. Lu hjl.to...@gmail.com wrote: X32 doesn't support indirect branch via 32-bit memory slot since indirect branch will load 64-bit address from 64-bit memory slot. Since x32 GOT slot is 64-bit, we should allow indirect branch via GOT slot for x32. I am testing it on x32. OK for master if there is no regression? Thanks. H.J. -- gcc/ PR target/66232 * config/i386/constraints.md (Bg): Add a constraint for x32 call and sibcall memory operand. * config/i386/i386.md (*call_x32): New pattern. (*sibcall_x32): Likewise. (*call_value_x32): Likewise. (*sibcall_value_x32): Likewise. * config/i386/predicates.md (x32_sibcall_memory_operand): New predicate. (x32_call_insn_operand): Likewise. (x32_sibcall_insn_operand): Likewise. gcc/testsuite/ PR target/66232 * gcc.target/i386/pr66232-1.c: New test. * gcc.target/i386/pr66232-2.c: Likewise. * gcc.target/i386/pr66232-3.c: Likewise. * gcc.target/i386/pr66232-4.c: Likewise. OK. maybe you should use match_code some more in x32_sibcall_memory_operand, e.g. (match_code constant 0) (match_code unspec 00) But it is up to you, since XINT doesn't fit in this scheme... Thanks, Uros. 
gcc/config/i386/constraints.md| 6 ++ gcc/config/i386/i386.md | 36 +++ gcc/config/i386/predicates.md | 26 ++ gcc/testsuite/gcc.target/i386/pr66232-1.c | 13 +++ gcc/testsuite/gcc.target/i386/pr66232-2.c | 14 gcc/testsuite/gcc.target/i386/pr66232-3.c | 13 +++ gcc/testsuite/gcc.target/i386/pr66232-4.c | 13 +++ 7 files changed, 121 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-1.c create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-2.c create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr66232-4.c diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index 2271bd1..7be8917 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -146,10 +146,16 @@ @internal Lower SSE register when avoiding REX prefix and all SSE registers otherwise.) ;; We use the B prefix to denote any number of internal operands: +;; g Call and sibcall memory operand, valid for TARGET_X32 ;; s Sibcall memory operand, not valid for TARGET_X32 ;; w Call memory operand, not valid for TARGET_X32 ;; z Constant call address operand. +(define_constraint Bg + @internal Call/sibcall memory operand for x32. + (and (match_test TARGET_X32) + (match_operand 0 x32_sibcall_memory_operand))) + (define_constraint Bs @internal Sibcall memory operand. 
(and (not (match_test TARGET_X32)) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index aefca43..a1ae05a 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -11659,6 +11659,14 @@ * return ix86_output_call_insn (insn, operands[0]); [(set_attr type call)]) +(define_insn *call_x32 + [(call (mem:QI (zero_extend:DI + (match_operand:SI 0 x32_call_insn_operand Bg))) +(match_operand 1))] + TARGET_X32 !SIBLING_CALL_P (insn) + * return ix86_output_call_insn (insn, operands[0]); + [(set_attr type call)]) + (define_insn *sibcall [(call (mem:QI (match_operand:W 0 sibcall_insn_operand UBsBz)) (match_operand 1))] @@ -11666,6 +11674,14 @@ * return ix86_output_call_insn (insn, operands[0]); [(set_attr type call)]) +(define_insn *sibcall_x32 + [(call (mem:QI (zero_extend:DI + (match_operand:SI 0 x32_sibcall_insn_operand Bg))) +(match_operand 1))] + TARGET_X32 SIBLING_CALL_P (insn) + * return ix86_output_call_insn (insn, operands[0]); + [(set_attr type call)]) + (define_insn *sibcall_memory [(call (mem:QI (match_operand:W 0 memory_operand m)) (match_operand 1)) @@ -11825,6 +11841,16 @@ * return ix86_output_call_insn (insn, operands[1]); [(set_attr type callv)]) +(define_insn *call_value_x32 + [(set (match_operand 0) + (call (mem:QI + (zero_extend:DI + (match_operand:SI 1 x32_call_insn_operand Bg))) + (match_operand 2)))] + TARGET_X32 !SIBLING_CALL_P (insn) + * return ix86_output_call_insn (insn, operands[1]); + [(set_attr type callv)]) + (define_insn *sibcall_value [(set (match_operand 0) (call (mem:QI (match_operand:W 1 sibcall_insn_operand UBsBz)) @@ -11833,6 +11859,16 @@ * return ix86_output_call_insn (insn, operands[1]); [(set_attr type callv)]) +(define_insn *sibcall_value_x32 + [(set (match_operand 0) + (call (mem:QI + (zero_extend:DI + (match_operand:SI 1 x32_sibcall_insn_operand
[Bug target/66232] -fPIC -fno-plt -mx32 fails to generate indirect branch via GOT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66232 --- Comment #2 from H.J. Lu hjl.tools at gmail dot com --- Created attachment 35585 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35585&action=edit A patch I am testing this.
Re: [PATCH 6/7] remove #if HAVE_conditional_move
On 05/20/2015 08:09 PM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders tbsaunde+...@tbsaunde.org gcc/ChangeLog: 2015-05-20 Trevor Saunders tbsaunde+...@tbsaunde.org * *.c, *.h: Don't check HAVE_conditional_move with the preprocessor. You know what I'm going to say here :-) FWIW, I think just mentioning the filename is fine for these kinds of mechanical changes -- no need to list each function that got twiddled. OK for the trunk. Jeff
Re: [PATCH 3/7] move default for STACK_PUSH_CODE to defaults.h
On 05/20/2015 08:09 PM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders tbsaunde+...@tbsaunde.org gcc/ChangeLog: 2015-05-20 Trevor Saunders tbsaunde+...@tbsaunde.org * defaults.h: Add default for STACK_PUSH_CODE. * expr.c: Don't redefine STACK_PUSH_CODE. * recog.c: Likewise. OK. jeff
[Patch]: libbacktrace - add support of PE/COFF
Hello, this patch adds basic support to libbacktrace for PE32 and PE32+ (Windows and Windows64 object formats). Support is ‘basic’ because neither DLL nor PIE (if that exists) are handled. Furthermore, there is no windows versions of mmapio.c and mmap.c Finally, I have disabled the support of data symbols for PE because I wasn’t able to pass ‘make check’ with that: symbol ‘_global’ is at the same address as a symbol defined by the linker and I haven’t found any way to discard the latter. As I think data symbol support isn’t a required feature, I have preferred to disable that feature on PE. The new file, pecoff.c, mostly follows the structure of elf.c Tested on both windows and windows64. No regression on Gnu/Linux x86. Tristan. 2015-05-21 Tristan Gingold ging...@adacore.com * pecoff.c: New file. * Makefile.am (FORMAT_FILES): Add pecoff.c and dependencies. * Makefile.in: Regenerate. * filetype.awk: Detect pecoff. * configure.ac: Define BACKTRACE_SUPPORTS_DATA on elf platforms. Add pecoff. * btest.c (test5): Test enabled only if BACKTRACE_SUPPORTS_DATA is true. * backtrace-supported.h.in (BACKTRACE_SUPPORTS_DATA): Define. * configure: Regenerate. * pecoff.c: New file. commit ac17f650356728fc07121c71213401e1e159df2f Author: Tristan Gingold ging...@adacore.com Date: Thu May 21 14:29:44 2015 +0200 Add support for PE/COFF to libbacktrace. diff --git a/libbacktrace/ChangeLog b/libbacktrace/ChangeLog index c6604d9..139521a 100644 --- a/libbacktrace/ChangeLog +++ b/libbacktrace/ChangeLog @@ -1,3 +1,17 @@ +2015-05-21 Tristan Gingold ging...@adacore.com + + * pecoff.c: New file. + * Makefile.am (FORMAT_FILES): Add pecoff.c and dependencies. + * Makefile.in: Regenerate. + * filetype.awk: Detect pecoff. + * configure.ac: Define BACKTRACE_SUPPORTS_DATA on elf platforms. + Add pecoff. + * btest.c (test5): Test enabled only if BACKTRACE_SUPPORTS_DATA is + true. + * backtrace-supported.h.in (BACKTRACE_SUPPORTS_DATA): Define. + * configure: Regenerate. + * pecoff.c: New file. 
+ 2015-05-13 Michael Haubenwallner michael.haubenwall...@ssi-schaefer.com * Makefile.in: Regenerated with automake-1.11.6. diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am index a93b82a..c5f0dcb 100644 --- a/libbacktrace/Makefile.am +++ b/libbacktrace/Makefile.am @@ -56,6 +56,7 @@ BACKTRACE_FILES = \ FORMAT_FILES = \ elf.c \ + pecoff.c \ unknown.c VIEW_FILES = \ @@ -124,6 +125,7 @@ fileline.lo: config.h backtrace.h internal.h mmap.lo: config.h backtrace.h internal.h mmapio.lo: config.h backtrace.h internal.h nounwind.lo: config.h internal.h +pecoff.lo: config.h backtrace.h internal.h posix.lo: config.h backtrace.h internal.h print.lo: config.h backtrace.h internal.h read.lo: config.h backtrace.h internal.h diff --git a/libbacktrace/Makefile.in b/libbacktrace/Makefile.in index a949f29..b434d76e 100644 --- a/libbacktrace/Makefile.in +++ b/libbacktrace/Makefile.in @@ -299,6 +299,7 @@ BACKTRACE_FILES = \ FORMAT_FILES = \ elf.c \ + pecoff.c \ unknown.c VIEW_FILES = \ @@ -753,6 +754,7 @@ fileline.lo: config.h backtrace.h internal.h mmap.lo: config.h backtrace.h internal.h mmapio.lo: config.h backtrace.h internal.h nounwind.lo: config.h internal.h +pecoff.lo: config.h backtrace.h internal.h posix.lo: config.h backtrace.h internal.h print.lo: config.h backtrace.h internal.h read.lo: config.h backtrace.h internal.h diff --git a/libbacktrace/backtrace-supported.h.in b/libbacktrace/backtrace-supported.h.in index 5115ce1..4574635 100644 --- a/libbacktrace/backtrace-supported.h.in +++ b/libbacktrace/backtrace-supported.h.in @@ -59,3 +59,8 @@ POSSIBILITY OF SUCH DAMAGE. */ as 0. */ #define BACKTRACE_SUPPORTS_THREADS @BACKTRACE_SUPPORTS_THREADS@ + +/* BACKTRACE_SUPPORTS_DATA will be #defined'd as 1 if the backtrace library + also handles data symbols, 0 if not. 
*/ + +#define BACKTRACE_SUPPORTS_DATA @BACKTRACE_SUPPORTS_DATA@ diff --git a/libbacktrace/btest.c b/libbacktrace/btest.c index 9424a92..9821e34 100644 --- a/libbacktrace/btest.c +++ b/libbacktrace/btest.c @@ -616,6 +616,8 @@ f33 (int f1line, int f2line) return failures; } +#if BACKTRACE_SUPPORTS_DATA + int global = 1; static int @@ -684,6 +686,8 @@ test5 (void) return failures; } +#endif /* BACKTRACE_SUPPORTS_DATA */ + static void error_callback_create (void *data ATTRIBUTE_UNUSED, const char *msg, int errnum) @@ -708,8 +712,10 @@ main (int argc ATTRIBUTE_UNUSED, char **argv) test2 (); test3 (); test4 (); +#if BACKTRACE_SUPPORTS_DATA test5 (); #endif +#endif exit (failures ? EXIT_FAILURE : EXIT_SUCCESS); } diff --git a/libbacktrace/configure b/libbacktrace/configure index fa81659..19418c9 100755 --- a/libbacktrace/configure +++ b/libbacktrace/configure @@
[Bug ada/66242] Front-end error if exception propagation disabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66242 --- Comment #1 from simon at pushface dot org --- Created attachment 35588 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35588&action=edit Suggested patch
Re: [gomp4] Vector-single predication
On Thu, 21 May 2015 13:57:00 +0200 Jakub Jelinek ja...@redhat.com wrote: On Thu, May 21, 2015 at 01:42:11PM +0200, Bernd Schmidt wrote: This uses the patch I committed yesterday which introduces warp broadcasts to implement the vector-single predication needed for OpenACC. Outside a loop with vector parallelism, only one of the threads representing a vector must execute, the others follow along. So we skip the real work in each basic block for the inactive threads, then broadcast the direction to take in the control flow graph from the active one, and jump as a group. This will get extended with similar functionality for worker-single. Julian is working on some patches on top of that to ensure the later optimizers don't destroy the control flow - we really need the threads to reconverge and perform the broadcast/jump in lockstep. Committed on gomp-4_0-branch. What do you do with function calls? Do you call them just in the (tid.x & 31) == 0 threads (then they can't use vectorization), or for all threads (then it is an ABI change, they would need to know whether they are called this way and depending on that handle it similarly (skip all the real work, except for function calls, for (tid.x & 31) != 0, unless it is a vectorized region). Or is OpenACC restricting this to statements in the constructs directly (rather than anywhere in the region)? OpenACC handles function calls specially (calling them routines -- of varying sorts, gang, worker, vector or seq, affecting where they can be invoked from). The plan is that all threads will call such routines -- and then some threads will be neutered as appropriate within the routines themselves. That's not actually implemented yet, though. Julian
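The vector-single scheme discussed in this thread (one active lane does the real work, the branch direction is broadcast, and all lanes jump together) can be modelled on the host with a short C sketch. This is not the nvptx implementation: the toy_* names, the 32-lane warp size, and the "value > 0" branch condition are assumptions made purely for illustration, and a sequential loop stands in for lanes that would really run in lockstep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_WARP_SIZE 32

/* Per-lane bookkeeping for a toy warp.  */
struct toy_warp
{
  int work_done[TOY_WARP_SIZE]; /* blocks whose real work this lane ran */
  bool taken[TOY_WARP_SIZE];    /* branch direction this lane follows */
};

/* Execute one basic block in vector-single mode.  Only the lane whose
   index masked with (TOY_WARP_SIZE - 1) is zero does the real work and
   evaluates the branch condition; every lane then follows the
   broadcast direction.  */
void
toy_run_block (struct toy_warp *w, int value)
{
  assert (w != NULL);
  bool direction = false;

  for (int lane = 0; lane < TOY_WARP_SIZE; lane++)
    if ((lane & (TOY_WARP_SIZE - 1)) == 0)
      {
        w->work_done[lane]++;    /* active lane: real work */
        direction = value > 0;   /* and the branch decision */
      }
    /* inactive lanes skip the body entirely */

  /* Stand-in for the warp broadcast: all lanes receive the active
     lane's decision and jump as a group.  */
  for (int lane = 0; lane < TOY_WARP_SIZE; lane++)
    w->taken[lane] = direction;
}
```

The point of the sketch is only the shape of the control flow: the work is predicated per lane, while the branch decision is shared, so the lanes cannot diverge at the end of the block.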
Re: [PATCH, PR target/65103, 2/3] Propagate address constants into loops for i386
Ping 2015-05-05 14:05 GMT+03:00 Ilya Enkovich enkovich@gmail.com: 2015-04-21 8:52 GMT+03:00 Jeff Law l...@redhat.com: On 04/17/2015 02:34 AM, Ilya Enkovich wrote: On 15 Apr 14:07, Ilya Enkovich wrote: 2015-04-14 8:22 GMT+03:00 Jeff Law l...@redhat.com: On 03/15/2015 02:30 PM, Richard Sandiford wrote: Ilya Enkovich enkovich@gmail.com writes: This patch allows propagation of loop invariants for i386 if propagated value is a constant to be used in address operand. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk or stage 1? Is it necessary for this to be a target hook? The concept doesn't seem particularly target-specific. We should only propagate into the address if the new cost is no greater than the old cost, but if the address meets that condition and if propagating at this point in the pipeline is a win on x86, then wouldn't it be a win for other targets too? I agree with Richard here. I can't see a strong reason why this should be a target hook. Perhaps part of the issue here is the address costing metrics may not have enough context to make good decisions. In which case what context do they need? At this point I don't insist on a target hook. The main reasoning was to not affect other targets. If we extend propagation for non constant values different aspects may appear. E.g. possible register pressure changes may significantly affect ia32. I just wanted to have an instrument to play with a propagation on x86 not affecting other targets. I don't have an opportunity to test possible performance implications on non-x86 targets. Don't expect (significant) regressions there but who knows... I'll remove the hook from this patch. Will probably introduce it later if some target specific cases are found. Thanks, Ilya Jeff Here is a version with no hook. Bootstrapped and tested on x86_64-unknown-linux-gnu. Is it OK for trunk? 
Thanks, Ilya -- gcc/ 2015-04-17 Ilya Enkovich ilya.enkov...@intel.com PR target/65103 * fwprop.c (forward_propagate_into): Propagate loop invariants if a target says so. gcc/testsuite/ 2015-04-17 Ilya Enkovich ilya.enkov...@intel.com PR target/65103 * gcc.target/i386/pr65103-2.c: New. It seems to me there's a key piece missing here -- metrics. When is this profitable, when is it not profitable. Just blindly undoing LICM seems wrong here. The first thought is to look at register pressure through the loop. I thought we had some infrastructure for this kind of query available. It'd probably be wise to re-use it. In fact, one might reasonably ask if LICM should have hoisted the expression to start with. I'd also think the cost of the constant may come into play here. A really cheap constant probably should not have been hoisted by LICM to start with -- but the code may have been written in such a way that some low cost constants are pulled out as loop invariants at the source level. So this isn't strictly an issue of un-doing bad LICM So I think to go forward we need to be working on solving the when is this a profitable transformation to make. This patch doesn't force propagation. The patch just allows propagation and regular fwprop cost estimation is used to compute if this is profitable. For i386 I don't see cases when we shouldn't propagate. We remove instruction, reduce register pressure and having constant in memory operand is free which is reflected in address_cost hook. Ilya jeff
[Bug c++/66239] Unoptimized sqrt(float or double) returns wrong values for ARM Cortex-A8 -mfloat-abi=[soft,softfp]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66239 --- Comment #1 from Maciej Andrzejewski maciej.andrzejewski at data dot pl --- It is getting even more interesting. I have disassabled 4 binaries compiled with options: 1) -mfloat-abi=softfp 2) -mfloat-abi=softfp -O 3) -mfloat-abi=hard 4) -mfloat-abi=hard -O and from what I understand if we turn ON the optimization the FPU is turned OFF! I dont see in assembler that FPU s** registers are used in those two cases where optimization is turned on: -- DISASSAMBLE OPTION 1 -- 00010570 main: 10570: e92d4800push{fp, lr} 10574: e28db004add fp, sp, #4 10578: e24dd040sub sp, sp, #64 ; 0x40 1057c: e3032333movwr2, #13107 ; 0x 10580: e3432333movtr2, #13107 ; 0x 10584: e303movwr3, #13107 ; 0x 10588: e3443022movtr3, #16418 ; 0x4022 1058c: e14b20fcstrdr2, [fp, #-12] 10590: ed5b0b03vldrd16, [fp, #-12] 10594: eef77be0vcvt.f32.f64s15, d16 10598: ee170a90vmovr0, s15 1059c: eba1bl 10428 sqrtf@plt 105a0: ee070a90vmovs15, r0 105a4: eef70ae7vcvt.f64.f32d16, s15 105a8: ed4b0b05vstrd16, [fp, #-20] ; 0xffec 105ac: e30006ccmovwr0, #1740 ; 0x6cc 105b0: e341movtr0, #1 105b4: e14b21d4ldrdr2, [fp, #-20] ; 0xffec 105b8: eba0bl 10440 printf@plt 105bc: e309399amovwr3, #39322 ; 0x999a 105c0: e3443111movtr3, #16657 ; 0x4111 105c4: e50b3018str r3, [fp, #-24] ; 0xffe8 105c8: e51b0018ldr r0, [fp, #-24] ; 0xffe8 105cc: eb95bl 10428 sqrtf@plt 105d0: ee070a90vmovs15, r0 105d4: eef70ae7vcvt.f64.f32d16, s15 105d8: ed4b0b09vstrd16, [fp, #-36] ; 0xffdc 105dc: e30006ccmovwr0, #1740 ; 0x6cc 105e0: e341movtr0, #1 105e4: e14b22d4ldrdr2, [fp, #-36] ; 0xffdc 105e8: eb94bl 10440 printf@plt 105ec: e3032333movwr2, #13107 ; 0x 105f0: e3432333movtr2, #13107 ; 0x 105f4: e303movwr3, #13107 ; 0x 105f8: e3443022movtr3, #16418 ; 0x4022 105fc: e14b22fcstrdr2, [fp, #-44] ; 0xffd4 10600: e14b02dcldrdr0, [fp, #-44] ; 0xffd4 10604: eb8abl 10434 sqrt@plt 10608: e14b03f4strdr0, [fp, #-52] ; 0xffcc 1060c: e30006ccmovwr0, #1740 ; 0x6cc 10610: e341movtr0, #1 10614: e14b23d4ldrdr2, [fp, 
#-52] ; 0xffcc 10618: eb88bl 10440 printf@plt 1061c: e309399amovwr3, #39322 ; 0x999a 10620: e3443111movtr3, #16657 ; 0x4111 10624: e50b3038str r3, [fp, #-56] ; 0xffc8 10628: ed5b7a0evldrs15, [fp, #-56] ; 0xffc8 1062c: eef70ae7vcvt.f64.f32d16, s15 10630: ec510b30vmovr0, r1, d16 10634: eb7ebl 10434 sqrt@plt 10638: e14b04f4strdr0, [fp, #-68] ; 0xffbc 1063c: e30006ccmovwr0, #1740 ; 0x6cc 10640: e341movtr0, #1 10644: e14b24d4ldrdr2, [fp, #-68] ; 0xffbc 10648: eb7cbl 10440 printf@plt 1064c: e3a03000mov r3, #0 10650: e1a3mov r0, r3 10654: e24bd004sub sp, fp, #4 10658: e8bd8800pop {fp, pc} -- DISASSAMBLE OPTION 1 -- -- DISASSAMBLE OPTION 2 -- 000104f4 main: 104f4: e92d40d0push{r4, r6, r7, lr} 104f8: e30045d4movwr4, #1492 ; 0x5d4 104fc: e3404001movtr4, #1 10500: e3a06000mov r6, #0 10504: e302720amovwr7, #8714 ; 0x220a 10508: e3447008movtr7, #16392 ; 0x4008 1050c: e1a4mov r0, r4 10510: e1a02006mov r2, r6 10514: e1a03007mov r3, r7 10518: eba9bl 103c4 printf@plt 1051c: e1a4mov r0, r4 10520: e1a02006mov r2, r6 10524: e1a03007mov r3, r7 10528: eba5bl 103c4 printf@plt 1052c: e1a4mov r0, r4 10530: e30f2d38movwr2, #64824 ;
[Bug c/66240] New: RFE: extend -falign-xyz syntax
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66240

Bug ID: 66240
Summary: RFE: extend -falign-xyz syntax
Product: gcc
Version: 5.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: vda.linux at googlemail dot com
Target Milestone: ---

Experimentally, compilation with

-O2 -falign-functions=17 -falign-loops=17 -falign-jumps=17 -falign-labels=17

results in the following:
- functions are aligned using the .p2align 5,,16 asm directive
- loops/jumps/labels are aligned using .p2align 5

-Os -falign-functions=17 -falign-loops=17 -falign-jumps=17 -falign-labels=17

results in the following:
- functions are not aligned
- loops/jumps/labels are aligned using .p2align 5

Can this be improved so that in all cases, .p2align 5,,16 is used? Shouldn't be that hard...

Next step (what this RFE is all about).

-falign-functions=N is too simplistic. Ingo Molnar ran some tests and it looks like on the latest x86 CPUs, 64-byte alignment runs fastest (he tried many other possibilities). However, developers are less than thrilled by the idea of slam-dunk 64-byte alignment of everything. Too much waste:

On 05/20/2015 02:47 AM, Linus Torvalds wrote:
At the same time, I have to admit that I abhor a 64-byte function alignment, when we have a fair number of functions that are (much) smaller than that. Is there some way to get gcc to take the size of the function into account? Because aligning a 16-byte or 32-byte function on a 64-byte alignment is just criminally nasty and wasteful.

I propose the following: align functions to 64-byte boundaries *IF* this does not introduce a huge amount of padding. GNU as already has support for this:

.align N1,FILL,N3

The third expression is also absolute, and is also optional. If it is present, it is the maximum number of bytes that should be skipped by this alignment directive.

So, what we want is to put something like .align 64,,7 before every function.

98% of functions in a typical Linux kernel have a first instruction that is 7 or fewer bytes long. Thus, with .align 64,,7, calling any function will at a minimum be able to fetch one insn in one L1 read, not two. And this would be achieved with only ~3.5 bytes per function wasted to padding on average, whereas .align 64 would waste 31 bytes on average.

Please extend -falign-foo=N syntax to, say, -falign-foo=N,M, which generates .align M,,N-1 or equivalent.
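To make the max-skip behaviour concrete, here is a minimal C sketch of what an assembler does for .align ALIGN,,MAX_SKIP, as described in the GNU as text quoted above. The function names pad_to and pad_with_max_skip are hypothetical, chosen only for this illustration; nothing here is taken from gas or GCC sources.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of padding needed to bring addr up to the next multiple of
   align.  align must be a power of two.  (Hypothetical helper.)  */
uint64_t pad_to(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (align - (addr & (align - 1))) & (align - 1);
}

/* Model of ".align ALIGN,,MAX_SKIP": emit the padding only when it
   would not exceed max_skip bytes; otherwise do not align at all.
   Returns the number of padding bytes actually emitted.  */
uint64_t pad_with_max_skip(uint64_t addr, uint64_t align, uint64_t max_skip)
{
    uint64_t pad = pad_to(addr, align);
    return pad <= max_skip ? pad : 0;
}
```

With these definitions, a function that would start at 0x79 gets 7 bytes of padding under .align 64,,7 and lands on 0x80, while one that would start at 0x78 needs 8 bytes, exceeds the limit, and is left where it is.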
Re: [PATCH 5/7] always define HAVE_conditional_move
On 05/20/2015 08:09 PM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders tbsaunde+...@tbsaunde.org gcc/ChangeLog: 2015-05-20 Trevor Saunders tbsaunde+...@tbsaunde.org * genconfig.c (main): Always define HAVE_conditional_move. * *.c: Don't check if HAVE_conditional_move is defined. Again, you're hitting just a handful of files, if you could go ahead and list them it'd be appreciated. OK for the trunk. jeff
Re: [PATCH 2/7] remove most ifdef STACK_GROWS_DOWNWARD
On 05/20/2015 08:09 PM, tbsaunde+...@tbsaunde.org wrote:
From: Trevor Saunders tbsaunde+...@tbsaunde.org

gcc/c-family/ChangeLog:

2015-05-20 Trevor Saunders tbsaunde+...@tbsaunde.org

* c-cppbuiltin.c (c_cpp_builtins): Use if instead of #if with STACK_GROWS_DOWNWARD.

gcc/ChangeLog:

2015-05-20 Trevor Saunders tbsaunde+...@tbsaunde.org

* *.c: Use if instead of preprocessor checks with STACK_GROWS_DOWNWARD.
---
gcc/ChangeLog               |  5
gcc/builtins.c              | 30 +++
gcc/c-family/ChangeLog      |  5
gcc/c-family/c-cppbuiltin.c |  5 ++--
gcc/dwarf2cfi.c             | 12 +-
gcc/explow.c                | 33 --
gcc/expr.c                  | 58 +++--
gcc/recog.c                 |  8 ++-
gcc/sched-deps.c            |  9 ---
9 files changed, 78 insertions(+), 87 deletions(-)

OK with the usual request to list filenames in the ChangeLogs.

jeff
Re: [PATCH 1/7] always define STACK_GROWS_DOWNWARD
On 05/20/2015 08:09 PM, tbsaunde+...@tbsaunde.org wrote:
From: Trevor Saunders tbsaunde+...@tbsaunde.org

gcc/c-family/ChangeLog:

2015-05-20 Trevor Saunders tbsaunde+...@tbsaunde.org

* c-cppbuiltin.c (c_cpp_builtins): Check the value of STACK_GROWS_DOWNWARD rather than if it is defined.

gcc/ChangeLog:

2015-05-20 Trevor Saunders tbsaunde+...@tbsaunde.org

* *.c: Check the value of STACK_GROWS_DOWNWARD rather than if it is defined.
* config/**/*.h: Define STACK_GROWS_DOWNWARD to an integer.
* defaults.h: Provide default for STACK_GROWS_DOWNWARD.
---
gcc/ChangeLog                      |  7 +++
gcc/builtins.c                     |  6 +++---
gcc/c-family/ChangeLog             |  5 +
gcc/c-family/c-cppbuiltin.c        |  2 +-
gcc/calls.c                        |  8
gcc/combine-stack-adj.c            |  8
gcc/config/alpha/alpha.h           |  2 +-
gcc/config/arc/arc.h               |  2 +-
gcc/config/avr/avr.h               |  2 +-
gcc/config/bfin/bfin.h             |  2 +-
gcc/config/c6x/c6x.h               |  2 +-
gcc/config/cr16/cr16.h             |  2 +-
gcc/config/cris/cris.h             |  2 +-
gcc/config/epiphany/epiphany.h     |  2 +-
gcc/config/h8300/h8300.h           |  2 +-
gcc/config/i386/i386.h             |  2 +-
gcc/config/iq2000/iq2000.h         |  2 +-
gcc/config/m32r/m32r.h             |  2 +-
gcc/config/mcore/mcore.h           |  2 +-
gcc/config/microblaze/microblaze.h |  2 +-
gcc/config/mips/mips.h             |  2 +-
gcc/config/mmix/mmix.h             |  2 +-
gcc/config/mn10300/mn10300.h       |  2 +-
gcc/config/moxie/moxie.h           |  2 +-
gcc/config/nds32/nds32.h           |  2 +-
gcc/config/nios2/nios2.h           |  2 +-
gcc/config/nvptx/nvptx.h           |  2 +-
gcc/config/pdp11/pdp11.h           |  2 +-
gcc/config/rs6000/rs6000.h         |  2 +-
gcc/config/s390/s390.h             |  2 +-
gcc/config/sh/sh.h                 |  2 +-
gcc/config/sparc/sparc.h           |  2 +-
gcc/config/spu/spu.h               |  2 +-
gcc/config/tilegx/tilegx.h         |  2 +-
gcc/config/tilepro/tilepro.h       |  2 +-
gcc/config/v850/v850.h             |  2 +-
gcc/config/vax/vax.h               |  2 +-
gcc/config/xtensa/xtensa.h         |  2 +-
gcc/defaults.h                     |  4
gcc/dwarf2cfi.c                    |  4 ++--
gcc/explow.c                       | 10 +-
gcc/expr.c                         | 20
gcc/ira-color.c                    |  8
gcc/lower-subreg.c                 |  7 ---
gcc/lra-spills.c                   |  8
gcc/recog.c                        |  6 +++---
gcc/sched-deps.c                   |  2 +-
47 files changed, 71 insertions(+), 98 deletions(-)

OK. Not going to require each filename to be listed in the ChangeLog :-)

Thanks for taking care of this stuff!

Jeff
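The pattern behind patches 1/7 and 2/7 can be sketched in a few lines of C: once the macro is always defined to an integer, code can test it with an ordinary if, so both branches are parsed and type-checked on every target. This is only an illustration. The default value of 0 and the helper next_slot_offset are assumptions made for the example, not text from the patches.

```c
#include <assert.h>

/* A target header would define STACK_GROWS_DOWNWARD to 0 or 1.
   The fallback below stands in for the default that the patch adds
   to defaults.h; the value 0 is an assumption for this sketch.  */
#ifndef STACK_GROWS_DOWNWARD
#define STACK_GROWS_DOWNWARD 0
#endif

/* Hypothetical helper: offset of the next stack slot of the given
   size.  A plain "if" replaces "#ifdef STACK_GROWS_DOWNWARD", so the
   compiler sees both branches and folds the dead one away.  */
long next_slot_offset(long offset, long size)
{
    assert(size >= 0);
    if (STACK_GROWS_DOWNWARD)
        return offset - size;
    return offset + size;
}
```

The design benefit is that a typo in the branch a given target never takes is caught on every build, instead of only on targets that define the macro.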
Re: [RFC] Combine related fail of gcc.target/powerpc/ti_math1.c
On Thu, May 21, 2015 at 08:06:04PM +0930, Alan Modra wrote:

FAIL: gcc.target/powerpc/ti_math1.c scan-assembler-times adde 1 is seen on powerpc64le-linux since somewhere between revision 218587 and 218616. See https://gcc.gnu.org/ml/gcc-testresults/2014-12/msg01287.html and https://gcc.gnu.org/ml/gcc-testresults/2014-12/msg01325.html

A regression hunt fingers one of Segher's 2014-12-10 patches to the rs6000 backend, git commit 0f1bedb4 or svn rev 218595.

It doesn't trigger on big-endian; what is different? The order of tried combinations I suppose, because the loads are swapped?

Segher might argue that generated code is better after this commit, and I'd agree that his change is a good one in general, but even so it would be nice to generate the ideal code.

Yes (to all of it). It was a big rip-up, left better generated code than we had before, and simplified the rs6000 backend code. Tuning it again now can only make things better :-)

Curiously, the ideal code is generated at -O1, but we regress at -O2..

Skipping all the way to the end of your mail, that's because of flag_expensive_optimizations (if nothing else).

before            after             ideal (-O1)
add_128:          add_128:          add_128:
  ld 10,0(3)        ld 9,0(3)         ld 9,0(3)
  ld 11,8(3)        ld 10,8(3)        ld 10,8(3)
  addc 8,4,10       addc 3,4,9        addc 3,4,9
  adde 9,5,11       addze 5,5         adde 4,5,10
  mr 3,8            add 4,5,10        blr
  mr 4,9            blr
  blr

I went looking into where the addze appeared, and found combine.

That wasn't a big surprise I hope :-)

Trying 18, 9 -> 24:
Failed to match this instruction:
(set (reg:DI 4 4 [+8 ])
    (plus:DI (plus:DI (reg:DI 5 5 [ val+8 ])
            (reg:DI 76 ca))
        (reg:DI 169 [+8 ])))

For some reason it has the CA reg not last. I think we should add to the canonicalisation rules so that fixed regs sort after other regs. That requires a lot of testing. For even slightly less trivial code, combine usually tries something in the correct order as well, btw, which is why the current code works as well as it does.
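For readers following the add_128 listings: the addc/adde pair is a two-word addition, where the low words are added first and the carry out feeds the high-word add. A minimal C sketch of that computation follows. The u128_words struct and the function names are hypothetical and are not taken from ti_math1.c; how a compiler maps this source onto addc/adde is not guaranteed by the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word representation of a 128-bit unsigned value.  */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_words;

/* Add two 128-bit values held as 64-bit halves.  The low-word add
   corresponds to addc (it produces a carry); the high-word add that
   consumes the carry corresponds to adde.  */
u128_words add_128_words(u128_words a, u128_words b)
{
    u128_words r;
    uint64_t carry;

    r.lo = a.lo + b.lo;        /* wraps modulo 2^64 */
    carry = r.lo < a.lo;       /* 1 if the low add wrapped, else 0 */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* Small self-check showing carry propagation into the high word.  */
void add_128_words_selfcheck(void)
{
    u128_words a = { UINT64_MAX, 1 };
    u128_words b = { 1, 2 };
    u128_words r = add_128_words(a, b);

    assert(r.lo == 0);
    assert(r.hi == 4);
}
```

The "after" listing computes the same result, but splits the high-word step into addze (add the carry to one operand) followed by a plain add, which is the extra instruction the test complains about.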
Successfully matched this instruction:
(set (reg:DI 167 [ D.2366+8 ])
    (plus:DI (reg:DI 5 5 [ val+8 ])
        (reg:DI 76 ca)))
Successfully matched this instruction:
(set (reg:DI 4 4 [+8 ])
    (plus:DI (reg:DI 167 [ D.2366+8 ])
        (reg:DI 169 [+8 ])))
allowing combination of insns 18, 9 and 24
original costs 4 + 8 + 4 = 16
replacement costs 4 + 4 = 8

Still need to fix the costs as well (but they work as-is; well enough that is).

Here are the three insns involved, sans source line numbers and notes.

(insn 18 17 4 2 (set (reg:DI 165 [ val+8 ])
        (reg:DI 5 5 [ val+8 ])) {*movdi_internal64})
...
(insn 9 8 23 2 (parallel [
            (set (reg:DI 167 [ D.2366+8 ])
                (plus:DI (plus:DI (reg:DI 165 [ val+8 ])
                        (reg:DI 169 [+8 ]))
                    (reg:DI 76 ca)))
            (clobber (reg:DI 76 ca))
        ]) {*adddi3_carry_in_internal})
...
(insn 24 23 15 2 (set (reg:DI 4 4 [+8 ])
        (reg:DI 167 [ D.2366+8 ])) {*movdi_internal64})

So, a move copying an argument register to a pseudo, one insn from the body of the function, and a move copying a pseudo to a result register. The thought I had was: It is really combine's business to look at copies from/to ABI mandated hard registers? Isn't removing the copies something that register allocation can do better?

Yes, a well-known problem, and one I intended to fix for GCC 6.

If so, then combine is doing unnecessary work.

Not only that, it pessimises generated code (as you see here; but more often it prevents optimal register allocation because it already has put a hard reg somewhere).

As a quick hack, I tried the following.

Index: gcc/combine.c
===
--- gcc/combine.c (revision 223431)
+++ gcc/combine.c (working copy)
@@ -1281,6 +1281,16 @@ combine_instructions (rtx_insn *f, unsigned int nr
       if (!NONDEBUG_INSN_P (insn))
         continue;

+      if (this_basic_block == EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb)

Are these copies guaranteed to (still) be in this basic block, after the passes before combine? Did those passes do anything to prevent moving it? I'm asking because it would be good to use the same conditions in that case.

+        {
+          rtx set = single_set (insn);
+          if (set
+              && REG_P (SET_DEST (set))
+              && HARD_REGISTER_P (SET_DEST (set))
+              && REG_P (SET_SRC (set)))
+            continue;
+        }
+
       while (last_combined_insn
              && last_combined_insn->deleted ())
         last_combined_insn = PREV_INSN (last_combined_insn);

This cures the