Re: [C++ Patch] PR 62232
Hi Paolo, On Wed, 17 Sep 2014, Paolo Carlini wrote: clang recently, in 3.5.0 if I remember correctly, stopped -Wnon-virtual-dtor warning for final classes. I think it makes sense to do the same. mind adding a note to wwwdocs/htdocs/gcc-5/changes.html ? Thanks, Gerald
Re: [patch i386]: Sibcall tail-call improvement and partial fix PR/60104
On 09/18/14 15:26, Kai Tietz wrote: Hi, it isn't true that I didn't replied to Iant. I did this on IRC. As I explained there already, this hunk about thunks is more consolidation of code-paths in that function, and not really part of a feature. As this code-path isn't prominent mark being Darwin-code - and please don't take me wrong, but it seems to be until now the only target reporting this issues - and therefore I strongly see the issue to be solved for Darwin. I don't see that this changes needs an additional testcase demonstration on a already regression-tested target that it doesn't break ... This is somehow like asking for gcc-testcase demostration that gcc's darwin target isn't responsible for earth's warming ... I found this a bit difficult to parse, so I'm going to try and summarize, please tell me if I've got it right or wrong. The code in question is not explicitly marked as being Darwin specific; however, to date we've only managed to exercise it on Darwin. Therefore, any fix is likely to be fairly specific to Darwin's unique characteristics. Furthermore, Kai believes that any new test would be redundant with the existing tests that are currently failing on Darwin. Is that a correct summary? Jeff
Re: [PATCH 1/5] Allow *_HARD_REG_SET arguments to be const
On 09/18/14 04:09, Richard Sandiford wrote: Patch 4 needs to pass a const HARD_REG_SET to AND/COPY_HARD_REG_SET. This patch allows that for all intent-in arguments. gcc/ * hard-reg-set.h (COPY_HARD_REG_SET, COMPL_HARD_REG_SET) (AND_HARD_REG_SET, AND_COMPL_HARD_REG_SET, IOR_HARD_REG_SET) (IOR_COMPL_HARD_REG_SET): Allow the from set to be constant. Seems like a good cleanup in and of itself. Fine for the trunk independent of the other changes. jeff
Re: [PATCH 0/5] Fix handling of word subregs of wide registers
On 09/18/14 04:07, Richard Sandiford wrote: This series is a cleaned-up version of: https://gcc.gnu.org/ml/gcc/2014-03/msg00163.html The underlying problem is that the semantics of subregs depend on the word size. You can't have a subreg for byte 2 of a 4-byte word, say, but you can have a subreg for word 2 of a 4-word value (as well as lowpart subregs of that word, etc.). This causes problems when an architecture has wider-than-word registers, since the addressability of a word can then depend on which register class is used. The register allocators need to fix up cases where a subreg turns out to be invalid for a particular class. This is really an extension of what we need to do for CANNOT_CHANGE_MODE_CLASS. Tested on x86_64-linux-gnu, powerpc64-linux-gnu and aarch64_be-elf. I thought we fixed these problems long ago with the change to subreg_byte?!? Clearly I missed something. jeff
Re: [PATCH 2/5] Tweak subreg_get_info documentation
On 09/18/14 04:10, Richard Sandiford wrote: Try to clarify what subreg_get_info does and doesn't check. gcc/ * rtl.h (subreg_info): Expand commentary * rtlanal.c (subreg_get_info): Likewise. OK independently of the other patches as well. jeff
Re: [PATCH 3/5] Use simplify_subreg_regno in combine.c:subst
On 09/18/14 04:13, Richard Sandiford wrote: combine.c:subst should refuse to substitute a hard register into a subreg if the new subreg would not be simplified to a simple hard register, since the result would have to be reloaded. This is more for optimisation than correctness, since in theory the RA should be able to fix up any unsimplified subregs. gcc/ * combine.c (subst): Use simplify_subreg_regno rather than REG_CANNOT_CHANGE_MODE_P to detect invalid mode changes. OK independent of other changes. jeff
Re: Ping: [RFA 1/2]: Don't ignore target_header_dir when deciding inhibit_libc
On 09/12/14 11:49, Hans-Peter Nilsson wrote: Ping! http://gcc.gnu.org/ml/gcc-patches/2014-09/msg00402.html From: Hans-Peter Nilsson h...@axis.com Date: Thu, 4 Sep 2014 23:40:40 +0200 The directory at $target_header_dir is already inspected in gcc/configure, for e.g. glibc version and stack protector support, but not for setting inhibit_libc. This is just inconsistent and the obvious resolution to me is to inhibit inhibit_libc when a target *does* have its own set of headers, to quote the comment above the inhibit_libc setting. There is nothing in the build log for make all-gcc that shows a difference with/without --with-headers, if headers are actually present anyway! It may seem that libgcc/configure.ac would be the appropriate place to patch and test, but it is gcc/configure.ac which tests various things about target headers and makes the inhibit_libc decision, exporting it through the generated obj/gcc/libgcc.mvars that is included in libgcc/Makefile. Tested before/after by make all-gcc on native x86_64-linux (*a) and seeing it still set (for the peace of most users) in gcc/Makefile, and cross to mipsel-linux make all-gcc with/without (*b,c) a pre-installed set of headers just implied by --prefix and --target to observe the intended difference and the same with (*d) --with-sysroot (but no headers at the sysroot) and (*e) --with-build-sysroot and both (*f) (note that --with-build-sysroot=... without --with-sysroot also got inhibit_libc) to observe no change for the --with-sysroot one (still no inhibit_libc). The same with --with-headers (*g). Also, I checked that nothing other than the inhibit_libc code uses target_header_dir or sets the used variables in the block of code involved in the move. Ok to commit? gcc: * configure.ac (target_header_dir): Move block defining this to before the block setting inhibit_libc. (inhibit_libc): When considering $with_headers, just check it it's explicitly no. If not, also check if $target_header_dir/stdio.h is present. If not, set inhibit_libc=true. * configure: Regenerate. OK. jeff
Re: [RFA 2/2]: --enable-explicit-exception-frame-registration compatibility option
On 09/12/14 11:51, Hans-Peter Nilsson wrote: Ping! http://gcc.gnu.org/ml/gcc-patches/2014-09/msg00403.html From: Hans-Peter Nilsson h...@axis.com Date: Thu, 4 Sep 2014 23:42:28 +0200 This adds an option to allow programs and libraries built *without* inhibit_libc to stay compatible with system libraries (really: libgcc_s.so.1) built *with* inhibit_libc, at the cost of the registration. As mentioned, that's a one-way compatibility barrier. While it's nice to avoid the overhead of a function call at DSO/program initialization time, using eh-registry isn't that much of overhead until an exception is thrown: most of the execution-time overhead will come from the additional symbol-table-entries due to the registry calls (typically 2 in all libraries and programs, as weak references) and calls to thread-locking as the (main) part of its work-load. Besides the symbol-table-entries the footprint size is typically the code for if (__register_frame_info) __register_frame_info (__EH_FRAME_BEGIN__, object); and the corresponding deregistration and the static object struct (6 or 7 pointers). (As mentioned, the library-part of the eh-registry support is present either way.) So, a low-cost compatibility path is to always call __register_frame_info, despite favorable conditions for phdr usage. Here's a patch to optionally do that, controlled by a configure-time option tentatively called --enable-explicit-exception-frame-registration (subject to bikeshedding if only for the length). Note that there is a cost when an exception *is* thrown, the dreaded sorting of FDE:s. There seems to be some obvious room for improvement though, as the same information is available *without* sorting through the PT_GNU_EH_FRAME header entry for the same file. Tested with no regressions after fixing g++.old-deja/g++.eh/badalloc1.C (see http://gcc.gnu.org/ml/gcc-patches/2014-09/msg00115.html) for native x86_64-linux before compared to with patch/with patch and --enable-explicit-exception-frame-registration/. Ok to commit? libgcc: * crtstuff.c [EH_FRAME_SECTION_NAME !USE_PT_GNU_EH_FRAME]: Sanity-check USE_EH_FRAME_REGISTRY_ALWAYS against EH_FRAME_SECTION_NAME, emit error if unsane. (USE_EH_FRAME_REGISTRY): Let USE_EH_FRAME_REGISTRY_ALWAYS override USE_PT_GNU_EH_FRAME. * Makefile.in (FORCE_EXPLICIT_EH_REGISTRY): New variable for substituted force_explicit_eh_registry. (CRTSTUFF_CFLAGS): Add FORCE_EXPLICIT_EH_REGISTRY. * configure.ac (explicit-exception-frame-registration): New AC_ARG_ENABLE. * configure: Regenerate. OK. jeff
RE: [PATCH] Fix PR63152
The following fixes PR63152 zeroing the data field only for allocatables, not pointers. The benefit of the patch is a small speedup, and it avoids that code starts to rely on behavior that is undefined in the standard. With this patch, something like INTEGER, DIMENSION(:), POINTER :: foo IF (ASSOCIATED(foo)) ... will be detected by valgrind as undefined behavior. The code you touch is exercised in four different cases, as far as I can see from the assert earlier in the function: gcc_assert (sym-attr.pointer || sym-attr.allocatable || sym_has_alloc_comp || has_finalizer); So do we want to test (sym-attr.allocatable), or (!sym-attr.pointer)? Or, asked another way: should we NULLIFY in the case of sym_has_alloc_comp || has_finalizer? Hi FX, thanks for your good question. I think it is equivalent, as it seems that GFC_DESCRIPTOR_TYPE_P (type) implies either sym-attr.allocatable or sym-attr.pointer. To check, I rank a check-fortran with the explicit patch below, and this made no difference. Code gen for a number of additional testcases involving alloc_comp and finalizers looked good as well. So, I think the original patch is still fine. Joost Index: trans-array.c === --- trans-array.c (revision 215373) +++ trans-array.c (working copy) @@ -8647,9 +8647,18 @@ gfc_trans_deferred_array (gfc_symbol * s type = TREE_TYPE (descriptor); } - /* NULLIFY the data pointer. */ + /* NULLIFY the data pointer, for non-saved allocatables. */ if (GFC_DESCRIPTOR_TYPE_P (type) !sym-attr.save) -gfc_conv_descriptor_data_set (init, descriptor, null_pointer_node); +{ + if (sym-attr.allocatable) +{ + gfc_conv_descriptor_data_set (init, descriptor, null_pointer_node); +} + else +{ + if (!sym-attr.pointer) gcc_unreachable (); +} +} gfc_restore_backend_locus (loc); gfc_init_block (cleanup);
Re: [PATCH 0/5] Fix handling of word subregs of wide registers
Jeff Law l...@redhat.com writes: On 09/18/14 04:07, Richard Sandiford wrote: This series is a cleaned-up version of: https://gcc.gnu.org/ml/gcc/2014-03/msg00163.html The underlying problem is that the semantics of subregs depend on the word size. You can't have a subreg for byte 2 of a 4-byte word, say, but you can have a subreg for word 2 of a 4-word value (as well as lowpart subregs of that word, etc.). This causes problems when an architecture has wider-than-word registers, since the addressability of a word can then depend on which register class is used. The register allocators need to fix up cases where a subreg turns out to be invalid for a particular class. This is really an extension of what we need to do for CANNOT_CHANGE_MODE_CLASS. Tested on x86_64-linux-gnu, powerpc64-linux-gnu and aarch64_be-elf. I thought we fixed these problems long ago with the change to subreg_byte?!? No, that was fixing something else. (I'm just about old enough to remember that too!) The problem here is that (say): (subreg:SI (reg:DI X) 4) is independently addressable on little-endian AArch32 if X assigned to a GPR, but not if X is assigned to a vector register. We need to allow these kinds of subreg on pseudos in order to decompose multiword arithmetic. It's then up to the RA to realise that a reload would be needed if X were assigned to a vector register, since the upper half of a vector register cannot be independently accessed. Note that you could write this example even with the old word-style offsets and IIRC the effect would have been the same. Thanks, Richard
Re: [PATCH, i386, Pointer Bounds Checker 31/x] Pointer Bounds Checker builtins for i386 target
On 18 Sep 19:33, Uros Bizjak wrote: On Thu, Sep 18, 2014 at 3:47 PM, Ilya Enkovich enkovich@gmail.com wrote: Thanks for your comments. Below is a fixed verison. Ilya -- OK with a few nits below. Thanks, Uros. + +case IX86_BUILTIN_BNDRET: + arg0 = CALL_EXPR_ARG (exp, 0); + gcc_assert (TREE_CODE (arg0) == SSA_NAME); + target = chkp_get_rtl_bounds (arg0); Please add vertical space here ... + /* If no bounds were specified for returned value, +then use INIT bounds. It usually happens when +some built-in function is expanded. */ + if (!target) + { + rtx t1 = gen_reg_rtx (Pmode); + rtx t2 = gen_reg_rtx (Pmode); + target = gen_reg_rtx (BNDmode); + emit_move_insn (t1, const0_rtx); + emit_move_insn (t2, constm1_rtx); + emit_insn (BNDmode == BND64mode +? gen_bnd64_mk (target, t1, t2) +: gen_bnd32_mk (target, t1, t2)); + } ... and here. + gcc_assert (target REG_P (target)); + return target; + + + /* Generate mem expression to access UB. */ + hmem = adjust_address (mem, Pmode, GET_MODE_SIZE (Pmode)); Vertical space here ... + /* We need to inverse all bits of UB. */ + res = expand_simple_unop (Pmode, NOT, hmem, target, 1); ... and here. + if (res != target) + emit_move_insn (target, res); + Thank you for detailed review of this patch. Here is a final version. Ilya -- 2014-09-19 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386-builtin-types.def (BND): New. (ULONG): New. (BND_FTYPE_PCVOID_ULONG): New. (VOID_FTYPE_BND_PCVOID): New. (VOID_FTYPE_PCVOID_PCVOID_BND): New. (BND_FTYPE_PCVOID_PCVOID): New. (BND_FTYPE_PCVOID): New. (BND_FTYPE_BND_BND): New. (PVOID_FTYPE_PVOID_PVOID_ULONG): New. (PVOID_FTYPE_PCVOID_BND_ULONG): New. (ULONG_FTYPE_VOID): New. (PVOID_FTYPE_BND): New. * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h. (ix86_builtins): Add IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (builtin_isa): Add leaf_p and nothrow_p fields. (def_builtin): Initialize leaf_p and nothrow_p. (ix86_add_new_builtins): Handle leaf_p and nothrow_p flags. (bdesc_mpx): New. (bdesc_mpx_const): New. (ix86_init_mpx_builtins): New. (ix86_init_builtins): Call ix86_init_mpx_builtins. (ix86_emit_cmove): New. (ix86_emit_move_max): New. (ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def index 35c0035..989297a 100644 --- a/gcc/config/i386/i386-builtin-types.def +++ b/gcc/config/i386/i386-builtin-types.def @@ -47,6 +47,7 @@ DEF_PRIMITIVE_TYPE (UCHAR, unsigned_char_type_node) DEF_PRIMITIVE_TYPE (QI, char_type_node) DEF_PRIMITIVE_TYPE (HI, intHI_type_node) DEF_PRIMITIVE_TYPE (SI, intSI_type_node) +DEF_PRIMITIVE_TYPE (BND, pointer_bounds_type_node) # ??? Logically this should be intDI_type_node, but that maps to long # with 64-bit, and that's not how the emmintrin.h is written. Again, # changing this would change name mangling. @@ -60,6 +61,7 @@ DEF_PRIMITIVE_TYPE (USHORT, short_unsigned_type_node) DEF_PRIMITIVE_TYPE (INT, integer_type_node) DEF_PRIMITIVE_TYPE (UINT, unsigned_type_node) DEF_PRIMITIVE_TYPE (UNSIGNED, unsigned_type_node) +DEF_PRIMITIVE_TYPE (ULONG, long_unsigned_type_node) DEF_PRIMITIVE_TYPE (LONGLONG, long_long_integer_type_node) DEF_PRIMITIVE_TYPE (ULONGLONG, long_long_unsigned_type_node) DEF_PRIMITIVE_TYPE (UINT8, unsigned_char_type_node) @@ -806,3 +808,15 @@ DEF_FUNCTION_TYPE_ALIAS (V2DI_FTYPE_V2DI_V2DI, TF) DEF_FUNCTION_TYPE_ALIAS (V4SF_FTYPE_V4SF_V4SF, TF) DEF_FUNCTION_TYPE_ALIAS (V4SI_FTYPE_V4SI_V4SI, TF) DEF_FUNCTION_TYPE_ALIAS (V8HI_FTYPE_V8HI_V8HI, TF) + +# MPX builtins +DEF_FUNCTION_TYPE (BND, PCVOID, ULONG) +DEF_FUNCTION_TYPE (VOID, PCVOID, BND) +DEF_FUNCTION_TYPE (VOID, PCVOID, BND, PCVOID) +DEF_FUNCTION_TYPE (BND, PCVOID, PCVOID) +DEF_FUNCTION_TYPE (BND, PCVOID) +DEF_FUNCTION_TYPE (BND, BND, BND) +DEF_FUNCTION_TYPE (PVOID, PVOID, PVOID, ULONG) +DEF_FUNCTION_TYPE (PVOID, PCVOID, BND, ULONG) +DEF_FUNCTION_TYPE (ULONG, VOID) +DEF_FUNCTION_TYPE (PVOID, BND) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index
Re: [patch i386]: Sibcall tail-call improvement and partial fix PR/60104
Hello Jeff, all, On 19 Sep 2014, at 05:14, Jeff Law wrote: On 09/18/14 16:20, Iain Sandoe wrote: 1. There has been a change made to make the upper path like the lower path (as you said on IRC). - apparently (from our conversation) you don't expect this to be a general optimisation improvement. - but it doesn't seem to be either obvious or a cleanup to me - if you are asserting that it is a cleanup, then some explanation is in order. Seems reasonable. 2. Apparently you don't think it is necessary to have any testcase to demonstrate that the new code is working? - perhaps you are asserting that the code is correct by inspection? A testcase would be nice to add and I'd strongly prefer to have one in the suite -- even if it's Darwin specific. I have not succeeded in getting this code-path to trigger on Linux** - on irc, IIUC, Kai was saying that he didn't expect it would trigger on either Windows or Linux? * if it's really intended to be mach-o specific then: a) it should have an if (TARGET_MACHO ...) so that it's DCE'd for other targets. b) I'd like a clear explanation of what it's supposed to do so that we can examine why it doesn't do that.. c) ..and, until we fix it it, it should be disabled or left out. * If it's not darwin-specific then surely a change in code-gen must be accompanied by some test that demonstrates it DTRT on at least one target? -- ** I put a gcc_abort() in the path and ran bootstrap and make check on x86-64-Linux without seeing any aborts from this (so currently i have no non-darwin data). My understanding is the problem is Darwin's linker (or dynamic loader?) can't handle the code we end up generating. While there may be other systems where one could show the problem due to limitations of the linker/loader on thos systems, probably the most common will be Darwin. So we probably need a Darwin specific test. This is not a ld64 or dyld issue. The fail is a compiler ICE (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61387) Kai's speculation is that there's some unspecified fault with Darwin's address legalisation, although I have pointed out that for x86_64, the darwin and linux ABIs are the same, so there's a lot of shared code (the code is triggered by a different symbol visibility - which might, in itself, be a bug but that's separate from the matter at hand). --- IMO, we need to put Darwin to one side, and have a clear explanation of what the change is supposed to achieve on a NON-Darwin system. However, I don't think Kai has a Darwin box to do the necessary testing. If you, or someone, could build a test and verify it fails without Kai's patch, then passes with Kai's patch, that'd be quite helpful. If there's a genuine Darwin problem, then we will try to fix it, of course; and Darwin folks will be happy to test prospective patches. - but I remain concerned that this is just papering over an underlying issue with the change, that is being thrown up by Darwin (by luck that it happens to trigger the code path). 3. You don't seem to think it necessary to amend the comments in the code to reflect the new functionality? Kai, can you update the comments, please? thanks for reviewing, Iain Jeff
Re: [patch i386]: Sibcall tail-call improvement and partial fix PR/60104
I found this a bit difficult to parse, so I'm going to try and summarize, please tell me if I've got it right or wrong. The code in question is not explicitly marked as being Darwin specific; however, to date we've only managed to exercise it on Darwin. Therefore, any fix is likely to be fairly specific to Darwin's unique characteristics. Furthermore, Kai believes that any new test would be redundant with the existing tests that are currently failing on Darwin. Is that a correct summary? That seems correct, yes. Something in Darwin’s handling of visibility triggers it. One more point, unanswered in what I’ve seen, is this from Iain: b) I'd like a clear explanation of what it's supposed to do so that we can examine why it doesn't do that.. c) ..and, until we fix it it, it should be disabled or left out. FX
Commit: RL78: Fix poping of registers in G10 mode
Hi Guys, I am applying the patch below to fix a problem with the RL78's popping of pushed registers in G10 mode. The problem was that the pop uses a two instruction sequence and dead code elimination was deleting the second instruction (the move of the popped value from register A to the correct register) because the popped values were never used. Cheers Nick gcc/ChangeLog 2014-09-19 Nick Clifton ni...@redhat.com * config/rl78/rl78.c (rl78_expand_epilogue): Generate a USE of the pop'ed registers so that DCE does not eliminate them. Index: gcc/config/rl78/rl78.c === RCS file: /cvs/cvsfiles/gnupro/gcc/config/rl78/rl78.c,v retrieving revision 1.7.4.2 diff -u -3 -p -r1.7.4.2 rl78.c @@ -1219,10 +1219,19 @@ rl78_expand_epilogue (void) for (i = 15; i = 0; i--) if (cfun-machine-need_to_push [i]) { + rtx dest = gen_rtx_REG (HImode, i * 2); + if (TARGET_G10) { - emit_insn (gen_pop (gen_rtx_REG (HImode, 0))); - emit_move_insn (gen_rtx_REG (HImode, i*2), gen_rtx_REG (HImode, 0)); + rtx ax = gen_rtx_REG (HImode, 0); + + emit_insn (gen_pop (ax)); + if (i != 0) + { + emit_move_insn (dest, ax); + /* Generate a USE of the pop'd register so that DCE will not eliminate the move. */ + emit_insn (gen_use (dest)); + } } else { @@ -1233,7 +1242,7 @@ rl78_expand_epilogue (void) emit_insn (gen_sel_rb (GEN_INT (need_bank))); rb = need_bank; } - emit_insn (gen_pop (gen_rtx_REG (HImode, i * 2))); + emit_insn (gen_pop (dest)); } }
Re: [PATCH] Add D demangling support to libiberty
On 4 August 2014 16:52, Ian Lance Taylor i...@google.com wrote: On Sun, Aug 3, 2014 at 11:12 AM, Iain Buclaw ibuc...@gdcproject.org wrote: - I haven't signed any copyright assignments to GCC. But I have papers from Donald ready to send across. Definitely necessary before we can consider this. Please get this squared away before proceeding with this patch. Let us know if you need any help with this. I've been informed by FSF assignments that this has now been processed. Is there anything left for me to address technically about this change? Iain.
Re: [PATCH, i386, Pointer Bounds Checker 33/x] MPX ABI
On 18 Sep 21:54, Uros Bizjak wrote: Hello! 2014-06-11 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386.c (ix86_option_override_internal): Do not support x32 with MPX. is not available. (init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt and force_bnd_pass. (function_arg_advance_32): Return number of used integer registers. (function_arg_advance_64): Likewise. (function_arg_advance_ms_64): Likewise. (ix86_function_arg_advance): Handle pointer bounds. (ix86_function_arg): Likewise. (ix86_function_value_regno_p): Mark fisrt bounds registers as possible function value. (ix86_function_value_1): Handle pointer bounds type/mode (ix86_return_in_memory): Likewise. (ix86_print_operand): Analyse insn to decide abounfbnd prefix. (ix86_expand_call): Generate returned bounds. (ix86_bnd_prefixed_insn_p): Check if we have instrumented call or function. * config/i386/i386.h (ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and stdarg fields. * config/i386/i386.md (UNSPEC_BNDRET): New. (*call_value): Add returned bounds. (*sibcall_value): Likewise. (*call_value_rex64_ms_sysv): Likewise. (*call_value_pop): Likewise. (*sibcall_value_pop): Likewise. * config/i386/predicates.md (call_rex64_ms_sysv_operation): Adjust to changed call patterns. -static void +static int function_arg_advance_32 (CUMULATIVE_ARGS *cum, enum machine_mode mode, const_tree type, HOST_WIDE_INT bytes, HOST_WIDE_INT words) Please also update function comments when function is changed. A couple of places. + exam = examine_argument (mode, type, 0, int_nregs, sse_nregs); - if (examine_argument (mode, type, 0, int_nregs, sse_nregs) + if (exam sse_nregs = cum-sse_nregs int_nregs = cum-nregs) { cum-nregs -= int_nregs; cum-sse_nregs -= sse_nregs; cum-regno += int_nregs; cum-sse_regno += sse_nregs; + return int_nregs; Please note that examine_argument was changed recently to return true if argument is to be passed in memory. The patch doesn't reflect that, please update the patch. Uros. Here is an updated version of this patch. Thanks, Ilya -- 2014-09-19 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386.c (ix86_option_override_internal): Do not support x32 with MPX. is not available. (init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt and force_bnd_pass. (function_arg_advance_32): Return number of used integer registers. (function_arg_advance_64): Likewise. (function_arg_advance_ms_64): Likewise. (ix86_function_arg_advance): Handle pointer bounds. (ix86_function_arg): Likewise. (ix86_function_value_regno_p): Mark fisrt bounds registers as possible function value. (ix86_function_value_1): Handle pointer bounds type/mode (ix86_return_in_memory): Likewise. (ix86_print_operand): Analyse insn to decide abounfbnd prefix. (ix86_expand_call): Generate returned bounds. (ix86_bnd_prefixed_insn_p): Check if we have instrumented call or function. * config/i386/i386.h (ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and stdarg fields. * config/i386/i386.md (UNSPEC_BNDRET): New. (*call_value): Add returned bounds. (*sibcall_value): Likewise. (*call_value_rex64_ms_sysv): Likewise. (*call_value_pop): Likewise. (*sibcall_value_pop): Likewise. * config/i386/predicates.md (call_rex64_ms_sysv_operation): Adjust to changed call patterns. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 4a58190..da7ffa8 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -3712,6 +3712,9 @@ ix86_option_override_internal (bool main_args_p, if (TARGET_X32 (opts-x_ix86_isa_flags OPTION_MASK_ISA_MPX)) error (Intel MPX does not support x32); + if (TARGET_X32 (ix86_isa_flags OPTION_MASK_ISA_MPX)) +error (Intel MPX does not support x32); + if (!strcmp (opts-x_ix86_arch_string, generic)) error (generic CPU can be used only for %stune=%s %s, prefix, suffix, sw); @@ -6198,10 +6201,15 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, /* Argument info to initialize */ FIXME: once typesytem is fixed, we won't need this code anymore. */ if (i i-local i-can_change_signature) fntype = TREE_TYPE (fndecl); + cum-stdarg = fntype ? stdarg_p (fntype) : false; cum-maybe_vaarg = (fntype ? (!prototype_p (fntype) || stdarg_p (fntype)) : !libname); + cum-bnd_regno =
Re: [PATCH] Fix PR63152
thanks for your good question. I think it is equivalent, as it seems that GFC_DESCRIPTOR_TYPE_P (type) implies either sym-attr.allocatable or sym-attr.pointer. To check, I rank a check-fortran with the explicit patch below, and this made no difference. Code gen for a number of additional testcases involving alloc_comp and finalizers looked good as well. So, I think the original patch is still fine. OK to commit, then. Thanks for the thorough answer to my question. FX
Re: [C++ Patch] PR 62232
Hi, On 09/19/2014 08:03 AM, Gerald Pfeifer wrote: Hi Paolo, On Wed, 17 Sep 2014, Paolo Carlini wrote: clang recently, in 3.5.0 if I remember correctly, stopped -Wnon-virtual-dtor warning for final classes. I think it makes sense to do the same. mind adding a note to wwwdocs/htdocs/gcc-5/changes.html ? Sure, I will! Paolo.
Re: [PATCH 1/4] [AARCH64,NEON] Add patterns + builtins for vld[234](q?)_lane_* intrinsics
Hi Charles, Good to see these intrinsics being brought into the modern world :) Some style comments inline. On 18/09/14 20:38, Charles Baylis wrote: This patch adds new patterns and builtins to represent single lane structure loads instructions, which will be used to implement the vld[234](q?)_lane_* intrinsics. Tested (with the rest of the patch series) with make check on aarch64-oe-linux with qemu, and also causes no regressions in clyon's NEON intrinsics tests. DATE Charles Baylis charles.bay...@linaro.org * config/aarch64/aarch64-builtins.c (aarch64_types_loadstruct_lane_qualifiers): Define. * config/aarch64/aarch64-simd-builtins.def (ld2_lane, ld3_lane, ld4_lane): New builtins. * config/aarch64/aarch64-simd.md (vec_load_lanesoi_lanemode): New pattern. (vec_load_lanesci_lanemode): Likewise. (vec_load_lanesxi_lanemode): Likewise. (aarch64_ld2_laneVQ:mode): New expand. (aarch64_ld3_laneVQ:mode): Likewise. (aarch64_ld4_laneVQ:mode): Likewise. This is missing an entry for the config/aarch64/aarch64.md hunk. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 493e886..f6c4018 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -4003,6 +4003,18 @@ [(set_attr type neon_load2_2regq)] ) +(define_insn vec_load_lanesoi_lanemode + [(set (match_operand:OI 0 register_operand =w) + (unspec:OI [(match_operand:V_TWO_ELEM 1 aarch64_simd_struct_operand Utv) + (match_operand:OI 2 register_operand 0) + (match_operand:SI 3 immediate_operand i) + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY) ] + UNSPEC_LD2_LANE))] + TARGET_SIMD + ld2\\t{%S0.Vetype - %T0.Vetype}[%3], %1 + [(set_attr type neon_load2_one_laneq)] +) The VQ mode iterator goes over the 128-wide modes so the type attribute here will always be neon_load2_one_lane_q. Using the q mode attribute is still correct but personally I think it makes it just that little bit harder to figure out for a newbie who will have to open iterators.md to figure out the meaning of it, or for someone who's not sure whether the 'q' is added with an underscore or without. I would just use neon_load2_one_lane_q. +(define_insn vec_load_lanesci_lanemode + [(set (match_operand:CI 0 register_operand =w) + (unspec:CI [(match_operand:V_THREE_ELEM 1 aarch64_simd_struct_operand Utv) + (match_operand:CI 2 register_operand 0) + (match_operand:SI 3 immediate_operand i) + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + UNSPEC_LD3_LANE))] + TARGET_SIMD + ld3\\t{%S0.Vetype - %U0.Vetype}[%3], %1 + [(set_attr type neon_load3_one_laneq)] +) Likewise. +(define_insn vec_load_lanesxi_lanemode + [(set (match_operand:XI 0 register_operand =w) + (unspec:XI [(match_operand:V_FOUR_ELEM 1 aarch64_simd_struct_operand Utv) + (match_operand:XI 2 register_operand 0) + (match_operand:SI 3 immediate_operand i) + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + UNSPEC_LD4_LANE))] + TARGET_SIMD + ld4\\t{%S0.Vetype - %V0.Vetype}[%3], %1 + [(set_attr type neon_load4_one_laneq)] +) Same here. +(define_expand aarch64_ld2_laneVQ:mode + [(match_operand:OI 0 register_operand =w) + (match_operand:DI 1 register_operand w) + (match_operand:OI 2 register_operand 0) + (match_operand:SI 3 immediate_operand i) + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + TARGET_SIMD +{ + enum machine_mode mode = V_TWO_ELEMmode; + rtx mem = gen_rtx_MEM (mode, operands[1]); + operands[3] = GEN_INT (ENDIAN_LANE_N (MODEmode, INTVAL (operands[3]))); + + emit_insn (gen_vec_load_lanesoi_laneVQ:mode (operands[0], + mem, + operands[2], + operands[3])); + DONE; +}) I think saying VQ:mode is redundant since VQ is the only mode iterator in the pattern. Just mode should work, right? + +(define_expand aarch64_ld3_laneVQ:mode + [(match_operand:CI 0 register_operand =w) + (match_operand:DI 1 register_operand w) + (match_operand:CI 2 register_operand 0) + (match_operand:SI 3 immediate_operand i) + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + TARGET_SIMD +{ + enum machine_mode mode = V_THREE_ELEMmode; + rtx mem = gen_rtx_MEM (mode, operands[1]); + operands[3] = GEN_INT (ENDIAN_LANE_N (MODEmode, INTVAL (operands[3]))); + + emit_insn (gen_vec_load_lanesci_laneVQ:mode (operands[0], + mem, + operands[2], + operands[3])); + DONE; +}) Likewise. +
Re: [patch] flag .persistent.bss sections as bss
On Sep 19, 2014, at 01:22 , Jeff Law l...@redhat.com wrote: * varasm.c (default_section_type_flags): Flag .persistent.bss sections as SECTION_BSS. Ok. Jeff Thanks for this review and the other ones Jeff :-)
Re: [PATCH AArch64]: Add constraint letter for stack_protect_test pattern)
On 18 September 2014 11:15, James Greenhalgh james.greenha...@arm.com wrote: gcc/ 2014-09-18 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.md (stack_protect_test_mode): Mark scratch register as an output to placate register renaming. gcc/testsuite/ 2014-09-18 James Greenhalgh james.greenha...@arm.com * gcc.dg/ssp-3.c: New. * gcc.dg/ssp-4.c: Likewise. OK, thanks /Marcus
Re: parallel check output changes?
On Thu, Sep 18, 2014 at 01:44:55PM -0500, Segher Boessenkool wrote: I am testing a patch that is just diff --git a/contrib/dg-extract-results.py b/contrib/dg-extract-results.py index cccbfd3..3781423 100644 --- a/contrib/dg-extract-results.py +++ b/contrib/dg-extract-results.py @@ -117,7 +117,7 @@ class Prog: self.tool_re = re.compile (r'^\t\t=== (.*) tests ===$') self.result_re = re.compile (r'^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED' r'|WARNING|ERROR|UNSUPPORTED|UNTESTED' - r'|KFAIL):\s*(\S+)') + r'|KFAIL):\s*(.+)') self.completed_re = re.compile (r'.* completed at (.*)') # Pieces of text to write at the head of the output. # start_line is a pair in which the first element is a datetime Tested that with four runs on powerpc64-linux, four configs each time; test-summary shows the same in all cases. Many lines have moved compared to without the patch, but that cannot be helped. Okay for mainline? 2014-09-19 Segher Boessenkool seg...@kernel.crashing.org contrib/ * dg-extract-results.py (Prog.result_re): Include options in test name. Relatedly, is it just me or are most lines of the test summaries (the # lines after ===) missing since the parallelisation patches? This is still open. I also did some timings for make -j60 -k check, same -m64,-m32,-m32/-mpowerpc64, -m64/-mlra configs. A run takes 65m, is effectively 42x parallel, and has 15% system time. Segher
Re: [PATCH, PR63229] fix assert in hash_table pch routines
tsaund...@mozilla.com writes: diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c048672..5b27aa8 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2014-09-12 Trevor Saunders tsaund...@mozilla.com + + * hash-table.h (gt_pch_nx): don't call gt_pch_note_object within an + assert. + 2014-09-12 Joseph Myers jos...@codesourcery.com * target.def (libgcc_floating_mode_supported_p): New hook. diff --git a/gcc/hash-table.h b/gcc/hash-table.h index c2a68fd..028b7de 100644 --- a/gcc/hash-table.h +++ b/gcc/hash-table.h @@ -1598,8 +1598,9 @@ templatetypename D static void gt_pch_nx (hash_tableD *h) { - gcc_checking_assert (gt_pch_note_object (h-m_entries, h, -hashtab_entry_note_pointersD)); + bool success ATTRIBUTE_UNUSED += gt_pch_note_object (h-m_entries, h, hashtab_entry_note_pointersD); + gcc_checking_assert (success); Do we need ATTRIBUTE_UNUSED here? I thought we tried to define gcc_checking_assert so that its argument appeared used even when asserts were disabled. (Sorry for the nit.) Thanks, Richard
Re: [Patch] Teach genrecog/genoutput that scratch registers require write constraint modifiers
On Thu, Sep 18, 2014 at 09:45:59PM +0100, Jeff Law wrote: On 09/18/14 04:19, James Greenhalgh wrote: Hi, As discussed in https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01334.html The construct (clobber (match_scratch 0 r)) is invalid - operand 0 must be marked either write or read/write. Likewise (match_* 0 r) is invalid, marking an operand earlyclobber does not remove the need to also mark it write or read/write. This patch adds checking for these two error conditions to the generator programs and documents the restriction. Bootstrapped on x86, ARM and AArch64 with no new issues. Ok? Thanks, James --- 2014-09-17 James Greenhalgh james.greenha...@arm.com * doc/md.texi (Modifiers): Consistently use read/write nomenclature rather than input/output. * genrecog.c (constraints_supported_in_insn_p): New. (validate_pattern): If needed, also check constraints on MATCH_SCRATCH operands. * genoutput.c (validate_insn_alternatives): Catch earlyclobber operands with no '=' or '+' modifier. + if (c == '=' || c == '+') + seen_inout = true; Isn't seen_input poorly named here? ISTM what we're checking for is if we've seen an operand that will be written to. Doesn't it make sense to use read/write nomenclature in the comments and variable names too? So with those nits fixed, this patch is OK. Thanks Jeff, In the end I committed the attached as revision 215388. Cheers, James Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 215387) +++ gcc/ChangeLog (working copy) @@ -1,5 +1,15 @@ 2014-09-19 James Greenhalgh james.greenha...@arm.com + * doc/md.texi (Modifiers): Consistently use read/write + nomenclature rather than input/output. + * genrecog.c (constraints_supported_in_insn_p): New. + (validate_pattern): If needed, also check constraints on + MATCH_SCRATCH operands. + * genoutput.c (validate_insn_alternatives): Catch earlyclobber + operands with no '=' or '+' modifier. + +2014-09-19 James Greenhalgh james.greenha...@arm.com + * config/aarch64/aarch64.md (stack_protect_test_mode): Mark scratch register as written. Index: gcc/genoutput.c === --- gcc/genoutput.c (revision 215387) +++ gcc/genoutput.c (working copy) @@ -769,6 +769,7 @@ char c; int which_alternative = 0; int alternative_count_unsure = 0; + bool seen_write = false; for (p = d-operand[start].constraint; (c = *p); p += len) { @@ -777,6 +778,18 @@ error_with_line (d-lineno, character '%c' can only be used at the beginning of a constraint string, c); + + if (c == '=' || c == '+') + seen_write = true; + + /* Earlyclobber operands must always be marked write-only + or read/write. */ + if (!seen_write c == '') + error_with_line (d-lineno, + earlyclobber operands may not be + read-only in alternative %d, + which_alternative); + if (ISSPACE (c) || strchr (indep_constraints, c)) len = 1; else if (ISDIGIT (c)) Index: gcc/genrecog.c === --- gcc/genrecog.c (revision 215387) +++ gcc/genrecog.c (working copy) @@ -415,7 +415,19 @@ return NULL; } +/* In DEFINE_EXPAND, DEFINE_SPLIT, and DEFINE_PEEPHOLE2, we + don't use the MATCH_OPERAND constraint, only the predicate. + This is confusing to folks doing new ports, so help them + not make the mistake. */ +static bool +constraints_supported_in_insn_p (rtx insn) +{ + return !(GET_CODE (insn) == DEFINE_EXPAND + || GET_CODE (insn) == DEFINE_SPLIT + || GET_CODE (insn) == DEFINE_PEEPHOLE2); +} + /* Check for various errors in patterns. SET is nonnull for a destination, and is the complete set pattern. SET_CODE is '=' for normal sets, and '+' within a context that requires in-out constraints. */ @@ -432,7 +444,32 @@ switch (code) { case MATCH_SCRATCH: - return; + { + const char constraints0 = XSTR (pattern, 1)[0]; + + if (!constraints_supported_in_insn_p (insn)) + { + if (constraints0) + { + error_with_line (pattern_lineno, + constraints not supported in %s, + rtx_name[GET_CODE (insn)]); + } + return; + } + + /* If a MATCH_SCRATCH is used in a context requiring an write-only + or read/write register, validate that. */ + if (set_code == '=' + constraints0 != '=' + constraints0 != '+') + { + error_with_line (pattern_lineno, + operand %d missing output reload, + XINT (pattern, 0)); + } + return; + } case MATCH_DUP: case MATCH_OP_DUP: case MATCH_PAR_DUP: @@ -467,18 +504,14 @@ { const char constraints0 = XSTR (pattern, 2)[0]; - /* In DEFINE_EXPAND, DEFINE_SPLIT, and DEFINE_PEEPHOLE2, we - don't use the
Re: [PATCH][PING] Keep patch file permissions in mklog
On 18-09-14 19:46, Diego Novillo wrote: On Thu, Sep 18, 2014 at 10:56 AM, Yury Gribov y.gri...@samsung.com wrote: On 08/04/2014 12:14 PM, Tom de Vries wrote: On 04-08-14 08:45, Yury Gribov wrote: Thanks! My 2 (actually 4) cents below. Hi Yuri, thanks for the review. +if ($#ARGV == 1 ($ARGV[0] eq -i || $ARGV[0] eq --inline)) { +$diff = $ARGV[1]; Can we shift here and then just set $diff to $ARGV[0] unconditionally? Done. +if ($diff eq -) { +die Reading from - and using -i are not compatible; +} Hm, can't we dump ChangeLog to stdout in that case? The limitation looks rather strange. My original idea here was that --inline means 'in the patch file', which is not possible if the patch comes from stdin. I've now interpreted it such that --inline prints to stdout what it would print to the patch file otherwise, that is, both log and patch. Printing just the log to stdout can be already be achieved by not using --inline. +open (FILE1, '', $tmp) or die Could not open temp file; Could we use more descriptive name? I've used the slightly more descriptive 'OUTPUTFILE'. +system (cat $diff $tmp) == 0 +or die Could not append patch to temp file; ... +unlink ($tmp) == 1 or die Could not remove temp file; The checks look like an overkill given that we don't check for result of mktemp... I've added a check for the result of mktemp, and removed the unlink result check. I've left in the Could not append patch to temp file check because the patch file might be read-only. OK for trunk? Thanks, - Tom Pinging the patch for Tom. Apologies for the delay. Could someone post the latest patch. I see it's gone through a cycle of reviews and changes. Hi Diego, here it is. I've followed up on the explanation by Segher about 2.15 File module version and fixed the comment. I've not added the 2.15 file module version check on copy Segher also mentioned, since I'm not sure about that one. AFAIU the tradeoff is: - without the check the mklog still works ok for older versions of perl, apart from the permissions (the situation we're in for all versions of perl without the patch) - with the check the script won't work at all for older perl version. So it's a question of predictability (always do the same or do nothing) vs. robustness (do as much as you can given the circumstances). I'm not sure which one is better in this case. Thanks, - Tom 2014-08-11 Tom de Vries t...@codesourcery.com * mklog: Add --inline option. diff --git a/contrib/mklog b/contrib/mklog index 3d17dc5..1352e99 100755 --- a/contrib/mklog +++ b/contrib/mklog @@ -26,6 +26,9 @@ # Author: Diego Novillo dnovi...@google.com and # Cary Coutant ccout...@google.com +use File::Temp; +use File::Copy qw(cp mv); + # Change these settings to reflect your profile. $username = $ENV{'USER'}; $name = `finger $username | grep -o 'Name: .*'`; @@ -56,14 +59,22 @@ if (-d $gcc_root/.git) { # Program starts here. You should not need to edit anything below this # line. #- -if ($#ARGV != 0) { +$inline = 0; +if ($#ARGV == 1 ($ARGV[0] eq -i || $ARGV[0] eq --inline)) { + shift; + $inline = 1; +} elsif ($#ARGV != 0) { $prog = `basename $0`; chop ($prog); print EOF; -usage: $prog file.diff +usage: $prog [ -i | --inline ] file.diff Generate ChangeLog template for file.diff. It assumes that patch has been created with -up or -cp. +When -i is used, the ChangeLog template is followed by the contents of +file.diff. When file.diff is -, read standard input. +When -i is used and file.diff is not -, it writes to file.diff, otherwise it +writes to stdout. EOF exit 1; } @@ -273,8 +284,39 @@ foreach (@diff_lines) { # functions. $cl_entries{$clname} .= $change_msg ? $change_msg\n : :\n; +if ($inline $diff ne -) { + # Get a temp filename, rather than an open filehandle, because we use + # the open to truncate. + $tmp = mktemp(tmp.) or die Could not create temp file: $!; + + # Copy the permissions to the temp file (in File module version 2.15 and + #later). + cp $diff, $tmp or die Could not copy patch file to temp file: $!; + + # Open the temp file, clearing contents. + open (OUTPUTFILE, '', $tmp) or die Could not open temp file: $!; +} else { + *OUTPUTFILE = STDOUT; +} + +# Print the log foreach my $clname (keys %cl_entries) { - print $clname:\n\n$hdrline\n\n$cl_entries{$clname}\n; + print OUTPUTFILE $clname:\n\n$hdrline\n\n$cl_entries{$clname}\n; +} + +if ($inline) { + # Append the patch to the log + foreach (@diff_lines) { + print OUTPUTFILE $_\n; + } +} + +if ($inline $diff ne -) { + # Close $tmp + close(OUTPUTFILE); + + # Write new contents to $diff atomically + mv $tmp, $diff or die Could not move temp file to patch file: $!; } exit 0; -- 1.9.1
Re: [PATCH 1/4] [AARCH64,NEON] Add patterns + builtins for vld[234](q?)_lane_* intrinsics
+(define_expand aarch64_ld2_laneVQ:mode + [(match_operand:OI 0 register_operand =w) + (match_operand:DI 1 register_operand w) + (match_operand:OI 2 register_operand 0) + (match_operand:SI 3 immediate_operand i) + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + TARGET_SIMD +{ + enum machine_mode mode = V_TWO_ELEMmode; + rtx mem = gen_rtx_MEM (mode, operands[1]); + operands[3] = GEN_INT (ENDIAN_LANE_N (MODEmode, INTVAL (operands[3]))); + The endianess lane correction breaks this for BE. You don't need the endianess lane correction here - we always call neon intrinsics with the architectural lane number - irrespective of endianness. Unless ofcourse you flip it somewhere to make it a part of RTL vec_select lane patterns, which you don't here. You could also do some lane-bounds checking here in the expander. + emit_insn (gen_vec_load_lanesoi_laneVQ:mode (operands[0], + mem, + operands[2], + operands[3])); + DONE; +}) + +(define_expand aarch64_ld3_laneVQ:mode + [(match_operand:CI 0 register_operand =w) + (match_operand:DI 1 register_operand w) + (match_operand:CI 2 register_operand 0) + (match_operand:SI 3 immediate_operand i) + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + TARGET_SIMD +{ + enum machine_mode mode = V_THREE_ELEMmode; + rtx mem = gen_rtx_MEM (mode, operands[1]); + operands[3] = GEN_INT (ENDIAN_LANE_N (MODEmode, INTVAL (operands[3]))); + No endianness correction for lanes necessary. + emit_insn (gen_vec_load_lanesci_laneVQ:mode (operands[0], + mem, + operands[2], + operands[3])); + DONE; +}) + +(define_expand aarch64_ld4_laneVQ:mode + [(match_operand:XI 0 register_operand =w) + (match_operand:DI 1 register_operand w) + (match_operand:XI 2 register_operand 0) + (match_operand:SI 3 immediate_operand i) + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + TARGET_SIMD +{ + enum machine_mode mode = V_FOUR_ELEMmode; + rtx mem = gen_rtx_MEM (mode, operands[1]); + operands[3] = GEN_INT (ENDIAN_LANE_N (MODEmode, INTVAL (operands[3]))); + Same. + emit_insn (gen_vec_load_lanesxi_laneVQ:mode (operands[0], + mem, + operands[2], + operands[3])); + DONE; +}) + + + ;; Expanders for builtins to extract vector registers from large ;; opaque integer modes. diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index c60038a..ea924ab 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -92,6 +92,9 @@ UNSPEC_LD2 UNSPEC_LD3 UNSPEC_LD4 +UNSPEC_LD2_LANE +UNSPEC_LD3_LANE +UNSPEC_LD4_LANE UNSPEC_MB UNSPEC_NOP UNSPEC_PRLG_STK Thanks, Tejas.
Re: [AArch64] Tighten predicates on SIMD shift intrinsics
*Ping* Thanks, James On Thu, Sep 11, 2014 at 09:29:52AM +0100, James Greenhalgh wrote: gcc/ 2014-09-11 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64-protos.h (aarch64_simd_const_bounds): Change return type to bool. * config/aarch64/aarch64-simd.md (aarch64_surqrshlmode): Use new predicates. (aarch64_surshll2_nmode): Likewise. (aarch64_surshr_nmode): Likewise. (aarch64_sursra_nmode: Likewise. (aarch64_surslri_nmode): Likewise. (aarch64_surqshlu_nmode): Likewise. * config/aarch64/aarch64.c (aarch64_simd_const_bounds): Change return type to bool; don't print errors. * config/aarch64/iterators.md (ve_mode): New. (offsetlr): Remap to infix text for use in new predicates. * config/aarch64/predicates.md (aarch64_simd_shift_imm_qi): New. (aarch64_simd_shift_imm_hi): Likewise. (aarch64_simd_shift_imm_si): Likewise. (aarch64_simd_shift_imm_di): Likewise. (aarch64_simd_shift_imm_offset_qi): Likewise. (aarch64_simd_shift_imm_offset_hi): Likewise. (aarch64_simd_shift_imm_offset_si): Likewise. (aarch64_simd_shift_imm_offset_di): Likewise. (aarch64_simd_shift_imm_bitsize_qi): Likewise. (aarch64_simd_shift_imm_bitsize_hi): Likewise. (aarch64_simd_shift_imm_bitsize_si): Likewise. (aarch64_simd_shift_imm_bitsize_di): Likewise. gcc/testsuite/ 2014-09-08 James Greenhalgh james.greenha...@arm.com * gcc.target/aarch64/simd/vqshlb_1.c: New. diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 35f89ff..9de7af7 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -205,6 +205,7 @@ bool aarch64_regno_ok_for_base_p (int, bool); bool aarch64_regno_ok_for_index_p (int, bool); bool aarch64_simd_check_vect_par_cnst_half (rtx op, enum machine_mode mode, bool high); +bool aarch64_simd_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT); bool aarch64_simd_imm_scalar_p (rtx x, enum machine_mode mode); bool aarch64_simd_imm_zero_p (rtx, enum machine_mode); bool aarch64_simd_scalar_immediate_valid_for_move (rtx, enum machine_mode); @@ -255,7 +256,6 @@ void aarch64_emit_call_insn (rtx); /* Initialize builtins for SIMD intrinsics. */ void init_aarch64_simd_builtins (void); -void aarch64_simd_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT); void aarch64_simd_disambiguate_copy (rtx *, rtx *, rtx *, unsigned int); /* Emit code to place a AdvSIMD pair result in memory locations (with equal diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 6a45e91512ffe1c8c2ecd2b1ba4336baf87f7256..9e688e310027c772cfe5ecd4a158796b143998c5 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -3715,12 +3715,12 @@ (define_insn aarch64_surqrshlmode (define_insn aarch64_surshll_nmode [(set (match_operand:VWIDE 0 register_operand =w) (unspec:VWIDE [(match_operand:VDW 1 register_operand w) - (match_operand:SI 2 immediate_operand i)] + (match_operand:SI 2 +aarch64_simd_shift_imm_bitsize_ve_mode i)] VSHLL))] TARGET_SIMD * int bit_width = GET_MODE_UNIT_SIZE (MODEmode) * BITS_PER_UNIT; - aarch64_simd_const_bounds (operands[2], 0, bit_width + 1); if (INTVAL (operands[2]) == bit_width) { return \shll\\t%0.Vwtype, %1.Vtype, %2\; @@ -3741,7 +3741,6 @@ (define_insn aarch64_surshll2_nmode TARGET_SIMD * int bit_width = GET_MODE_UNIT_SIZE (MODEmode) * BITS_PER_UNIT; - aarch64_simd_const_bounds (operands[2], 0, bit_width + 1); if (INTVAL (operands[2]) == bit_width) { return \shll2\\t%0.Vwtype, %1.Vtype, %2\; @@ -3757,13 +3756,11 @@ (define_insn aarch64_surshll2_nmode (define_insn aarch64_surshr_nmode [(set (match_operand:VSDQ_I_DI 0 register_operand =w) (unspec:VSDQ_I_DI [(match_operand:VSDQ_I_DI 1 register_operand w) -(match_operand:SI 2 immediate_operand i)] +(match_operand:SI 2 + aarch64_simd_shift_imm_offset_ve_mode i)] VRSHR_N))] TARGET_SIMD - * - int bit_width = GET_MODE_UNIT_SIZE (MODEmode) * BITS_PER_UNIT; - aarch64_simd_const_bounds (operands[2], 1, bit_width + 1); - return \surshr\\t%v0Vmtype, %v1Vmtype, %2\; + surshr\\t%v0Vmtype, %v1Vmtype, %2 [(set_attr type neon_sat_shift_immq)] ) @@ -3773,13 +3770,11 @@ (define_insn aarch64_sursra_nmode [(set (match_operand:VSDQ_I_DI 0 register_operand =w) (unspec:VSDQ_I_DI [(match_operand:VSDQ_I_DI 1 register_operand 0) (match_operand:VSDQ_I_DI 2 register_operand w) - (match_operand:SI 3 immediate_operand i)] +
Re: [PATCHv4] Vimrc config with GNU formatting
On Fri, Sep 19, 2014 at 08:17:28AM +0400, Yury Gribov wrote: I was talking about mbr's plugin here :-) Ah, ok. Then I'll mention thinca's plugin as a secondary option with a disclaimer then. Why? There are more plugins that also do the same thing, all more popular (on vim.org at least), all less dangerous (well, many are probably just as bad :-( ). I was suggesting you could write it as :set cino=4,n-2,{2,^-2,:2,=2,g0,f0,h2,p4,t0,+2,(0,u0,w1,m0 and you'd be independent of sw setting. ... And yeah sw=2 does make sense for editing GCC, if you are used to sw=2 that is. The point is that the sw setting has nothing to do with what your text will look like, only with what keys you press. Depending on whether you treat shiftwidth as amount of spaces that is inserted when I press or default indent for a particular class of files. For example with shiftwidth=2 user could (un)indent block of C code with or which seems to be useful. Vim treats it as the former (it has no concept of the latter). cindent uses it too, for that s thing, but you do not have to use it here anyway since our coding standard requires two spaces so we can just hardcode that in cino; and in that case, a user can use any sw he wants / is used to. Using s in cino is useful if you want the layout to change if you change the shiftwidth. But we don't. Segher
Re: [PATCH 2/4] [AARCH64,NEON] Convert arm_neon.h to use new builtins for vld[234](q?)_lane_*
On 18/09/14 20:38, Charles Baylis wrote: This patch replaces the inline assembler implementations of the vld[234](q?)_lane_* intrinsics with new versions which exploit the new builtin functions added in patch 1. Tested (with the rest of the patch series) with make check on aarch64-oe-linux with qemu, and also causes no regressions in clyon's NEON intrinsics tests. DATE Charles Baylis charles.bay...@linaro.org * config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Rewrite using builtins, update uses to use new macro arguments. (__LD3_LANE_FUNC): Likewise. (__LD4_LANE_FUNC): Likewise. Change-Id: I3bd5934b5c4f6127088193c1ab12848144d5540a --- gcc/config/aarch64/arm_neon.h | 359 -- 1 file changed, 237 insertions(+), 122 deletions(-) diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index e62c783..c1fcb47 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -11805,47 +11805,79 @@ __LD2R_FUNC (uint16x8x2_t, uint16x2_t, uint16_t, 8h, u16, q) __LD2R_FUNC (uint32x4x2_t, uint32x2_t, uint32_t, 4s, u32, q) __LD2R_FUNC (uint64x2x2_t, uint64x2_t, uint64_t, 2d, u64, q) -#define __LD2_LANE_FUNC(rettype, ptrtype, regsuffix, \ - lnsuffix, funcsuffix, Q)\ - __extension__ static __inline rettype \ - __attribute__ ((__always_inline__)) \ - vld2 ## Q ## _lane_ ## funcsuffix (const ptrtype *ptr, \ -rettype b, const int c)\ - {\ -rettype result;\ -__asm__ (ld1 {v16. #regsuffix , v17. #regsuffix }, %1\n\t\ -ld2 {v16. #lnsuffix , v17. #lnsuffix }[%3], %2\n\t \ -st1 {v16. #regsuffix , v17. #regsuffix }, %0\n\t \ -: =Q(result) \ -: Q(b), Q(*(const rettype *)ptr), i(c) \ -: memory, v16, v17); \ -return result; \ - } - -__LD2_LANE_FUNC (int8x8x2_t, uint8_t, 8b, b, s8,) -__LD2_LANE_FUNC (float32x2x2_t, float32_t, 2s, s, f32,) -__LD2_LANE_FUNC (float64x1x2_t, float64_t, 1d, d, f64,) -__LD2_LANE_FUNC (poly8x8x2_t, poly8_t, 8b, b, p8,) -__LD2_LANE_FUNC (poly16x4x2_t, poly16_t, 4h, h, p16,) -__LD2_LANE_FUNC (int16x4x2_t, int16_t, 4h, h, s16,) -__LD2_LANE_FUNC (int32x2x2_t, int32_t, 2s, s, s32,) -__LD2_LANE_FUNC (int64x1x2_t, int64_t, 1d, d, s64,) -__LD2_LANE_FUNC (uint8x8x2_t, uint8_t, 8b, b, u8,) -__LD2_LANE_FUNC (uint16x4x2_t, uint16_t, 4h, h, u16,) -__LD2_LANE_FUNC (uint32x2x2_t, uint32_t, 2s, s, u32,) -__LD2_LANE_FUNC (uint64x1x2_t, uint64_t, 1d, d, u64,) -__LD2_LANE_FUNC (float32x4x2_t, float32_t, 4s, s, f32, q) -__LD2_LANE_FUNC (float64x2x2_t, float64_t, 2d, d, f64, q) -__LD2_LANE_FUNC (poly8x16x2_t, poly8_t, 16b, b, p8, q) -__LD2_LANE_FUNC (poly16x8x2_t, poly16_t, 8h, h, p16, q) -__LD2_LANE_FUNC (int8x16x2_t, int8_t, 16b, b, s8, q) -__LD2_LANE_FUNC (int16x8x2_t, int16_t, 8h, h, s16, q) -__LD2_LANE_FUNC (int32x4x2_t, int32_t, 4s, s, s32, q) -__LD2_LANE_FUNC (int64x2x2_t, int64_t, 2d, d, s64, q) -__LD2_LANE_FUNC (uint8x16x2_t, uint8_t, 16b, b, u8, q) -__LD2_LANE_FUNC (uint16x8x2_t, uint16_t, 8h, h, u16, q) -__LD2_LANE_FUNC (uint32x4x2_t, uint32_t, 4s, s, u32, q) -__LD2_LANE_FUNC (uint64x2x2_t, uint64_t, 2d, d, u64, q) +#define __LD2_LANE_FUNC(intype, vectype, largetype, ptrtype, \ +mode, ptrmode, funcsuffix, signedtype)\ +__extension__ static __inline intype __attribute__ ((__always_inline__)) \ +vld2_lane_##funcsuffix (const ptrtype * __ptr, intype __b, const int __c) \ +{ \ + __builtin_aarch64_simd_oi __o; \ + largetype __temp; \ + __temp.val[0] = \ +vcombine_##funcsuffix (__b.val[0], vcreate_##funcsuffix (0)); \ + __temp.val[1] = \ +vcombine_##funcsuffix (__b.val[1], vcreate_##funcsuffix (0)); \ + __o = __builtin_aarch64_set_qregoi##mode (__o, \ + (signedtype) __temp.val[0], \ + 0); \ + __o = __builtin_aarch64_set_qregoi##mode (__o, \ + (signedtype) __temp.val[1], \ + 1); \ + __o =__builtin_aarch64_ld2_lane##mode (
Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge
On 19/09/14 06:51, Jeff Law wrote: On 09/16/14 00:55, Zhenqiang Chen wrote: -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Jiong Wang Sent: Monday, September 15, 2014 11:28 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org; Jeff Law Subject: Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge On 08/05/14 09:07, Zhenqiang Chen wrote: static bool move_insn_for_shrink_wrap (basic_block bb, rtx insn, const HARD_REG_SET uses, const HARD_REG_SET defs, - HARD_REG_SET *last_uses) + HARD_REG_SET *last_uses, + bool *split_p) { rtx set, src, dest; bitmap live_out, live_in, bb_uses, bb_defs; unsigned int i, dregno, end_dregno, sregno, end_sregno; basic_block next_block; + edge live_edge; /* Look for a simple register copy. */ set = single_set (insn); @@ -5582,17 +5589,31 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn, || overlaps_hard_reg_set_p (defs, GET_MODE (dest), dregno)) return false; - /* See whether there is a successor block to which we could move INSN. */ - next_block = next_block_for_reg (bb, dregno, end_dregno); - if (!next_block) + live_edge = next_block_for_reg (bb, dregno, end_dregno); if + (!live_edge) return false; + next_block = live_edge-dest; + /* If the destination register is referred in later insn, try to forward it. */ if (overlaps_hard_reg_set_p (*last_uses, GET_MODE (dest), dregno) !try_copy_prop (bb, insn, src, dest, last_uses)) return false; + /* Create a new basic block on the edge. */ if (EDGE_COUNT + (next_block-preds) == 2) +{ + next_block = split_edge (live_edge); + + bitmap_copy (df_get_live_in (next_block), df_get_live_out + (bb)); (re-sent, looks like the first send not received by the server...) for the function _IO_wdefault_xsputn in glibc wgenops.c (.dot file attached) 174: NOTE_INSN_BASIC_BLOCK 21 are the new created basic block because of the sink of move instruction ax:DI = 0 and split edge. but the live_in of this BB is copied from live_out of BB 2 which is too big, and actually prevent the later sink of 16: r12:DI=dx:DI. Should it be better to copy live_in from next_block-next_bb instead of live_out from bb, as it will model what's needed more accurately? According to the algorithm, next_block-next_bb (which is live_edge-dest) should have two predecessors. Live_in of next_block-next_bb would include live_out of the other edge, which is not necessary accurately. To be more accurate, you may need an intersection of df_get_live_out (bb) and df_get_live_in (next_block-next_bb). Presumably we can't recompute the liveness info here as it's too expensive to do that each time we split an edge to sink a statement? ie, we need the more accurate liveness within this pass so that we can sink further instructions? for a single case, x86 bootstrap, looks like these code are not compile time critical as for all 4762 live edges which could sink instructions 4139: EDGE_COUNT (next_block-preds) == 1 270: EDGE_COUNT (next_block-preds) == 2 353: EDGE_COUNT (next_block-preds) 2 and if we make the live info more accurate here, just very few additional functions ( 10) shrink-wrapped from glibc build gcc bootstrap test. So, is it OK we simply change bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb)); into bitmap_and (df_get_live_in (next_block), df_get_live_out (bb), df_get_live_in (next_block-next_bb)); ? -- Jiong jeff
Re: Fix libgomp crash without TLS (PR42616)
I've corrected my patch accordingly to what you said. To diffirentiate second case in destructor I've added pthread_setspecific (gomp_tls_key, NULL) at the end of gomp_thread_start. So, destructor can simply skip the case when pthread_getspecific (gomp_tls_key) returns 0. I also think that it's better to set 0 in gomp_thread_start explicitly as thread data is initialized by a local variable in this function. But, I see that pthread_getspecific always returns 0 in destrucor because data pointer is implicitly set to 0 before destructor call in glibc: (pthread_create.c): /* Always clear the data. */ level2[inner].data = NULL; /* Make sure the data corresponds to a valid key. This test fails if the key was deallocated and also if it was re-allocated. It is the user's responsibility to free the memory in this case. */ if (level2[inner].seq == __pthread_keys[idx].seq /* It is not necessary to register a destructor function. */ __pthread_keys[idx].destr != NULL) /* Call the user-provided destructor. */ __pthread_keys[idx].destr (data); I suppose it's not necessary if everything is cleaned up in gomp_thread_start and destructor. What do you think? Changes are bootstrapped and regtested on x86_64-linux. 2014-09-19 Varvara Rainchik varvara.rainc...@intel.com * libgomp.h (gomp_thread): For non TLS case create thread data. * team.c (non_tls_thread_data_destructor, create_non_tls_thread_data): New functions. --- diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index bcd5b34..2f33d99 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -467,9 +467,15 @@ static inline struct gomp_thread *gomp_thread (void) } #else extern pthread_key_t gomp_tls_key; -static inline struct gomp_thread *gomp_thread (void) +extern struct gomp_thread *create_non_tls_thread_data (void); +static struct gomp_thread *gomp_thread (void) { - return pthread_getspecific (gomp_tls_key); + struct gomp_thread *thr = pthread_getspecific (gomp_tls_key); + if (thr == NULL) + { +thr = create_non_tls_thread_data (); + } + return thr; } #endif diff --git a/libgomp/team.c b/libgomp/team.c index e6a6d8f..a692df8 100644 --- a/libgomp/team.c +++ b/libgomp/team.c @@ -41,6 +41,7 @@ pthread_key_t gomp_thread_destructor; __thread struct gomp_thread gomp_tls_data; #else pthread_key_t gomp_tls_key; +struct gomp_thread initial_thread_tls_data; #endif @@ -130,6 +131,7 @@ gomp_thread_start (void *xdata) gomp_sem_destroy (thr-release); thr-thread_pool = NULL; thr-task = NULL; + pthread_setspecific (gomp_tls_key, NULL); return NULL; } @@ -222,8 +224,16 @@ gomp_free_pool_helper (void *thread_pool) void gomp_free_thread (void *arg __attribute__((unused))) { - struct gomp_thread *thr = gomp_thread (); - struct gomp_thread_pool *pool = thr-thread_pool; + struct gomp_thread *thr; + struct gomp_thread_pool *pool; +#ifdef HAVE_TLS + thr = gomp_thread (); +#else + thr = pthread_getspecific (gomp_tls_key); + if (thr == NULL) +return; +#endif + pool = thr-thread_pool; if (pool) { if (pool-threads_used 0) @@ -910,6 +920,21 @@ gomp_team_end (void) } } +/* Destructor for data created in create_non_tls_thread_data. */ + +#ifndef HAVE_TLS +void +non_tls_thread_data_destructor (void *arg __attribute__((unused))) +{ + struct gomp_thread *thr = pthread_getspecific (gomp_tls_key); + if (thr != NULL thr != initial_thread_tls_data) + { +gomp_free_thread (arg); +free (thr); +pthread_setspecific (gomp_tls_key, NULL); + } +} +#endif /* Constructors for this file. */ @@ -917,9 +942,7 @@ static void __attribute__((constructor)) initialize_team (void) { #ifndef HAVE_TLS - static struct gomp_thread initial_thread_tls_data; - - pthread_key_create (gomp_tls_key, NULL); + pthread_key_create (gomp_tls_key, non_tls_thread_data_destructor); pthread_setspecific (gomp_tls_key, initial_thread_tls_data); #endif @@ -927,6 +950,19 @@ initialize_team (void) gomp_fatal (could not create thread pool destructor.); } +/* Create data for thread created by pthread_create. */ + +#ifndef HAVE_TLS +struct gomp_thread *create_non_tls_thread_data (void) +{ + struct gomp_thread *thr = gomp_malloc_cleared (sizeof (struct gomp_thread)); + pthread_setspecific (gomp_tls_key, thr); + gomp_sem_init (thr-release, 0); + + return thr; +} +#endif + static void __attribute__((destructor)) team_destructor (void) { 2014-09-02 14:36 GMT+04:00 Varvara Rainchik varvara.s.rainc...@gmail.com: May I use gomp_free_thread as a destructor for pthread_key_create? Then I'll make initial_thread_tls_data global for the first case, but how can I differentiate thread created by gomp_thread_start (second case)? 2014-09-01 14:51 GMT+04:00 Jakub Jelinek ja...@redhat.com: On Fri, Aug 29, 2014 at 10:40:57AM -0700, Richard Henderson wrote: On 08/06/2014 03:05 AM, Varvara Rainchik wrote: * libgomp.h (gomp_thread): For non TLS case create thread data. * team.c
Re: [PATCH, i386, Pointer Bounds Checker 32/x] Pointer Bounds Checker hooks for i386 target
On 18 Sep 21:09, Jeff Law wrote: On 09/18/14 13:34, Uros Bizjak wrote: 2014-06-11 18:00 GMT+04:00 Ilya Enkovich enkovich@gmail.com: Hi, This patch adds i386 target hooks for Pointer Bounds Checker. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-11 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386.c: Include tree-iterator.h. (ix86_function_value_bounds): New. (ix86_builtin_mpx_function): New. (ix86_load_bounds): New. (ix86_store_bounds): New. (ix86_load_returned_bounds): New. (ix86_store_returned_bounds): New. (ix86_fn_abi_va_list_bounds_size): New. (ix86_mpx_bound_mode): New. (ix86_make_bounds_constant): New. (ix86_initialize_bounds): (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. I have comments from the implementation side, but IMO Jeff (CC'd) should give the final approval on the functionality and general approach of the patch. I was not able to follow the meaning and logic behind SLOT (which may be register ?), PTR, TO, and special BND addresses. Most, if not all of the general principle has been approved, including the creation of the target hooks that Ilya wants to exploit in the x86 backend. Given that we've got two folks (Uros myself) that have really struggled wrapping our heads around the docs for the new hooks, particularly those related to passing parameters their bounds, I'd like Ilya to make another attempt to clarify the docs around those hooks. I recall I worked on documentation of target hooks used to load/store bounds and believe I made it much cleaner. I didn't duplicate new descriptions for i386 implementations though. Thus I suppose Uros read old less informative descriptions in i386.c rather than better ones in tm.texi. I copy more informative descriptions now to i386.c and hope it will be more clear what these functions do. Thanks, Ilya Uros, if you could go ahead and give your implementation side comments, it'd be appreciated. I wouldn't expect major changes to the implementation to occur as a result of further iteration on the docs for the hooks. Thanks, Jeff
Re: [PATCH] Do not allow non-allocatable registers in scratch_operand
On Thu, Sep 18, 2014 at 11:54:46PM -0600, Jeff Law wrote: Shouldn't you be testing if the register is fixed rather than its class? Or maybe both? register_operand (via general_operand) uses operand_reg_set for this; it is initialised via the regclass NO_REGS too (and other things). This would work for us (rs6000) too, or indeed fixed_regs[], or even wider classes. I have no idea if it would hurt other targets though, and I have no desire to test all weirdo targets ;-) If another regclass reg is assigned to a match_scratch (and then via a splitter to a match_operand register), reload can fix it up. It cannot fix up things properly for regs of NO_REGS class. This is what this patch is for: a class NO_REGS reg cannot ever work as a match_scratch. I could test for a wider class if you want, but I'll need some guidance what to test for exactly then; existing code isn't exactly consistent. Segher
Re: [PATCH][PING] Keep patch file permissions in mklog
On Fri, Sep 19, 2014 at 12:41:53PM +0200, Tom de Vries wrote: I've followed up on the explanation by Segher about 2.15 File module version and fixed the comment. I've not added the 2.15 file module version check on copy Segher also mentioned, since I'm not sure about that one. AFAIU the tradeoff is: - without the check the mklog still works ok for older versions of perl, apart from the permissions (the situation we're in for all versions of perl without the patch) Sounds good then. The module is called File::Copy fwiw, not File. The patch looks to me like it should work fine. Segher
Re: [PATCH, i386, Pointer Bounds Checker 32/x] Pointer Bounds Checker hooks for i386 target
On 18 Sep 21:34, Uros Bizjak wrote: 2014-06-11 18:00 GMT+04:00 Ilya Enkovich enkovich@gmail.com: Hi, This patch adds i386 target hooks for Pointer Bounds Checker. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-11 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386.c: Include tree-iterator.h. (ix86_function_value_bounds): New. (ix86_builtin_mpx_function): New. (ix86_load_bounds): New. (ix86_store_bounds): New. (ix86_load_returned_bounds): New. (ix86_store_returned_bounds): New. (ix86_fn_abi_va_list_bounds_size): New. (ix86_mpx_bound_mode): New. (ix86_make_bounds_constant): New. (ix86_initialize_bounds): (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. I have comments from the implementation side, but IMO Jeff (CC'd) should give the final approval on the functionality and general approach of the patch. I was not able to follow the meaning and logic behind SLOT (which may be register ?), PTR, TO, and special BND addresses. Uros. + + /* Here we have the case when more than four pointers are +passed in registers. In this case we are out of bound +registers and have to use bndldx to load bound. RA, +RA - 8, etc. are used for address translation in bndldx. */ + addr = plus_constant (Pmode, arg_pointer_rtx, -(INTVAL (bnd) + 1) * 8); Magic value 8 refers to what? This code is supposed to work for 64 bit target only and 8 is a size of Pmode. I'll replace this and other cases with mode size. +} + else if (MEM_P (slot)) +{ + addr = XEXP (slot, 0); + addr = force_reg (Pmode, addr); +} + else +gcc_unreachable (); + + ptr = force_reg (Pmode, ptr); + /* If ptr was a register originally then it may have + mode other than Pmode. We need to extend in such + case because bndldx may work only with Pmode regs. */ + if (GET_MODE (ptr) != Pmode) +{ + rtx ext = gen_rtx_ZERO_EXTEND (Pmode, ptr); + ptr = gen_reg_rtx (Pmode); + emit_move_insn (ptr, ext); +} Please use ix86_zero_extend_to_Pmode instead. + reg = gen_reg_rtx (BNDmode); + emit_insn (TARGET_64BIT Check for BNDmode here, as in patch 31/x. +? gen_bnd64_ldx (reg, addr, ptr) +: gen_bnd32_ldx (reg, addr, ptr)); + + return reg; +} + +/* Store bounds BOUNDS for PTR pointer value stored in + specified ADDR. If ADDR is a register then TO identifies + which special address to use for bounds store. */ +static void +ix86_store_bounds (rtx ptr, rtx addr, rtx bounds, rtx to) +{ + if (!ptr) +{ + gcc_assert (MEM_P (addr)); + ptr = copy_to_mode_reg (Pmode, addr); copy_addr_to_reg +} + + if (!addr || REG_P (addr)) +{ + gcc_assert (CONST_INT_P (to)); + addr = plus_constant (Pmode, stack_pointer_rtx, -(INTVAL (to) + 1) * 8); What is magic number 8? Size of Pmode pointer, different between 32bit and 64bit targets? +} + else if (MEM_P (addr)) +addr = XEXP (addr, 0); + else +gcc_unreachable (); + + /* Should we also ignore integer modes of incorrect size?. */ + ptr = force_reg (Pmode, ptr); + addr = force_reg (Pmode, addr); + + /* Avoid registers which connot be used as index. */ + if (!index_register_operand (ptr, Pmode)) +{ + rtx temp = gen_reg_rtx (Pmode); + emit_move_insn (temp, ptr); + ptr = temp; +} Please merge together handling of ptr. Something like: if (ptr) { if (!index_register_operand (ptr, Pmode)) ptr = copy_addr_to_reg (ptr); } else { gcc_assert (MEM_P (addr)) ptr = copy_addr_to_reg (addr); } if (!addr || REG_P (addr) { ... } + + gcc_assert (POINTER_BOUNDS_MODE_P (GET_MODE (bounds))); + bounds = force_reg (GET_MODE (bounds), bounds); + + emit_insn (TARGET_64BIT Check BNDmode. +? gen_bnd64_stx (addr, ptr, bounds) +: gen_bnd32_stx (addr, ptr, bounds)); New version with fixes and better documentation for ix86_load_bounds and ix86_store_bounds is below. Thanks, Ilya -- 2014-09-18 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386.c: Include tree-iterator.h. (ix86_function_value_bounds): New. (ix86_builtin_mpx_function): New. (ix86_load_bounds): New. (ix86_store_bounds): New.
[Patch sh] Fixup use of constraints in define_split
Hi, After https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01615.html we error on the use of constraints in define_splits, define_expands and define_peephole2s. These are never looked at by the compiler, and so have no reason to be set. I expect there will be more fallout as Jan's auto-builder makes its way through ports I haven't tested, I'll fix those up as they come up. These are build failures, and the fixes are obvious, but I don't know my way around these ports, so I'd like an explicit maintainer ack. For testing, I've just checked that the build error is resolved. In principal, these are not functional changes as the constraints are not looked at. OK? Thanks, James --- 2014-09-19 James Greenhalgh james.greenha...@arm.com * config/sh/sh.md: Fix use of constraints in define_split. diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md index 56dee824f1c0674d51224a3e3b9be20bec7920cc..4bee5cac2ea35b12a7de4cbc3e11f0c71becf996 100644 --- a/gcc/config/sh/sh.md +++ b/gcc/config/sh/sh.md @@ -7709,10 +7709,10 @@ (define_insn movdf_i4 ;; use that will prevent scheduling of other stack accesses beyond this ;; instruction. (define_split - [(set (match_operand:DF 0 register_operand ) - (match_operand:DF 1 register_operand )) - (use (match_operand:PSI 2 fpscr_operand )) - (clobber (match_scratch:SI 3 =X))] + [(set (match_operand:DF 0 register_operand) + (match_operand:DF 1 register_operand)) + (use (match_operand:PSI 2 fpscr_operand)) + (clobber (match_scratch:SI 3))] (TARGET_SH4 || TARGET_SH2A_DOUBLE) reload_completed (true_regnum (operands[0]) 16) != (true_regnum (operands[1]) 16) [(const_int 0)]
[PATCH][AArch64] Fix PR63293
when generating instructions to access local variable, for example a local array, if the array size very big, then we need a temp reg to keep the intermediate index, then use that temp reg as base reg, so that ldr is capable of indexing the element. while this will cause trouble, because the introduce of temp reg break the dependence between the stack variable access and stack adjustment instructions which is unsafe when signal trampoline executed. this patch add barrier before stack adjustment in epilogue. ok for trunk? thanks. gcc/ * gcc/config/aarch64.c (aarch64_expand_epilogue): Add barrier before stack adjustment. -- Jiong diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 023f9fd..8eed9cf 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -2431,6 +2431,8 @@ aarch64_expand_epilogue (bool for_sibcall) if (frame_size 0) { + emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx)); + if (frame_size = 0x100) { rtx op0 = gen_rtx_REG (Pmode, IP0_REGNUM);
[C/C++ PATCH] Handle enum bit-fields for -Wswitch (PR c/61405, PR c/53874)
This patch makes -Wswitch{,-enum} work even with enum bit-fields (we got several bugreports about that). The problem is that enum e:2; has an integral type with precision 2, and c_do_switch_warnings doesn't like that: 6195 /* From here on, we only care about about enumerated types. */ 6196 if (!type || TREE_CODE (type) != ENUMERAL_TYPE) 6197 return; (we need to look at TYPE_VALUES of the type). But there's a way how to get the enum. For C, the original type is saved in ce.original_type after c_parser_expression has done its job - and it seems viable to just pass this down to c_finish_case. Though this original type can be NULL, in that case I pass the type of cs-switch_expr to c_do_switch_warnings - the same what we did before. (Setting cs-original_type didn't pan out - the code elsewhere doesn't seem to be ready to handle that.) For C++, I think easiest would be to handle enum bit-fields specially: set SWITCH_STMT_TYPE to the enum type. The rest of the code seems to cope with this well, at least I haven't seen any regressions. With this I had to tweak several places in the compiler. E.g. DECL_FUNCTION_CODE is defined as an enum bit-field, meaning that we started to warn on switches such as switch (DECL_FUNCTION_CODE (x)). Adding default case fixes this. But we also started to warn on CPP_KEYWORD and two others: case value not in enumerated type. Fixed by moving CPP_KEYWORD, CPP_TEMPLATE_ID, and CPP_NESTED_NAME_SPECIFIER into enum cpp_ttype. Not sure if this is going to hurt something else. In lto-streamer-out.c we started to warn with may be uninitialized for some reason - seems bogus. Jason, does the C++ part look sane? Joseph, does the C part look sane? Jakub, does the asan.c change look ok? Honza, does the IPA/predict/lto changes look ok? Thanks. Bootstrapped/regtested on x86_64-linux. 2014-09-19 Marek Polacek pola...@redhat.com PR c/61405 PR c/53874 gcc/ * asan.c (maybe_instrument_call): Add default case. * ipa-pure-const.c (special_builtin_state): Likewise. * predict.c (expr_expected_value_1): Likewise. * lto-streamer-out.c (write_symbol): Initialize variable. gcc/c-family/ * c-common.h (struct c_common_resword): Don't define CPP_KEYWORD, CPP_TEMPLATE_ID, and CPP_NESTED_NAME_SPECIFIER. gcc/c/ * c-parser.c (c_parser_switch_statement): Pass original type to c_finish_case. * c-tree.h (c_finish_case): Update declaration. * c-typeck.c (c_finish_case): Add TYPE parameter. Pass it conditionally to c_do_switch_warnings. gcc/cp/ * semantics.c (finish_switch_cond): Handle enum bit-fields. * tree.c (bot_manip): Add default case. gcc/testsuite/ * c-c++-common/pr53874.c: New test. * c-c++-common/pr61405.c: New test. libcpp/ * include/cpplib.h (enum cpp_ttype): Define CPP_KEYWORD, CPP_TEMPLATE_ID, and CPP_NESTED_NAME_SPECIFIER. diff --git gcc/gcc/asan.c gcc/gcc/asan.c index 2a90d86..bca95df 100644 --- gcc/gcc/asan.c +++ gcc/gcc/asan.c @@ -2024,6 +2024,8 @@ maybe_instrument_call (gimple_stmt_iterator *iter) case BUILT_IN_TRAP: /* Don't instrument these. */ return false; + default: + break; } } tree decl = builtin_decl_implicit (BUILT_IN_ASAN_HANDLE_NO_RETURN); diff --git gcc/gcc/c-family/c-common.h gcc/gcc/c-family/c-common.h index 993a97b..c9f38f3 100644 --- gcc/gcc/c-family/c-common.h +++ gcc/gcc/c-family/c-common.h @@ -327,22 +327,6 @@ struct c_common_resword /* Extra cpp_ttype values for C++. */ -/* A token type for keywords, as opposed to ordinary identifiers. */ -#define CPP_KEYWORD ((enum cpp_ttype) (N_TTYPES + 1)) - -/* A token type for template-ids. If a template-id is processed while - parsing tentatively, it is replaced with a CPP_TEMPLATE_ID token; - the value of the CPP_TEMPLATE_ID is whatever was returned by - cp_parser_template_id. */ -#define CPP_TEMPLATE_ID ((enum cpp_ttype) (CPP_KEYWORD + 1)) - -/* A token type for nested-name-specifiers. If a - nested-name-specifier is processed while parsing tentatively, it is - replaced with a CPP_NESTED_NAME_SPECIFIER token; the value of the - CPP_NESTED_NAME_SPECIFIER is whatever was returned by - cp_parser_nested_name_specifier_opt. */ -#define CPP_NESTED_NAME_SPECIFIER ((enum cpp_ttype) (CPP_TEMPLATE_ID + 1)) - /* A token type for pre-parsed C++0x decltype. */ #define CPP_DECLTYPE ((enum cpp_ttype) (CPP_NESTED_NAME_SPECIFIER + 1)) diff --git gcc/gcc/c/c-parser.c gcc/gcc/c/c-parser.c index 3f4a92b..551fd7f 100644 --- gcc/gcc/c/c-parser.c +++ gcc/gcc/c/c-parser.c @@ -5232,7 +5232,7 @@ c_parser_switch_statement (c_parser *parser) save_break = c_break_label; c_break_label = NULL_TREE; body = c_parser_c99_block_statement (parser); - c_finish_case (body); + c_finish_case (body, ce.original_type); if (c_break_label) { location_t here =
Re: [C/C++ PATCH] Handle enum bit-fields for -Wswitch (PR c/61405, PR c/53874)
On Fri, Sep 19, 2014 at 03:29:44PM +0200, Marek Polacek wrote: Jakub, does the asan.c change look ok? --- gcc/gcc/asan.c +++ gcc/gcc/asan.c @@ -2024,6 +2024,8 @@ maybe_instrument_call (gimple_stmt_iterator *iter) case BUILT_IN_TRAP: /* Don't instrument these. */ return false; + default: + break; } } tree decl = builtin_decl_implicit (BUILT_IN_ASAN_HANDLE_NO_RETURN); This hunk is fine, separately or together with the rest. Jakub
[PATCH, trivial] Clean up code: update comments and fix spacing to conform to coding style.
This fix typos and spacing in comments for several source files. Please commit this patch if OK for trunk. Index: gcc/ira-conflicts.c === --- gcc/ira-conflicts.c(revision 215391) +++ gcc/ira-conflicts.c(working copy) @@ -60,7 +60,7 @@ static IRA_INT_TYPE **conflicts; /* Record a conflict between objects OBJ1 and OBJ2. If necessary, canonicalize the conflict by recording it for lower-order subobjects - of the corresponding allocnos. */ + of the corresponding allocnos. */ static void record_object_conflict (ira_object_t obj1, ira_object_t obj2) { Index: gcc/ChangeLog === --- gcc/ChangeLog(revision 215391) +++ gcc/ChangeLog(working copy) @@ -1,3 +1,8 @@ +2014-09-19 Felix Yang felix.y...@huawei.com + +* cfgrtl.c ira.c ira-color.c ira-conflicts ira-lives.c: +Clean up code: update comments and fix spacing to conform to coding style. + 2014-09-19 James Greenhalgh james.greenha...@arm.com * doc/md.texi (Modifiers): Consistently use read/write Index: gcc/ira-color.c === --- gcc/ira-color.c(revision 215391) +++ gcc/ira-color.c(working copy) @@ -135,7 +135,7 @@ struct allocno_color_data and all its subnodes in the tree (forest) of allocno hard register nodes (see comments above). */ int hard_regs_subnodes_start; - /* The length of the previous array. */ + /* The length of the previous array. */ int hard_regs_subnodes_num; /* Records about updating allocno hard reg costs from copies. If the allocno did not get expected hard register, these records are @@ -1864,7 +1864,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p) if (best_hard_regno = 0) update_costs_from_copies (a, true, ! retry_p); ira_assert (ALLOCNO_CLASS (a) == aclass); - /* We don't need updated costs anymore: */ + /* We don't need updated costs anymore. */ ira_free_allocno_updated_costs (a); return best_hard_regno = 0; } @@ -3069,7 +3069,7 @@ color_allocnos (void) -/* Output information about the loop given by its LOOP_TREE_NODE. */ +/* Output information about the loop given by its LOOP_TREE_NODE. */ static void print_loop_title (ira_loop_tree_node_t loop_tree_node) { @@ -3205,7 +3205,7 @@ color_pass (ira_loop_tree_node_t loop_tree_node) ALLOCNO_ASSIGNED_P (subloop_allocno) = true; if (hard_regno = 0) update_costs_from_copies (subloop_allocno, true, true); -/* We don't need updated costs anymore: */ +/* We don't need updated costs anymore. */ ira_free_allocno_updated_costs (subloop_allocno); } } @@ -3249,7 +3249,7 @@ color_pass (ira_loop_tree_node_t loop_tree_node) ALLOCNO_ASSIGNED_P (subloop_allocno) = true; if (hard_regno = 0) update_costs_from_copies (subloop_allocno, true, true); - /* We don't need updated costs anymore: */ + /* We don't need updated costs anymore. */ ira_free_allocno_updated_costs (subloop_allocno); } continue; @@ -3265,7 +3265,7 @@ color_pass (ira_loop_tree_node_t loop_tree_node) ALLOCNO_ASSIGNED_P (subloop_allocno) = true; if (hard_regno = 0) update_costs_from_copies (subloop_allocno, true, true); - /* We don't need updated costs anymore: */ + /* We don't need updated costs anymore. */ ira_free_allocno_updated_costs (subloop_allocno); } } Index: gcc/ira-lives.c === --- gcc/ira-lives.c(revision 215391) +++ gcc/ira-lives.c(working copy) @@ -1323,7 +1323,7 @@ process_bb_node_lives (ira_loop_tree_node_t loop_t curr_point++; } - /* Propagate register pressure to upper loop tree nodes: */ + /* Propagate register pressure to upper loop tree nodes. */ if (loop_tree_node != ira_loop_tree_root) for (i = 0; i ira_pressure_classes_num; i++) { Index: gcc/ira.c === --- gcc/ira.c(revision 215391) +++ gcc/ira.c(working copy) @@ -1078,7 +1078,7 @@ setup_allocno_and_important_classes (void) containing a given class. If allocatable hard register set of a given class is not a subset of any corresponding set of a class from CLASSES, we use the cheapest (with load/store point of view) - class from CLASSES whose set intersects with given class set */ + class from CLASSES whose set intersects with given class set. */ static void setup_class_translate_array (enum reg_class *class_translate, int classes_num, enum reg_class *classes) @@ -1774,7 +1774,7 @@ void ira_setup_alts (rtx_insn *insn, HARD_REG_SET alts) { /* MAP nalt * nop - start of constraints for given operand and - alternative */ +
Re: [patch] avoid ICE due to NULL pointer dereference in ipa-comdats.c
Jeff Law wrote: On 09/17/14 13:59, Sebastian Pop wrote: Hi, I got an ICE while building libstdc++ of a cross compiler x86 to aarch64. I have a testcase that ICEs on current GCC trunk. I was trying to painfully reduce it with creduce, and it is still several thousand lines of c++. Frustrated that it does not reduce anymore, I decided to have a look with gdb at why the compiler was iceing: the code dereferences a NULL pointer that we get by looking up the value of a symbol in a map. Around that place, there is another pattern that first makes sure that the pointer we get from the map is non NULL: I copied that code around and that seemed to have solved the ICE. Regtested on x86-64-linux, and also checked that my aarch64 cross compilers are now building correctly libstdc++. Ok to commit? Well, the obvious question is whether or not we ought to be able to lookup the symbol in that particular map. You don't mention the other similar pattern of code which checks for a NULL return value from map.get, so I had to guess it was this one just above the one you want to change: Right. /* Get current lattice value of SYMBOL. */ val = map.get (symbol); if (val) group = *val; /* If it is bottom, there is nothing to do; do not clear AUX so we won't re-queue the symbol. */ if (group == error_mark_node) continue; I couldn't convince myself this code was correct if map.get returns a NULL pointer either! It certainly isn't obvious if just continuing either loop is correct if the symbol isn't found. I really think Jan needs to chime in here. If there is need for a testcase, I will open a bug report and attach the preprocessed file that causes the ICE. As I said it is still a pretty large file, and creduce was not able to reduce it further. Sebastian
Re: [patch] avoid ICE due to NULL pointer dereference in ipa-comdats.c
Jeff Law wrote: On 09/17/14 13:59, Sebastian Pop wrote: Hi, I got an ICE while building libstdc++ of a cross compiler x86 to aarch64. I have a testcase that ICEs on current GCC trunk. I was trying to painfully reduce it with creduce, and it is still several thousand lines of c++. Frustrated that it does not reduce anymore, I decided to have a look with gdb at why the compiler was iceing: the code dereferences a NULL pointer that we get by looking up the value of a symbol in a map. Around that place, there is another pattern that first makes sure that the pointer we get from the map is non NULL: I copied that code around and that seemed to have solved the ICE. Regtested on x86-64-linux, and also checked that my aarch64 cross compilers are now building correctly libstdc++. Ok to commit? Well, the obvious question is whether or not we ought to be able to lookup the symbol in that particular map. You don't mention the other similar pattern of code which checks for a NULL return value from map.get, so I had to guess it was this one just above the one you want to change: Right. /* Get current lattice value of SYMBOL. */ val = map.get (symbol); if (val) group = *val; /* If it is bottom, there is nothing to do; do not clear AUX so we won't re-queue the symbol. */ if (group == error_mark_node) continue; I couldn't convince myself this code was correct if map.get returns a NULL pointer either! It certainly isn't obvious if just continuing either loop is correct if the symbol isn't found. I really think Jan needs to chime in here. If there is need for a testcase, I will open a bug report and attach the preprocessed file that causes the ICE. As I said it is still a pretty large file, and creduce was not able to reduce it further. I am bit in hurry, so just quickly (will give it second look at evening). The pass is a simple dataflow propagating preferred sections. NULL means that the symbol in question was not visited yet. Assuming that all symbols are reachable, this should not happen by the end of dataflow process (well, thining about it, perhaps used attribute can trigger it, but it ought to be handled earlier). So I would like to know why the symbols survives remove_unreachable_functions... Honza Sebastian
RE: [PATCH][AArch64] Fix PR63293
Jiong Wang wrote: when generating instructions to access local variable, for example a local array, if the array size very big, then we need a temp reg to keep the intermediate index, then use that temp reg as base reg, so that ldr is capable of indexing the element. while this will cause trouble, because the introduce of temp reg break the dependence between the stack variable access and stack adjustment instructions which is unsafe when signal trampoline executed. this patch add barrier before stack adjustment in epilogue. ok for trunk? I believe you need more barriers. Ie. for all SP modifying instructions (including ldp with writeback) except for ones that just remove the outgoing arguments. You can avoid emitting barriers if alloca is not used and there are no locals in the frame (common case). Basically without that any memory access that may alias with the locals could be scheduled incorrectly. It seems odd that the scheduler does not understand this by default. Wilco
Re: [Patch sh] Fixup use of constraints in define_split
On 09/19/14 06:59, James Greenhalgh wrote: Hi, After https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01615.html we error on the use of constraints in define_splits, define_expands and define_peephole2s. These are never looked at by the compiler, and so have no reason to be set. Right. I expect there will be more fallout as Jan's auto-builder makes its way through ports I haven't tested, I'll fix those up as they come up. Yes. These are build failures, and the fixes are obvious, but I don't know my way around these ports, so I'd like an explicit maintainer ack. No problem, someone should be able to ack these very quickly since they don't affect correctness/functionality at all. For testing, I've just checked that the build error is resolved. In principal, these are not functional changes as the constraints are not looked at. OK? Thanks, James --- 2014-09-19 James Greenhalgh james.greenha...@arm.com * config/sh/sh.md: Fix use of constraints in define_split. Ok. jeff
[PATCH] Fix PR61998
This patch fixes ipa/61998, where simple testcases would crash when using -Wsuggest-final-types, by bailing out early when odr_types_ptr is NULL in ipa_devirt(). Approved by Honza on IRC. Boostrapped and tested on x86_64-unknown-linux-gnu, applied. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c2200fcdc426..0dc7b3325922 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2014-09-19 Markus Trippelsdorf mar...@trippelsdorf.de + + PR ipa/61998 + * ipa-devirt.c (ipa_devirt): Bail out if odr_types_ptr is NULL. + 2014-09-19 James Greenhalgh james.greenha...@arm.com * doc/md.texi (Modifiers): Consistently use read/write diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c index 0a11eb72085f..61e87e80240c 100644 --- a/gcc/ipa-devirt.c +++ b/gcc/ipa-devirt.c @@ -3952,6 +3952,9 @@ ipa_devirt (void) int nmultiple = 0, noverwritable = 0, ndevirtualized = 0, nnotdefined = 0; int nwrong = 0, nok = 0, nexternal = 0, nartificial = 0; + if (!odr_types_ptr) +return 0; + /* We can output -Wsuggest-final-methods and -Wsuggest-final-types warnings. This is implemented by setting up final_warning_records that are updated by get_polymorphic_call_targets. diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index b0f2bc80696e..747aad8b6a30 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-09-19 Markus Trippelsdorf mar...@trippelsdorf.de + + PR ipa/61998 + * g++.dg/warn/Wsuggest-final-2.C: New testcase. + 2014-09-19 Joost VandeVondele vond...@gcc.gnu.org PR fortran/63152 diff --git a/gcc/testsuite/g++.dg/warn/Wsuggest-final-2.C b/gcc/testsuite/g++.dg/warn/Wsuggest-final-2.C new file mode 100644 index ..51e466d59404 --- /dev/null +++ b/gcc/testsuite/g++.dg/warn/Wsuggest-final-2.C @@ -0,0 +1,4 @@ +// PR ipa/61998 +// { dg-do compile } +// { dg-options -O2 -Wsuggest-final-types } +int main () {} -- Markus
Re: [Patch sh] Fixup use of constraints in define_split
On 09/19/2014 02:59 PM, James Greenhalgh wrote: Hi, After https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01615.html we error on the use of constraints in define_splits, define_expands and define_peephole2s. These are never looked at by the compiler, and so have no reason to be set. I expect there will be more fallout as Jan's auto-builder makes its way through ports I haven't tested, I'll fix those up as they come up. These are build failures, and the fixes are obvious, but I don't know my way around these ports, so I'd like an explicit maintainer ack. For testing, I've just checked that the build error is resolved. In principal, these are not functional changes as the constraints are not looked at. S/390 bootstrap fails with: gcc/gcc/config/s390/s390.md:8397: operand 5 missing output reload For the branch on index instruction we have a define_insn_and_split with a single alternative forcing a split before reload. In this pattern to my understanding the constraints are not really required. However, for the new checker enhancements I would apply the following to avoid the error: diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 73ac0dc..06aaced 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -8405,7 +8405,7 @@ (pc))) (set (match_operand:GPR 4 nonimmediate_operand ) (plus:GPR (match_dup 1) (match_dup 2))) - (clobber (match_scratch:GPR 5 ))] + (clobber (match_scratch:GPR 5 =d))] TARGET_CPU_ZARCH # !reload_completed !reload_in_progress -Andreas-
Re: [C/C++ PATCH] Handle enum bit-fields for -Wswitch (PR c/61405, PR c/53874)
On 09/19/2014 09:29 AM, Marek Polacek wrote: - orig_type = TREE_TYPE (cond); + /* Handle enum bit-fields. */ + tree field; + if (TREE_CODE (cond) == COMPONENT_REF + (field = TREE_OPERAND (cond, 1)) + DECL_BIT_FIELD_TYPE (field) + TREE_CODE (DECL_BIT_FIELD_TYPE (field)) == ENUMERAL_TYPE) + orig_type = DECL_BIT_FIELD_TYPE (field); + else + orig_type = TREE_TYPE (cond); You want unlowered_expr_type here. Jason
libiberty patch committed: Fix writing ELF files with many sections
This patch to libiberty fixes writing out ELF files that have a large number of sections (more than 0xFF00). We already handled reading those files correctly; this fixes writing them, for absurdly large programs when using LTO. Bootstrapped and ran testsuite on x86_64-unknown-linux-gnu. This patch was tested on an absurdly large program by Honza. Committed to mainline. Ian 2014-09-19 Ian Lance Taylor i...@google.com * simple-object-elf.c (simple_object_elf_write_ehdr): Correctly handle objects with more than SHN_LORESERVE sections. (simple_object_elf_write_shdr): Add sh_link parameter. (simple_object_elf_write_to_file): Correctly handle objects with more than SHN_LORESERVE sections. Index: simple-object-elf.c === --- simple-object-elf.c (revision 215393) +++ simple-object-elf.c (working copy) @@ -698,6 +698,7 @@ simple_object_elf_write_ehdr (simple_obj unsigned char buf[sizeof (Elf64_External_Ehdr)]; simple_object_write_section *section; unsigned int shnum; + unsigned int shstrndx; fns = attrs-type_functions; cl = attrs-ei_class; @@ -743,9 +744,17 @@ simple_object_elf_write_ehdr (simple_obj (cl == ELFCLASS32 ? sizeof (Elf32_External_Shdr) : sizeof (Elf64_External_Shdr))); - ELF_SET_FIELD (fns, cl, Ehdr, buf, e_shnum, Elf_Half, shnum); - ELF_SET_FIELD (fns, cl, Ehdr, buf, e_shstrndx, Elf_Half, - shnum == 0 ? 0 : shnum - 1); + ELF_SET_FIELD (fns, cl, Ehdr, buf, e_shnum, Elf_Half, + shnum = SHN_LORESERVE ? 0 : shnum); + if (shnum == 0) +shstrndx = 0; + else +{ + shstrndx = shnum - 1; + if (shstrndx = SHN_LORESERVE) + shstrndx = SHN_XINDEX; +} + ELF_SET_FIELD (fns, cl, Ehdr, buf, e_shstrndx, Elf_Half, shstrndx); return simple_object_internal_write (descriptor, 0, buf, ehdr_size, errmsg, err); @@ -758,8 +767,8 @@ simple_object_elf_write_shdr (simple_obj off_t offset, unsigned int sh_name, unsigned int sh_type, unsigned int sh_flags, unsigned int sh_offset, unsigned int sh_size, - unsigned int sh_addralign, const char **errmsg, - int *err) + unsigned int sh_link, unsigned int sh_addralign, + const char **errmsg, int *err) { struct simple_object_elf_attributes *attrs = (struct simple_object_elf_attributes *) sobj-data; @@ -781,7 +790,7 @@ simple_object_elf_write_shdr (simple_obj ELF_SET_FIELD (fns, cl, Shdr, buf, sh_flags, Elf_Addr, sh_flags); ELF_SET_FIELD (fns, cl, Shdr, buf, sh_offset, Elf_Addr, sh_offset); ELF_SET_FIELD (fns, cl, Shdr, buf, sh_size, Elf_Addr, sh_size); - /* sh_link left as zero. */ + ELF_SET_FIELD (fns, cl, Shdr, buf, sh_link, Elf_Word, sh_link); /* sh_info left as zero. */ ELF_SET_FIELD (fns, cl, Shdr, buf, sh_addralign, Elf_Addr, sh_addralign); /* sh_entsize left as zero. */ @@ -812,6 +821,8 @@ simple_object_elf_write_to_file (simple_ unsigned int shnum; size_t shdr_offset; size_t sh_offset; + unsigned int first_sh_size; + unsigned int first_sh_link; size_t sh_name; unsigned char zero; @@ -842,8 +853,17 @@ simple_object_elf_write_to_file (simple_ shdr_offset = ehdr_size; sh_offset = shdr_offset + shnum * shdr_size; + if (shnum SHN_LORESERVE) +first_sh_size = 0; + else +first_sh_size = shnum; + if (shnum - 1 SHN_LORESERVE) +first_sh_link = 0; + else +first_sh_link = shnum - 1; if (!simple_object_elf_write_shdr (sobj, descriptor, shdr_offset, - 0, 0, 0, 0, 0, 0, errmsg, err)) + 0, 0, 0, 0, first_sh_size, first_sh_link, + 0, errmsg, err)) return errmsg; shdr_offset += shdr_size; @@ -887,7 +907,7 @@ simple_object_elf_write_to_file (simple_ if (!simple_object_elf_write_shdr (sobj, descriptor, shdr_offset, sh_name, SHT_PROGBITS, 0, sh_offset, - sh_size, 1U section-align, + sh_size, 0, 1U section-align, errmsg, err)) return errmsg; @@ -898,7 +918,7 @@ simple_object_elf_write_to_file (simple_ if (!simple_object_elf_write_shdr (sobj, descriptor, shdr_offset, sh_name, SHT_STRTAB, 0, sh_offset, - sh_name + strlen (.shstrtab) + 1, + sh_name + strlen (.shstrtab) + 1, 0, 1, errmsg, err)) return errmsg;
Re: [patch] avoid ICE due to NULL pointer dereference in ipa-comdats.c
On 09/19/14 08:11, Sebastian Pop wrote: Jeff Law wrote: On 09/17/14 13:59, Sebastian Pop wrote: Hi, I got an ICE while building libstdc++ of a cross compiler x86 to aarch64. I have a testcase that ICEs on current GCC trunk. I was trying to painfully reduce it with creduce, and it is still several thousand lines of c++. Frustrated that it does not reduce anymore, I decided to have a look with gdb at why the compiler was iceing: the code dereferences a NULL pointer that we get by looking up the value of a symbol in a map. Around that place, there is another pattern that first makes sure that the pointer we get from the map is non NULL: I copied that code around and that seemed to have solved the ICE. Regtested on x86-64-linux, and also checked that my aarch64 cross compilers are now building correctly libstdc++. Ok to commit? Well, the obvious question is whether or not we ought to be able to lookup the symbol in that particular map. You don't mention the other similar pattern of code which checks for a NULL return value from map.get, so I had to guess it was this one just above the one you want to change: Right. /* Get current lattice value of SYMBOL. */ val = map.get (symbol); if (val) group = *val; /* If it is bottom, there is nothing to do; do not clear AUX so we won't re-queue the symbol. */ if (group == error_mark_node) continue; I couldn't convince myself this code was correct if map.get returns a NULL pointer either! It certainly isn't obvious if just continuing either loop is correct if the symbol isn't found. I really think Jan needs to chime in here. If there is need for a testcase, I will open a bug report and attach the preprocessed file that causes the ICE. As I said it is still a pretty large file, and creduce was not able to reduce it further. Certainly attach it to a BZ -- from Jan's message it sounds like a NULL in that spot shouldn't be happening. So I suspect he'll want to take a deeper look. jeff
Re: [patch] avoid ICE due to NULL pointer dereference in ipa-comdats.c
Jan Hubicka wrote: I am bit in hurry, so just quickly (will give it second look at evening). The pass is a simple dataflow propagating preferred sections. NULL means that the symbol in question was not visited yet. Assuming that all symbols are reachable, this should not happen by the end of dataflow process (well, thining about it, perhaps used attribute can trigger it, but it ought to be handled earlier). So I would like to know why the symbols survives remove_unreachable_functions... https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63313 Let me know if you cannot reproduce the ICE. Thanks, Sebastian
Re: [PATCH, rs6000] Rename GCC version in warning messages
On Thu, Sep 18, 2014 at 12:50 PM, Ulrich Weigand uweig...@de.ibm.com wrote: Hello, the ABI warning messages I introduced in recent patches refer to a GCC version 4.10. As GCC has since adopted a new version naming scheme, this patch updates those messages to refer to GCC 5 instead. Tested on powerpc64le-linux. OK for mainline? Bye, Ulrich ChangeLog: * config/rs6000/rs6000.c (rs6000_special_adjust_field_align_p): Update GCC version name to GCC 5. (rs6000_function_arg_boundary): Likewise. (rs6000_function_arg): Likewise. Okay. Thanks, David
[debug-early] Allow checking of DECL_ABSTRACT in decl_ultimate_origin
Michael, I really don't understand the need for this change in your original patch. I don't know if this was a temporary testing change or what. For instance, you had the following commented out in your original patch: if (/*DECL_ABSTRACT (decl) */ DECL_ABSTRACT_ORIGIN (decl) == decl) return NULL_TREE; but then you also had: tree origin = decl_ultimate_origin (decl); snip snip - if (origin != NULL) + if (origin != NULL origin != decl) It seems to me that if origin is not null, then origin != decl will always be true, since with your change, decl_ultimate_origin() will always return NULL when DECL_ABSTRACT_ORIGIN (decl) == decl. I suppose your /*DECL_ABSTRACT (decl) */ change could be used in other uses of decl_ultimate_origin() that don't have this origin != decl business. Anyway... I don't understand the need for it (actually, to be honest, I don't understand what you were trying to do). I'm going to remove it. If you feel strongly about it, please chime in, and perhaps provide an explanation so I can include a comment. I'm happy to report that with this and the last set of patches, both C and C++ guality tests have = regressions than mainline. Yay. Committing to mainline. Thanks. Aldy commit 7254ec0893f9b56a0ba78de9eab8895c9582c24b Author: Aldy Hernandez al...@redhat.com Date: Thu Sep 18 16:54:39 2014 -0600 * dwarf2out.c (decl_ultimate_origin): Allow checking of DECL_ABSTRACT. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index f6c7f4a..68b4650 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -3683,7 +3683,7 @@ decl_ultimate_origin (const_tree decl) /* output_inline_function sets DECL_ABSTRACT_ORIGIN for all the nodes in the function to point to themselves; ignore that if we're trying to output the abstract instance of this function. */ - if (/*DECL_ABSTRACT (decl) */ DECL_ABSTRACT_ORIGIN (decl) == decl) + if (DECL_ABSTRACT (decl) DECL_ABSTRACT_ORIGIN (decl) == decl) return NULL_TREE; /* Since the DECL_ABSTRACT_ORIGIN for a DECL is supposed to be the
[C++ Patch/RFC] PR 62219
Hi, I had a quick look at this issue and it immediately reminded me c++/60605, which Jason fixed by handling local functions at the beginning of check_default_tmpl_args. In the present case of: template class = void struct S { friend void foo( S ) { [](){}; } }; it occurs to me that we may want to simply handle the operator() of the lambda in the same way?!? Anyway, the below passes testing... Thanks! Paolo. / Index: cp/pt.c === --- cp/pt.c (revision 215393) +++ cp/pt.c (working copy) @@ -4450,7 +4450,8 @@ check_default_tmpl_args (tree decl, tree parms, bo class template. */ if (TREE_CODE (CP_DECL_CONTEXT (decl)) == FUNCTION_DECL - || (TREE_CODE (decl) == FUNCTION_DECL DECL_LOCAL_FUNCTION_P (decl))) + || (TREE_CODE (decl) == FUNCTION_DECL + (DECL_LOCAL_FUNCTION_P (decl) || LAMBDA_FUNCTION_P (decl /* You can't have a function template declaration in a local scope, nor you can you define a member of a class template in a local scope. */ Index: testsuite/g++.dg/cpp0x/lambda/lambda-template14.C === --- testsuite/g++.dg/cpp0x/lambda/lambda-template14.C (revision 0) +++ testsuite/g++.dg/cpp0x/lambda/lambda-template14.C (working copy) @@ -0,0 +1,11 @@ +// PR c++/62219 +// { dg-do compile { target c++11 } } + +template class = void +struct S +{ + friend void foo( S ) + { +[](){}; + } +};
Re: [PATCH, rs6000] Improve power8 fusion peepholes
On Wed, Sep 17, 2014 at 12:47 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: This patch is an intermediate step of what I want to do to improve power8 fusion. In the current trunk, the fusion support for gpr loads is done by a peephole2 to find the addis followed by the load instruction where the only consumer of the addis instruction is the load, and it rewrites the addis to use the register that will be loaded, and emits the two separate instructions. There is then a normal peephole that recognizes the addis/load combination, and makes sure they are emited together, along with a comment, to make tracking of the fusion attempts easier. The problem is things like the second scheduler pass will move things around, and often times move the addis away from the load. This means the normal peephole pass won't see the two instructions. This patch creates a new insn that combines the two parts, so that the scheduler2 pass won't split up the two insns. In doing static analysis, a lot more fused pairs are generated. For instance, 400.perlbench generates more than 11,300 more load fusion with these patches, 403.gcc generates 23,000 more load fusions, and 416.gamess generates 39,000 more load fusions. However, when spec 2006 is run on a power8, you don't actually see much of a performance difference with these patches. In digging into it, the main place where fusion occurs is in referencing static/global variables. The spec 2006 suite does not tend to have that much static/global data, so the linker optimizes most of the addis instructions to be nops, and the load index register is adjusted to use r2. These optimizations should help much larger code bases that do have a lot more static/global data. I've done bootstraps on both a big endian power7 and a little endian power8 with no regressions. Are these patches ok to install in the trunk, and the 4.8/4.9 branches? 2014-09-16 Michael Meissner meiss...@linux.vnet.ibm.com * config/rs6000/predicates.md (fusion_gpr_mem_load): Move testing for base_reg_operand to be common between LO_SUM and PLUS. (fusion_gpr_mem_combo): New predicate to match a fused address that combines the addis and memory offset address. * config/rs6000/rs6000-protos.h (fusion_gpr_load_p): Change calling signature. (emit_fusion_gpr_load): Likewise. * config/rs6000/rs6000.c (fusion_gpr_load_p): Change calling signature to pass each argument separately, rather than using an operands array. Rewrite the insns found by peephole2 to be a single insn, rather than hoping the insns will still be together when the peephole pass is done. Drop being called via a normal peephole. (emit_fusion_gpr_load): Change calling signature to be called from the fusion_gpr_load_mode insns with a combined memory address instead of the peephole pass passing the addis and offset separately. * config/rs6000/rs6000.md (UNSPEC_FUSION_GPR): New unspec for GPR fusion. (power8 fusion peephole): Drop support for doing power8 via a normal peephole that was created by the peephole2 pass. (power8 fusion peephole2): Create a new insn with the fused address, so that the fused operation is kept together after register allocation is done. (fusion_gpr_load_mode): Likewise. Okay. Thanks, David
[PATCH] PR LTO/63270 - new test
Hello. The following patch adds a test for PR/63270. The test has been tested before and with r215328, where the bug was fixed. Thank you, Martin gcc/testsuite/ChangeLog: 2014-09-19 Martin Liska mli...@suse.cz PR LTO/63270 * g++.dg/lto/pr63270_0.C: New test. * g++.dg/lto/pr63270_1.C: New test. diff --git a/gcc/testsuite/g++.dg/lto/pr63270_0.C b/gcc/testsuite/g++.dg/lto/pr63270_0.C new file mode 100644 index 000..98f2735 --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr63270_0.C @@ -0,0 +1,72 @@ +// { dg-lto-do link } +// { dg-lto-options {{-flto -O2 -Wno-odr}} } +typedef unsigned long uintptr_t; +namespace v8 { +class Extension; +namespace internal { +class A { +public: + A(int); +}; +class B { +public: + B(int); +}; +class Scanner; +class FuncNameInferrer; +template typename Traits class ParserBase : Traits { + class FunctionState; + bool parenthesized_function_; + typename Traits::Type::Scope *scope_; + FunctionState *function_state_; + v8::Extension *extension_; + FuncNameInferrer *fni_; + Scanner *scanner_; + uintptr_t stack_limit_; + bool stack_overflow_; + bool allow_lazy_; + bool allow_natives_syntax_; + bool allow_generators_; + bool allow_for_of_; + typename Traits::Type::Zone *zone_; +}; +class PreParserScope; +class F; +class PreParserTraits { +public: + struct Type { +typedef PreParserScope Scope; +typedef void Zone; + }; + +private: + F *pre_parser_; +}; +class F : ParserBasePreParserTraits {}; +class C { +public: + struct Type { +typedef v8::internal::FuncNameInferrer Scope; +typedef int Zone; + }; +}; +class G : ParserBaseC { +public: + static int m_fn1(); + F reusable_preparser_; +}; +class D { +public: + D(int) : function_(0), context_(0), nested_scope_chain_(0) { G::m_fn1(); } + B function_; + B context_; + A nested_scope_chain_; +}; +void fn1() { D(0); } +} +} + +int main() +{ +return 0; +} diff --git a/gcc/testsuite/g++.dg/lto/pr63270_1.C b/gcc/testsuite/g++.dg/lto/pr63270_1.C new file mode 100644 index 000..a842e5c --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr63270_1.C @@ -0,0 +1,53 @@ +typedef unsigned long uintptr_t; +namespace v8 +{ + +int kPointerSize = 0; + + class Extension; + namespace internal + { +class Token; +class Scanner; +int kCodeOffset = 0; +int kOptimizedCodeMapOffset = 0; +int kScopeInfoOffset = 0; + +class FuncNameInferrer; + template typename Traits class ParserBase:Traits +{ + class FunctionState; + bool parenthesized_function_; + typename Traits::Type::Scope * scope_; + FunctionState *function_state_; +v8::Extension * extension_; + FuncNameInferrer *fni_; + Scanner *scanner_; + uintptr_t stack_limit_; + bool stack_overflow_; + bool allow_lazy_; + bool allow_natives_syntax_; + bool allow_generators_; + bool allow_for_of_; + typename Traits::Type::Zone * zone_; +}; +class PreParserScope; +class PreParser; +class PreParserTraits +{ +public:struct Type + { + typedef PreParserScope Scope; + typedef void Zone; + }; + PreParser *pre_parser_; +}; +class PreParser:ParserBase PreParserTraits +{ + int ParseMemberWithNewPrefixesExpression ( bool * ); +}; +int PreParser::ParseMemberWithNewPrefixesExpression ( bool * ) +{ +} + } +}
Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge
On 09/19/14 05:27, Jiong Wang wrote: On 19/09/14 06:51, Jeff Law wrote: On 09/16/14 00:55, Zhenqiang Chen wrote: -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Jiong Wang Sent: Monday, September 15, 2014 11:28 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org; Jeff Law Subject: Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge On 08/05/14 09:07, Zhenqiang Chen wrote: static bool move_insn_for_shrink_wrap (basic_block bb, rtx insn, const HARD_REG_SET uses, const HARD_REG_SET defs, - HARD_REG_SET *last_uses) + HARD_REG_SET *last_uses, + bool *split_p) { rtx set, src, dest; bitmap live_out, live_in, bb_uses, bb_defs; unsigned int i, dregno, end_dregno, sregno, end_sregno; basic_block next_block; + edge live_edge; /* Look for a simple register copy. */ set = single_set (insn); @@ -5582,17 +5589,31 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn, || overlaps_hard_reg_set_p (defs, GET_MODE (dest), dregno)) return false; - /* See whether there is a successor block to which we could move INSN. */ - next_block = next_block_for_reg (bb, dregno, end_dregno); - if (!next_block) + live_edge = next_block_for_reg (bb, dregno, end_dregno); if + (!live_edge) return false; + next_block = live_edge-dest; + /* If the destination register is referred in later insn, try to forward it. */ if (overlaps_hard_reg_set_p (*last_uses, GET_MODE (dest), dregno) !try_copy_prop (bb, insn, src, dest, last_uses)) return false; + /* Create a new basic block on the edge. */ if (EDGE_COUNT + (next_block-preds) == 2) +{ + next_block = split_edge (live_edge); + + bitmap_copy (df_get_live_in (next_block), df_get_live_out + (bb)); (re-sent, looks like the first send not received by the server...) for the function _IO_wdefault_xsputn in glibc wgenops.c (.dot file attached) 174: NOTE_INSN_BASIC_BLOCK 21 are the new created basic block because of the sink of move instruction ax:DI = 0 and split edge. but the live_in of this BB is copied from live_out of BB 2 which is too big, and actually prevent the later sink of 16: r12:DI=dx:DI. Should it be better to copy live_in from next_block-next_bb instead of live_out from bb, as it will model what's needed more accurately? According to the algorithm, next_block-next_bb (which is live_edge-dest) should have two predecessors. Live_in of next_block-next_bb would include live_out of the other edge, which is not necessary accurately. To be more accurate, you may need an intersection of df_get_live_out (bb) and df_get_live_in (next_block-next_bb). Presumably we can't recompute the liveness info here as it's too expensive to do that each time we split an edge to sink a statement? ie, we need the more accurate liveness within this pass so that we can sink further instructions? for a single case, x86 bootstrap, looks like these code are not compile time critical as for all 4762 live edges which could sink instructions 4139: EDGE_COUNT (next_block-preds) == 1 270: EDGE_COUNT (next_block-preds) == 2 353: EDGE_COUNT (next_block-preds) 2 and if we make the live info more accurate here, just very few additional functions ( 10) shrink-wrapped from glibc build gcc bootstrap test. Perhaps I wasn't clear. The whole point behind the proposed changes is to enable additional sinking that is currently blocked because of the overly large set of live objects. So my question was an attempt to understand why we didn't just a normal liveness update -- I speculated that we aren't doing a normal liveness update because of the compile-time cost. So, is it OK we simply change bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb)); into bitmap_and (df_get_live_in (next_block), df_get_live_out (bb), df_get_live_in (next_block-next_bb)); ? Probably. Though I'd be a bit concerned with next_block-next_bb. Wouldn't it be safer to stash away the relevant basic block prior to the call to split_edge, then use that saved copy. Something like this (untested): basic_block old_dest = live_edge-dest; next_block = split_edge (live_edge); /* We create a new basic block. Call df_grow_bb_info to make sure all data structures are allocated. */ df_grow_bb_info (df_live); bitmap_and (df_get_live_in (next_block), df_get_live_out (bb), df_get_live_in (old_dest)); The idea being we don't want to depend on the precise ordering blocks in the block chain. Could you try that and see if it does what you need? jeff
Re: [PATCH 0/5] Fix handling of word subregs of wide registers
On 09/19/14 01:23, Richard Sandiford wrote: Jeff Law l...@redhat.com writes: On 09/18/14 04:07, Richard Sandiford wrote: This series is a cleaned-up version of: https://gcc.gnu.org/ml/gcc/2014-03/msg00163.html The underlying problem is that the semantics of subregs depend on the word size. You can't have a subreg for byte 2 of a 4-byte word, say, but you can have a subreg for word 2 of a 4-word value (as well as lowpart subregs of that word, etc.). This causes problems when an architecture has wider-than-word registers, since the addressability of a word can then depend on which register class is used. The register allocators need to fix up cases where a subreg turns out to be invalid for a particular class. This is really an extension of what we need to do for CANNOT_CHANGE_MODE_CLASS. Tested on x86_64-linux-gnu, powerpc64-linux-gnu and aarch64_be-elf. I thought we fixed these problems long ago with the change to subreg_byte?!? No, that was fixing something else. (I'm just about old enough to remember that too!) The problem here is that (say): (subreg:SI (reg:DI X) 4) is independently addressable on little-endian AArch32 if X assigned to a GPR, but not if X is assigned to a vector register. We need to allow these kinds of subreg on pseudos in order to decompose multiword arithmetic. It's then up to the RA to realise that a reload would be needed if X were assigned to a vector register, since the upper half of a vector register cannot be independently accessed. Note that you could write this example even with the old word-style offsets and IIRC the effect would have been the same. Maybe I've got some kind of mental block here. I'll go back and read the manual first and see if that helps :-) jeff
Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge
On 19/09/14 16:49, Jeff Law wrote: On 09/19/14 05:27, Jiong Wang wrote: On 19/09/14 06:51, Jeff Law wrote: On 09/16/14 00:55, Zhenqiang Chen wrote: -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Jiong Wang Sent: Monday, September 15, 2014 11:28 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org; Jeff Law Subject: Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge On 08/05/14 09:07, Zhenqiang Chen wrote: static bool move_insn_for_shrink_wrap (basic_block bb, rtx insn, const HARD_REG_SET uses, const HARD_REG_SET defs, - HARD_REG_SET *last_uses) + HARD_REG_SET *last_uses, + bool *split_p) { rtx set, src, dest; bitmap live_out, live_in, bb_uses, bb_defs; unsigned int i, dregno, end_dregno, sregno, end_sregno; basic_block next_block; + edge live_edge; /* Look for a simple register copy. */ set = single_set (insn); @@ -5582,17 +5589,31 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn, || overlaps_hard_reg_set_p (defs, GET_MODE (dest), dregno)) return false; - /* See whether there is a successor block to which we could move INSN. */ - next_block = next_block_for_reg (bb, dregno, end_dregno); - if (!next_block) + live_edge = next_block_for_reg (bb, dregno, end_dregno); if + (!live_edge) return false; + next_block = live_edge-dest; + /* If the destination register is referred in later insn, try to forward it. */ if (overlaps_hard_reg_set_p (*last_uses, GET_MODE (dest), dregno) !try_copy_prop (bb, insn, src, dest, last_uses)) return false; + /* Create a new basic block on the edge. */ if (EDGE_COUNT + (next_block-preds) == 2) +{ + next_block = split_edge (live_edge); + + bitmap_copy (df_get_live_in (next_block), df_get_live_out + (bb)); (re-sent, looks like the first send not received by the server...) for the function _IO_wdefault_xsputn in glibc wgenops.c (.dot file attached) 174: NOTE_INSN_BASIC_BLOCK 21 are the new created basic block because of the sink of move instruction ax:DI = 0 and split edge. but the live_in of this BB is copied from live_out of BB 2 which is too big, and actually prevent the later sink of 16: r12:DI=dx:DI. Should it be better to copy live_in from next_block-next_bb instead of live_out from bb, as it will model what's needed more accurately? According to the algorithm, next_block-next_bb (which is live_edge-dest) should have two predecessors. Live_in of next_block-next_bb would include live_out of the other edge, which is not necessary accurately. To be more accurate, you may need an intersection of df_get_live_out (bb) and df_get_live_in (next_block-next_bb). Presumably we can't recompute the liveness info here as it's too expensive to do that each time we split an edge to sink a statement? ie, we need the more accurate liveness within this pass so that we can sink further instructions? for a single case, x86 bootstrap, looks like these code are not compile time critical as for all 4762 live edges which could sink instructions 4139: EDGE_COUNT (next_block-preds) == 1 270: EDGE_COUNT (next_block-preds) == 2 353: EDGE_COUNT (next_block-preds) 2 and if we make the live info more accurate here, just very few additional functions ( 10) shrink-wrapped from glibc build gcc bootstrap test. Perhaps I wasn't clear. The whole point behind the proposed changes is to enable additional sinking that is currently blocked because of the overly large set of live objects. So my question was an attempt to understand why we didn't just a normal liveness update -- I speculated that we aren't doing a normal liveness update because of the compile-time cost. So, is it OK we simply change bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb)); into bitmap_and (df_get_live_in (next_block), df_get_live_out (bb), df_get_live_in (next_block-next_bb)); ? Probably. Though I'd be a bit concerned with next_block-next_bb. Wouldn't it be safer to stash away the relevant basic block prior to the call to split_edge, then use that saved copy. Something like this (untested): basic_block old_dest = live_edge-dest; next_block = split_edge (live_edge); /* We create a new basic block. Call df_grow_bb_info to make sure all data structures are allocated. */ df_grow_bb_info (df_live); bitmap_and (df_get_live_in (next_block), df_get_live_out (bb), df_get_live_in (old_dest)); The idea being we don't want to depend on the precise ordering blocks in the block chain. Could you try that and see if it does what you need? Jeff, Thanks, verified, it works. Regards, Jiong jeff
Re: [Patch] Teach genrecog/genoutput that scratch registers require write constraint modifiers
On Fri, Sep 19, 2014 at 04:10:34PM +0100, Andreas Krebbel wrote: On 09/19/2014 02:59 PM, James Greenhalgh wrote: Hi, After https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01615.html we error on the use of constraints in define_splits, define_expands and define_peephole2s. These are never looked at by the compiler, and so have no reason to be set. I expect there will be more fallout as Jan's auto-builder makes its way through ports I haven't tested, I'll fix those up as they come up. These are build failures, and the fixes are obvious, but I don't know my way around these ports, so I'd like an explicit maintainer ack. For testing, I've just checked that the build error is resolved. In principal, these are not functional changes as the constraints are not looked at. S/390 bootstrap fails with: gcc/gcc/config/s390/s390.md:8397: operand 5 missing output reload For the branch on index instruction we have a define_insn_and_split with a single alternative forcing a split before reload. In this pattern to my understanding the constraints are not really required. Yes, if that is the existing supported behaviour for MATCH_OPERANDs (it must be or your pattern would have been failing before), I don't want to make MATCH_SCRATCH more restrictive. The fix is a one-liner. Jeff, I presume we still want to permit this? The alternate fix would be to make this an error in both cases. Ok? Thanks, James --- 2014-09-14 James Greenhalgh james.greenha...@arm.com * genrecog.c (validate_pattern): Allow empty constraints in a match_scratch.diff --git a/gcc/genrecog.c b/gcc/genrecog.c index 0134585..c1dc8fa 100644 --- a/gcc/genrecog.c +++ b/gcc/genrecog.c @@ -461,6 +461,7 @@ validate_pattern (rtx pattern, rtx insn, rtx set, int set_code) /* If a MATCH_SCRATCH is used in a context requiring an write-only or read/write register, validate that. */ if (set_code == '=' + constraints0 constraints0 != '=' constraints0 != '+') {
Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge
On 09/19/14 10:02, Jiong Wang wrote: On 19/09/14 16:49, Jeff Law wrote: On 09/19/14 05:27, Jiong Wang wrote: On 19/09/14 06:51, Jeff Law wrote: On 09/16/14 00:55, Zhenqiang Chen wrote: -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Jiong Wang Sent: Monday, September 15, 2014 11:28 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org; Jeff Law Subject: Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge On 08/05/14 09:07, Zhenqiang Chen wrote: static bool move_insn_for_shrink_wrap (basic_block bb, rtx insn, const HARD_REG_SET uses, const HARD_REG_SET defs, - HARD_REG_SET *last_uses) + HARD_REG_SET *last_uses, + bool *split_p) { rtx set, src, dest; bitmap live_out, live_in, bb_uses, bb_defs; unsigned int i, dregno, end_dregno, sregno, end_sregno; basic_block next_block; + edge live_edge; /* Look for a simple register copy. */ set = single_set (insn); @@ -5582,17 +5589,31 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn, || overlaps_hard_reg_set_p (defs, GET_MODE (dest), dregno)) return false; - /* See whether there is a successor block to which we could move INSN. */ - next_block = next_block_for_reg (bb, dregno, end_dregno); - if (!next_block) + live_edge = next_block_for_reg (bb, dregno, end_dregno); if + (!live_edge) return false; + next_block = live_edge-dest; + /* If the destination register is referred in later insn, try to forward it. */ if (overlaps_hard_reg_set_p (*last_uses, GET_MODE (dest), dregno) !try_copy_prop (bb, insn, src, dest, last_uses)) return false; + /* Create a new basic block on the edge. */ if (EDGE_COUNT + (next_block-preds) == 2) +{ + next_block = split_edge (live_edge); + + bitmap_copy (df_get_live_in (next_block), df_get_live_out + (bb)); (re-sent, looks like the first send not received by the server...) for the function _IO_wdefault_xsputn in glibc wgenops.c (.dot file attached) 174: NOTE_INSN_BASIC_BLOCK 21 are the new created basic block because of the sink of move instruction ax:DI = 0 and split edge. but the live_in of this BB is copied from live_out of BB 2 which is too big, and actually prevent the later sink of 16: r12:DI=dx:DI. Should it be better to copy live_in from next_block-next_bb instead of live_out from bb, as it will model what's needed more accurately? According to the algorithm, next_block-next_bb (which is live_edge-dest) should have two predecessors. Live_in of next_block-next_bb would include live_out of the other edge, which is not necessary accurately. To be more accurate, you may need an intersection of df_get_live_out (bb) and df_get_live_in (next_block-next_bb). Presumably we can't recompute the liveness info here as it's too expensive to do that each time we split an edge to sink a statement? ie, we need the more accurate liveness within this pass so that we can sink further instructions? for a single case, x86 bootstrap, looks like these code are not compile time critical as for all 4762 live edges which could sink instructions 4139: EDGE_COUNT (next_block-preds) == 1 270: EDGE_COUNT (next_block-preds) == 2 353: EDGE_COUNT (next_block-preds) 2 and if we make the live info more accurate here, just very few additional functions ( 10) shrink-wrapped from glibc build gcc bootstrap test. Perhaps I wasn't clear. The whole point behind the proposed changes is to enable additional sinking that is currently blocked because of the overly large set of live objects. So my question was an attempt to understand why we didn't just a normal liveness update -- I speculated that we aren't doing a normal liveness update because of the compile-time cost. So, is it OK we simply change bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb)); into bitmap_and (df_get_live_in (next_block), df_get_live_out (bb), df_get_live_in (next_block-next_bb)); ? Probably. Though I'd be a bit concerned with next_block-next_bb. Wouldn't it be safer to stash away the relevant basic block prior to the call to split_edge, then use that saved copy. Something like this (untested): basic_block old_dest = live_edge-dest; next_block = split_edge (live_edge); /* We create a new basic block. Call df_grow_bb_info to make sure all data structures are allocated. */ df_grow_bb_info (df_live); bitmap_and (df_get_live_in (next_block), df_get_live_out (bb), df_get_live_in (old_dest)); The idea being we don't want to depend on the precise ordering blocks in the block chain. Could you try that and see if it does what you need? Jeff, Thanks, verified, it works. Great. Can you send an updated patchkit for
Re: [Patch] Teach genrecog/genoutput that scratch registers require write constraint modifiers
On 09/19/14 10:16, James Greenhalgh wrote: On Fri, Sep 19, 2014 at 04:10:34PM +0100, Andreas Krebbel wrote: On 09/19/2014 02:59 PM, James Greenhalgh wrote: Hi, After https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01615.html we error on the use of constraints in define_splits, define_expands and define_peephole2s. These are never looked at by the compiler, and so have no reason to be set. I expect there will be more fallout as Jan's auto-builder makes its way through ports I haven't tested, I'll fix those up as they come up. These are build failures, and the fixes are obvious, but I don't know my way around these ports, so I'd like an explicit maintainer ack. For testing, I've just checked that the build error is resolved. In principal, these are not functional changes as the constraints are not looked at. S/390 bootstrap fails with: gcc/gcc/config/s390/s390.md:8397: operand 5 missing output reload For the branch on index instruction we have a define_insn_and_split with a single alternative forcing a split before reload. In this pattern to my understanding the constraints are not really required. Yes, if that is the existing supported behaviour for MATCH_OPERANDs (it must be or your pattern would have been failing before), I don't want to make MATCH_SCRATCH more restrictive. The fix is a one-liner. Jeff, I presume we still want to permit this? The alternate fix would be to make this an error in both cases. Ok? Thanks, James --- 2014-09-14 James Greenhalgh james.greenha...@arm.com * genrecog.c (validate_pattern): Allow empty constraints in a match_scratch. OK. Yea, I think we want to permit -- I have vague recollections this is documented somewhere as accept anything. jeff
Re: [PATCH] PR LTO/63270 - new test
On 09/19/14 09:48, Martin Liška wrote: Hello. The following patch adds a test for PR/63270. The test has been tested before and with r215328, where the bug was fixed. Thank you, Martin pr63270.changelog gcc/testsuite/ChangeLog: 2014-09-19 Martin Liskamli...@suse.cz PR LTO/63270 * g++.dg/lto/pr63270_0.C: New test. * g++.dg/lto/pr63270_1.C: New test. OK. Please install if you haven't done so already. Thanks, Jeff
Re: [PATCH, i386, Pointer Bounds Checker 32/x] Pointer Bounds Checker hooks for i386 target
On Fri, Sep 19, 2014 at 2:53 PM, Ilya Enkovich enkovich@gmail.com wrote: This patch adds i386 target hooks for Pointer Bounds Checker. New version with fixes and better documentation for ix86_load_bounds and ix86_store_bounds is below. +/* Expand pass uses this hook to load bounds for function parameter + PTR passed in SLOT in case its bounds are not passed in a register. + + If SLOT is a memory, then bounds are loaded as for regular pointer + loaded from memory. PTR may be NULL in case SLOT is a memory. + In such case value of PTR (if required) may be loaded from SLOT. + + If SLOT is NULL or a register then SLOT_NO is an integer constant + holding number of the target dependent special slot which should be + used to obtain bounds. + + Return loaded bounds. */ OK, I hope I understand this target-handling of SLOT_NO. Can you please clarify when SLOT is a register? I propose to write this function in the following (hopefully equivalent) way: --cut here-- { if (!slot) { gcc_assert (CONST_INT_P (slot_no)); addr = plus_constant (Pmode, arg_pointer_rtx, -(INTVAL (slot_no) + 1) * GET_MODE_SIZE (Pmode)); gcc_assert (ptr); } else if (REG_P (slot)) { gcc_assert (CONST_INT_P (slot_no)); addr = plus_constant (Pmode, arg_pointer_rtx, -(INTVAL (slot_no) + 1) * GET_MODE_SIZE (Pmode)); ptr = slot; } else if (MEM_P (slot)) { addr = XEXP (slot, 0); if (!register_operand (addr, Pmode)) addr = copy_addr_to_reg (addr); if (!ptr) ptr = copy_addr_to_reg (slot); } else gcc_unreachable (); if (!index_register_operand (ptr, Pmode)) ptr = copy_addr_to_reg (ptr); ... } --cut here-- Please add a comment in each of if/else, explaining what the code is doing. This is non-trivial to understand. + +static rtx +ix86_load_bounds (rtx slot, rtx ptr, rtx slot_no) +{ + rtx addr = NULL; + rtx reg; + + if (!ptr) +{ + gcc_assert (MEM_P (slot)); + ptr = copy_addr_to_reg (slot); +} + + if (!slot || REG_P (slot)) +{ + if (slot) + ptr = slot; + + gcc_assert (CONST_INT_P (slot_no)); + + /* Here we have the case when more than four pointers are +passed in registers. In this case we are out of bound +registers and have to use bndldx to load bound. RA, +RA - 8, etc. are used for address translation in bndldx. */ + addr = plus_constant (Pmode, arg_pointer_rtx, + -(INTVAL (slot_no) + 1) * GET_MODE_SIZE (Pmode)); +} + else if (MEM_P (slot)) +{ + addr = XEXP (slot, 0); + if (!register_operand (addr, Pmode)) + addr = copy_addr_to_reg (addr); +} + else +gcc_unreachable (); + + if (!register_operand (ptr, Pmode)) +ptr = copy_addr_to_reg (ptr); + + reg = gen_reg_rtx (BNDmode); + emit_insn (BNDmode == BND64mode +? gen_bnd64_ldx (reg, addr, ptr) +: gen_bnd32_ldx (reg, addr, ptr)); + + return reg; +} + +/* Expand pass uses this hook to store BOUNDS for call argument PTR + passed in SLOT in case BOUNDS are not passed in a register. + + If SLOT is a memory, then BOUNDS are stored as for regular pointer + stored in memory. PTR may be NULL in case SLOT is a memory. + In such case value of PTR (if required) may be loaded from SLOT. + + If SLOT is NULL or a register then SLOT_NO is an integer constant + holding number of the target dependent special slot which should be + used to store BOUNDS. */ + +static void +ix86_store_bounds (rtx ptr, rtx slot, rtx bounds, rtx slot_no) This function can be written in exactly the same way as the above proposed code, up to the check for ptr register_operand. +{ + rtx addr; + + if (ptr) +{ + if (!register_operand (ptr, Pmode)) + ptr = copy_addr_to_reg (ptr); +} + else +{ + gcc_assert (MEM_P (slot)); + ptr = copy_addr_to_reg (slot); +} + + if (!slot || REG_P (slot)) +{ + gcc_assert (CONST_INT_P (slot_no)); + addr = plus_constant (Pmode, stack_pointer_rtx, + -(INTVAL (slot_no) + 1) * GET_MODE_SIZE (Pmode)); +} + else if (MEM_P (slot)) +addr = XEXP (slot, 0); + else +gcc_unreachable (); + + if (!register_operand (addr, Pmode)) +addr = copy_addr_to_reg (addr); The above check is needed since addr handling in the gen_bndX_stx expander (patch 2/n) is different that addr handling in gen_bndX_ldx. This handling should be unified in a follow-up patch. + /* Avoid registers which connot be used as index. */ + if (!index_register_operand (ptr, Pmode)) +{ + rtx temp = gen_reg_rtx (Pmode); + emit_move_insn (temp, ptr); + ptr = temp; +} + + gcc_assert (POINTER_BOUNDS_MODE_P (GET_MODE (bounds))); + if (!register_operand (bounds, BNDmode)) +bounds =
Re: [PATCH, trivial] Clean up code: update comments and fix spacing to conform to coding style.
On 09/19/14 07:43, Felix Yang wrote: +2014-09-19 Felix Yangfelix.y...@huawei.com + +* cfgrtl.c ira.c ira-color.c ira-conflicts ira-lives.c: +Clean up code: update comments and fix spacing to conform to coding style. THanks. Committed. jeff
Re: parallel check output changes?
On Sep 19, 2014, at 2:37 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: On Thu, Sep 18, 2014 at 01:44:55PM -0500, Segher Boessenkool wrote: I am testing a patch that is just diff --git a/contrib/dg-extract-results.py b/contrib/dg-extract-results.py index cccbfd3..3781423 100644 --- a/contrib/dg-extract-results.py +++ b/contrib/dg-extract-results.py @@ -117,7 +117,7 @@ class Prog: self.tool_re = re.compile (r'^\t\t=== (.*) tests ===$') self.result_re = re.compile (r'^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED' r'|WARNING|ERROR|UNSUPPORTED|UNTESTED' - r'|KFAIL):\s*(\S+)') + r'|KFAIL):\s*(.+)') self.completed_re = re.compile (r'.* completed at (.*)') # Pieces of text to write at the head of the output. # start_line is a pair in which the first element is a datetime Tested that with four runs on powerpc64-linux, four configs each time; test-summary shows the same in all cases. Many lines have moved compared to without the patch, but that cannot be helped. Okay for mainline? Ok. I also did some timings for make -j60 -k check, same -m64,-m32,-m32/-mpowerpc64, -m64/-mlra configs. A run takes 65m, is effectively 42x parallel, and has 15% system time. Thanks for the work and for the timings.
Re: [PATCH] Do not allow non-allocatable registers in scratch_operand
On 09/19/14 06:21, Segher Boessenkool wrote: On Thu, Sep 18, 2014 at 11:54:46PM -0600, Jeff Law wrote: Shouldn't you be testing if the register is fixed rather than its class? Or maybe both? register_operand (via general_operand) uses operand_reg_set for this; it is initialised via the regclass NO_REGS too (and other things). This would work for us (rs6000) too, or indeed fixed_regs[], or even wider classes. I have no idea if it would hurt other targets though, and I have no desire to test all weirdo targets ;-) Can't blame you for that :-) Thanks for the pointer to operand_reg_set. Having walked through some of that code, I think we're OK just looking at the class like your original patch did. Concerns withdrawn. Patch approved. Jeff
Re: Fix i386 FP_TRAPPING_EXCEPTIONS
On Thu, 18 Sep 2014, Joseph S. Myers wrote: On Thu, 18 Sep 2014, Uros Bizjak wrote: OK for mainline and release branches. I've omitted ia64 from the targets in the testcase in the release branch version, given the lack of any definition of FP_TRAPPING_EXCEPTIONS at all there. (I think a definition as (~_fcw 0x3f) should work for ia64, but haven't tested that.) Here is an *untested* patch with that definition. 2014-09-19 Joseph Myers jos...@codesourcery.com PR target/63312 * config/ia64/sfp-machine.h (FE_EX_ALL, FP_TRAPPING_EXCEPTIONS): New macros. Index: libgcc/config/ia64/sfp-machine.h === --- libgcc/config/ia64/sfp-machine.h(revision 215389) +++ libgcc/config/ia64/sfp-machine.h(working copy) @@ -56,6 +56,9 @@ #define FP_EX_OVERFLOW 0x08 #define FP_EX_UNDERFLOW0x10 #define FP_EX_INEXACT 0x20 +#define FP_EX_ALL \ + (FP_EX_INVALID | FP_EX_DENORM | FP_EX_DIVZERO | FP_EX_OVERFLOW \ +| FP_EX_UNDERFLOW | FP_EX_INEXACT) #define _FP_TININESS_AFTER_ROUNDING 1 @@ -67,6 +70,8 @@ __sfp_handle_exceptions (_fex); \ } while (0); +#define FP_TRAPPING_EXCEPTIONS (~_fcw FP_EX_ALL) + #define FP_RND_NEAREST 0 #define FP_RND_ZERO0xc00L #define FP_RND_PINF0x800L -- Joseph S. Myers jos...@codesourcery.com
Re: [debug-early] Disable removal of DW_tags* while generating subprogram DIEs
On 09/18/14 13:55, Jason Merrill wrote: On 09/18/2014 03:25 PM, Aldy Hernandez wrote: This patch disables removing of DW_AT_{declaration,object_pointer,formal_parameter} tags while generating a DIE for subprograms. Now that we generate dwarf info early, we will always have an old_die the second time around. I see no need to remove the aforementioned tags, only to create them again (incorrectly in a C++ testcase I have). I think you probably still want to remove DW_AT_declaration, as you probably don't want it the second time around (and wouldn't create it again). Sounds good to me. Applied the attached patch. Thanks. commit b033487ef30cf7b81edb8cd8346f874ec78e0b20 Author: Aldy Hernandez al...@redhat.com Date: Fri Sep 19 10:46:54 2014 -0600 * dwarf2out.c (gen_subprogram_die): Remove DW_AT_declaration even if we have an old die. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 68b4650..48b1106 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -18310,6 +18310,10 @@ gen_subprogram_die (tree decl, dw_die_ref context_die) remove_AT (subr_die, DW_AT_declaration); remove_AT (subr_die, DW_AT_object_pointer); remove_child_TAG (subr_die, DW_TAG_formal_parameter); +#else + /* We don't need the DW_AT_declaration the second or third +time around anyhow. */ + remove_AT (subr_die, DW_AT_declaration); #endif } else
Re: [AArch64] Tighten predicates on SIMD shift intrinsics
On 09/11/2014 01:29 AM, James Greenhalgh wrote: +;; Predicates used by the various SIMD shift operations. These +;; fall in to 3 categories. +;; Shifts with a range 0-(bit_size - 1) (aarch64_simd_shift_imm) +;; Shifts with a range 1-bit_size (aarch64_simd_shift_imm_offset) +;; Shifts with a range 0-bit_size (aarch64_simd_shift_imm_bitsize) +(define_predicate aarch64_simd_shift_imm_qi + (and (match_code const_int) + (match_test aarch64_simd_const_bounds (op, 0, 7 The function call should be removed and this should be written as (match_test IN_RANGE (ival, 0, 7)) r~
Re: [PATCH 2/2] Add some more test cases for fentry and pg
This is the patch I committed. Should fix everyone's issues. Fix mcount test cases to only run on supported targets * gcc.dg/pg-override.c: Only run on x86 Linux. * gcc.dg/pg.c: Dito. * gcc.target/i386/fentry-override.c: Exclude for PIC. * gcc.target/i386/fentry.c: Dito. diff --git a/gcc/testsuite/gcc.dg/pg-override.c b/gcc/testsuite/gcc.dg/pg-override.c index 60be162..9b8d8fa 100644 --- a/gcc/testsuite/gcc.dg/pg-override.c +++ b/gcc/testsuite/gcc.dg/pg-override.c @@ -1,6 +1,6 @@ /* Test -fprofile override */ /* { dg-do compile } */ -/* { dg-options -fprofile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options -fprofile { target i?86-*-linux* x86_64-*-linux* } } */ /* { dg-final { scan-assembler-not mcount } } */ /* Origin: Andi Kleen */ extern void foobar(const char *); diff --git a/gcc/testsuite/gcc.dg/pg.c b/gcc/testsuite/gcc.dg/pg.c index 60be162..9b8d8fa 100644 --- a/gcc/testsuite/gcc.dg/pg.c +++ b/gcc/testsuite/gcc.dg/pg.c @@ -1,6 +1,6 @@ /* Test -fprofile override */ /* { dg-do compile } */ -/* { dg-options -fprofile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options -fprofile { target i?86-*-linux* x86_64-*-linux* } } */ /* { dg-final { scan-assembler-not mcount } } */ /* Origin: Andi Kleen */ extern void foobar(const char *); diff --git a/gcc/testsuite/gcc.target/i386/fentry-override.c b/gcc/testsuite/gcc.target/i386/fentry-override.c index 3771f19..0464454 100644 --- a/gcc/testsuite/gcc.target/i386/fentry-override.c +++ b/gcc/testsuite/gcc.target/i386/fentry-override.c @@ -1,5 +1,5 @@ /* Test -mfentry override */ -/* { dg-do compile } */ +/* { dg-do compile { target { *-*-linux* } { nonpic || ! { ia32 } } } } */ /* { dg-options -mfentry } */ /* { dg-final { scan-assembler-not __fentry__ } } */ /* Origin: Andi Kleen */ diff --git a/gcc/testsuite/gcc.target/i386/fentry.c b/gcc/testsuite/gcc.target/i386/fentry.c index bd3db13..d0d70c6 100644 --- a/gcc/testsuite/gcc.target/i386/fentry.c +++ b/gcc/testsuite/gcc.target/i386/fentry.c @@ -1,5 +1,5 @@ /* Test -mfentry */ -/* { dg-do compile } */ +/* { dg-do compile { target { *-*-linux* } { nonpic || ! { ia32 } } } } */ /* { dg-options -fprofile -mfentry } */ /* { dg-final { scan-assembler __fentry__ } } */ /* Origin: Andi Kleen */ -- a...@linux.intel.com -- Speaking for myself only.
Re: [PATCH 0/5] Fix handling of word subregs of wide registers
On 09/19/14 01:23, Richard Sandiford wrote: Jeff Law l...@redhat.com writes: On 09/18/14 04:07, Richard Sandiford wrote: This series is a cleaned-up version of: https://gcc.gnu.org/ml/gcc/2014-03/msg00163.html The underlying problem is that the semantics of subregs depend on the word size. You can't have a subreg for byte 2 of a 4-byte word, say, but you can have a subreg for word 2 of a 4-word value (as well as lowpart subregs of that word, etc.). This causes problems when an architecture has wider-than-word registers, since the addressability of a word can then depend on which register class is used. The register allocators need to fix up cases where a subreg turns out to be invalid for a particular class. This is really an extension of what we need to do for CANNOT_CHANGE_MODE_CLASS. Tested on x86_64-linux-gnu, powerpc64-linux-gnu and aarch64_be-elf. I thought we fixed these problems long ago with the change to subreg_byte?!? No, that was fixing something else. (I'm just about old enough to remember that too!) The problem here is that (say): (subreg:SI (reg:DI X) 4) is independently addressable on little-endian AArch32 if X assigned to a GPR, but not if X is assigned to a vector register. We need to allow these kinds of subreg on pseudos in order to decompose multiword arithmetic. It's then up to the RA to realise that a reload would be needed if X were assigned to a vector register, since the upper half of a vector register cannot be independently accessed. Note that you could write this example even with the old word-style offsets and IIRC the effect would have been the same. OK. So I kept thinking in terms of the byte offset stuff. But what you're tackling is related to the mess around the mode of the subreg having a different meaning if its smaller than a word vs word-sized or greater. Right? Jeff
[PATCH] AArch64: Default to -fsched-pressure
This patch makes -fsched-pressure the default on AArch64, like on ARM. This improves performance and reduces codesize due to fewer unnecessary spills. OK for commit? ChangeLog: 2014-09-19 Wilco Dijkstra wdijk...@arm.com * gcc/common/config/aarch64/aarch64-common.c: (default_options aarch_option_optimization_table): Default to -fsched-pressure. --- gcc/common/config/aarch64/aarch64-common.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/common/config/aarch64/aarch64-common.c b/gcc/common/config/aarch64/aarch64-common.c index e44b40a..63f2212 100644 --- a/gcc/common/config/aarch64/aarch64-common.c +++ b/gcc/common/config/aarch64/aarch64-common.c @@ -44,6 +44,8 @@ static const struct default_options aarch_option_optimization_table[] = { /* Enable section anchors by default at -O1 or higher. */ { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 }, +/* Enable -fsched-pressure by default when optimizing. */ +{ OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 }, /* Enable redundant extension instructions removal at -O2 and higher. */ { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 }, { OPT_LEVELS_NONE, 0, NULL, 0 } -- 1.9.1
Re: [C++ Patch/RFC] PR 62219
On 09/19/2014 11:40 AM, Paolo Carlini wrote: if (TREE_CODE (CP_DECL_CONTEXT (decl)) == FUNCTION_DECL - || (TREE_CODE (decl) == FUNCTION_DECL DECL_LOCAL_FUNCTION_P (decl))) + || (TREE_CODE (decl) == FUNCTION_DECL + (DECL_LOCAL_FUNCTION_P (decl) || LAMBDA_FUNCTION_P (decl /* You can't have a function template declaration in a local scope, nor you can you define a member of a class template in a local scope. */ Rather than adding it to this condition, let's group the new test with the other lambda test, immediately below. Jason
Re: [C/C++ PATCH] Handle enum bit-fields for -Wswitch (PR c/61405, PR c/53874)
On Fri, Sep 19, 2014 at 11:19:14AM -0400, Jason Merrill wrote: On 09/19/2014 09:29 AM, Marek Polacek wrote: - orig_type = TREE_TYPE (cond); + /* Handle enum bit-fields. */ + tree field; + if (TREE_CODE (cond) == COMPONENT_REF + (field = TREE_OPERAND (cond, 1)) + DECL_BIT_FIELD_TYPE (field) + TREE_CODE (DECL_BIT_FIELD_TYPE (field)) == ENUMERAL_TYPE) +orig_type = DECL_BIT_FIELD_TYPE (field); + else +orig_type = TREE_TYPE (cond); You want unlowered_expr_type here. Nice. So like this? 2014-09-19 Marek Polacek pola...@redhat.com PR c/61405 PR c/53874 gcc/ * asan.c (maybe_instrument_call): Add default case. * ipa-pure-const.c (special_builtin_state): Likewise. * predict.c (expr_expected_value_1): Likewise. * lto-streamer-out.c (write_symbol): Initialize variable. gcc/c-family/ * c-common.h (struct c_common_resword): Don't define CPP_KEYWORD, CPP_TEMPLATE_ID, and CPP_NESTED_NAME_SPECIFIER. gcc/c/ * c-parser.c (c_parser_switch_statement): Pass original type to c_finish_case. * c-tree.h (c_finish_case): Update declaration. * c-typeck.c (c_finish_case): Add TYPE parameter. Pass it conditionally to c_do_switch_warnings. gcc/cp/ * semantics.c (finish_switch_cond): Call unlowered_expr_type. * tree.c (bot_manip): Add default case. gcc/testsuite/ * c-c++-common/pr53874.c: New test. * c-c++-common/pr61405.c: New test. libcpp/ * include/cpplib.h (enum cpp_ttype): Define CPP_KEYWORD, CPP_TEMPLATE_ID, and CPP_NESTED_NAME_SPECIFIER. diff --git gcc/gcc/asan.c gcc/gcc/asan.c index 2a90d86..bca95df 100644 --- gcc/gcc/asan.c +++ gcc/gcc/asan.c @@ -2024,6 +2024,8 @@ maybe_instrument_call (gimple_stmt_iterator *iter) case BUILT_IN_TRAP: /* Don't instrument these. */ return false; + default: + break; } } tree decl = builtin_decl_implicit (BUILT_IN_ASAN_HANDLE_NO_RETURN); diff --git gcc/gcc/c-family/c-common.h gcc/gcc/c-family/c-common.h index 993a97b..c9f38f3 100644 --- gcc/gcc/c-family/c-common.h +++ gcc/gcc/c-family/c-common.h @@ -327,22 +327,6 @@ struct c_common_resword /* Extra cpp_ttype values for C++. */ -/* A token type for keywords, as opposed to ordinary identifiers. */ -#define CPP_KEYWORD ((enum cpp_ttype) (N_TTYPES + 1)) - -/* A token type for template-ids. If a template-id is processed while - parsing tentatively, it is replaced with a CPP_TEMPLATE_ID token; - the value of the CPP_TEMPLATE_ID is whatever was returned by - cp_parser_template_id. */ -#define CPP_TEMPLATE_ID ((enum cpp_ttype) (CPP_KEYWORD + 1)) - -/* A token type for nested-name-specifiers. If a - nested-name-specifier is processed while parsing tentatively, it is - replaced with a CPP_NESTED_NAME_SPECIFIER token; the value of the - CPP_NESTED_NAME_SPECIFIER is whatever was returned by - cp_parser_nested_name_specifier_opt. */ -#define CPP_NESTED_NAME_SPECIFIER ((enum cpp_ttype) (CPP_TEMPLATE_ID + 1)) - /* A token type for pre-parsed C++0x decltype. */ #define CPP_DECLTYPE ((enum cpp_ttype) (CPP_NESTED_NAME_SPECIFIER + 1)) diff --git gcc/gcc/c/c-parser.c gcc/gcc/c/c-parser.c index 3f4a92b..551fd7f 100644 --- gcc/gcc/c/c-parser.c +++ gcc/gcc/c/c-parser.c @@ -5232,7 +5232,7 @@ c_parser_switch_statement (c_parser *parser) save_break = c_break_label; c_break_label = NULL_TREE; body = c_parser_c99_block_statement (parser); - c_finish_case (body); + c_finish_case (body, ce.original_type); if (c_break_label) { location_t here = c_parser_peek_token (parser)-location; diff --git gcc/gcc/c/c-tree.h gcc/gcc/c/c-tree.h index 6004d50..fc145a85 100644 --- gcc/gcc/c/c-tree.h +++ gcc/gcc/c/c-tree.h @@ -618,7 +618,7 @@ extern void process_init_element (location_t, struct c_expr, bool, extern tree build_compound_literal (location_t, tree, tree, bool); extern void check_compound_literal_type (location_t, struct c_type_name *); extern tree c_start_case (location_t, location_t, tree, bool); -extern void c_finish_case (tree); +extern void c_finish_case (tree, tree); extern tree build_asm_expr (location_t, tree, tree, tree, tree, tree, bool); extern tree build_asm_stmt (tree, tree); extern int c_types_compatible_p (tree, tree); diff --git gcc/gcc/c/c-typeck.c gcc/gcc/c/c-typeck.c index da71ab2..f69c28b 100644 --- gcc/gcc/c/c-typeck.c +++ gcc/gcc/c/c-typeck.c @@ -9486,10 +9486,11 @@ do_case (location_t loc, tree low_value, tree high_value) return label; } -/* Finish the switch statement. */ +/* Finish the switch statement. TYPE is the original type of the + controlling expression of the switch, or NULL_TREE. */ void -c_finish_case (tree body) +c_finish_case (tree body, tree type) { struct c_switch *cs = c_switch_stack; location_t switch_location; @@ -9499,7 +9500,7 @@ c_finish_case
Re: [PATCH 4/5] Generalise invalid_mode_change_p
On 09/18/14 04:25, Richard Sandiford wrote: This is the main patch for the bug. We should treat a register as invalid for a mode change if simplify_subreg_regno cannot provide a new register number for the result. We should treat a class as invalid for a mode change if all registers in the class are invalid. This is an extension of the old CANNOT_CHANGE_MODE_CLASS-based check (simplify_subreg_regno checks C_C_C_M). I forgot to say that the patch is a prerequisite to removing aarch64's C_C_C_M. There are other prerequisites too, but removing C_C_C_M without this patch caused regressions in the existing testsuite, which is why no new tests are needed. gcc/ * hard-reg-set.h: Include hash-table.h. (target_hard_regs): Add a finalize method and a x_simplifiable_subregs field. * target-globals.c (target_globals::~target_globals): Handle hard_regs-finalize. * rtl.h (subreg_shape): New structure. (shape_of_subreg): New function. (simplifiable_subregs): Declare. * reginfo.c (simplifiable_subreg): New structure. (simplifiable_subregs_hasher): Likewise. (simplifiable_subregs): New function. (invalid_mode_changes): Delete. (alid_mode_changes, valid_mode_changes_obstack): New variables. (record_subregs_of_mode): Remove subregs_of_mode parameter. Record valid mode changes in valid_mode_changes. (find_subregs_of_mode): Remove subregs_of_mode parameter. Update calls to record_subregs_of_mode. (init_subregs_of_mode): Remove invalid_mode_changes and bitmap handling. Initialize new variables. Update call to find_subregs_of_mode. (invalid_mode_change_p): Check new variables instead of invalid_mode_changes. (finish_subregs_of_mode): Finalize new variables instead of invalid_mode_changes. (target_hard_regs::finalize): New function. * ira-costs.c (print_allocno_costs): Call invalid_mode_change_p even when CLASS_CANNOT_CHANGE_MODE is undefined. Index: gcc/rtl.h === --- gcc/rtl.h 2014-09-15 11:55:40.459855161 +0100 +++ gcc/rtl.h 2014-09-15 12:26:21.249077760 +0100 +/* Return the shape of a SUBREG rtx. */ + +static inline subreg_shape +shape_of_subreg (const_rtx x) +{ + return subreg_shape (GET_MODE (SUBREG_REG (x)), + SUBREG_BYTE (x), GET_MODE (x)); +} + Is there some reason you don't have a constructor that accepts a const_rtx? OK for the trunk. jeff
Re: [PATCH 5/5] Remove CANNOT_CHANGE_MODE_CLASS workaround in i386.c
On 09/18/14 04:26, Richard Sandiford wrote: Patch 4 should make it possible to relax i386'a CANNOT_CHANGE_MODE_CLASS, solving the missed optimisation that triggered the original thread. gcc/ * config/i386/i386.c (ix86_cannot_change_mode_class): Remove GET_MODE_SIZE (to) GET_MODE_SIZE (from) test. Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c 2014-09-15 09:48:11.310438531 +0100 +++ gcc/config/i386/i386.c 2014-09-15 09:48:11.310438531 +0100 @@ -37526,13 +37526,6 @@ ix86_cannot_change_mode_class (enum mach the vec_dupv4hi pattern. */ if (GET_MODE_SIZE (from) 4) return true; - - /* Vector registers do not support subreg with nonzero offsets, which -are otherwise valid for integer registers. Since we can't see -whether we have a nonzero offset from here, prohibit all - nonparadoxical subregs changing size. */ - if (GET_MODE_SIZE (to) GET_MODE_SIZE (from)) - return true; } return false; OK. jeff
Re: [PATCH] -fsanitize=nonnull-attribute and -fsanitize=returns-nonnull-attribute support
On 09/09/14 11:52, Jakub Jelinek wrote: Hi! On Fri, Jun 27, 2014 at 09:13:07AM +0200, Jakub Jelinek wrote: The patch adds two new (trivial handlers) to libubsan, as it is maintained in llvm's compiler-rt, will talk to them if they are interested in those and what exact wording and form (AFAIK clang also added the gcc {,returns_}nonnull attributes). If they wouldn't be interested, guess we could add them in a separate, gcc owned, source file in ubsan (like we own Makefile*). Now that the compiler-rt bits landed up upstream, here is an updated version of the patch. First here is mostly ubsan infrastructure change so that ubsan_create_data can handle more cases, together with an improvement not to emit UBSAN_BOUNDS when it already during gimplification provably can't overflow. What the ubsan_create_data changes allow is more than one locus at the beginning and arbitrary data, not just mismatch pair, after all the typedescriptors. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-09-09 Jakub Jelinek ja...@redhat.com * ubsan.h (struct ubsan_mismatch_data): Removed. (ubsan_create_data): Remove MISMATCH argument, add LOCCNT argument. * ubsan.c (ubsan_source_location): For unknown locations, pass { NULL, 0, 0 } instead of { unknown, x, y }. (ubsan_create_data): Remove MISMATCH argument, add LOCCNT argument. Allow more than one location and arbitrary extra arguments passed in ... instead of through MISMATCH pointer. (ubsan_instrument_unreachable, ubsan_expand_bounds_ifn, ubsan_expand_null_ifn, ubsan_build_overflow_builtin, instrument_bool_enum_load, ubsan_instrument_float_cast): Adjust callers. c-family/ * c-ubsan.c (ubsan_instrument_division, ubsan_instrument_shift, ubsan_instrument_vla, ubsan_instrument_return): Adjust ubsan_create_data callers. (ubsan_instrument_bounds): Don't emit UBSAN_BOUNDS at all if index is constant or BIT_AND_EXPR with constant mask and is small enough for the bound. * c-gimplify.c (ubsan_walk_array_refs_r): For ADDR_EXPR of ARRAY_REF, make sure the inner ARRAY_REF is not walked again. OK. I really wonder if you and Marek should have a free hand in the ubsan bits. jeff
Re: [PATCH] -fsanitize=nonnull-attribute and -fsanitize=returns-nonnull-attribute support
On 09/09/14 11:53, Jakub Jelinek wrote: On Fri, Jun 27, 2014 at 09:13:07AM +0200, Jakub Jelinek wrote: The patch adds two new (trivial handlers) to libubsan, as it is maintained in llvm's compiler-rt, will talk to them if they are interested in those and what exact wording and form (AFAIK clang also added the gcc {,returns_}nonnull attributes). If they wouldn't be interested, guess we could add them in a separate, gcc owned, source file in ubsan (like we own Makefile*). And here is the actual new version of the patch, including cherry-picked libsanitizer changes. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-09-09 Jakub Jelinek ja...@redhat.com gcc/ * flag-types.h (enum sanitize_code): Add SANITIZE_NONNULL_ATTRIBUTE and SANITIZE_RETURNS_NONNULL_ATTRIBUTE, or them into SANITIZE_UNDEFINED. * opts.c (common_handle_option): Handle SANITIZE_NONNULL_ATTRIBUTE and SANITIZE_RETURNS_NONNULL_ATTRIBUTE and disable flag_delete_null_pointer_checks for them. * sanitizer.def (BUILT_IN_UBSAN_HANDLE_NONNULL_ARG, BUILT_IN_UBSAN_HANDLE_NONNULL_ARG_ABORT, BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN, BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_ABORT): New. * ubsan.c (instrument_bool_enum_load): Set *gsi back to stmt's iterator. (instrument_nonnull_arg, instrument_nonnull_return): New functions. (pass_ubsan::gate): Return true even for SANITIZE_NONNULL_ATTRIBUTE or SANITIZE_RETURNS_NONNULL_ATTRIBUTE. (pass_ubsan::execute): Call instrument_nonnull_{arg,return}. * doc/invoke.texi (-fsanitize=nonnull-attribute, -fsanitize=returns-nonnull-attribute): Document. gcc/testsuite/ * c-c++-common/ubsan/attrib-3.c: New test. * c-c++-common/ubsan/nonnull-1.c: New test. * c-c++-common/ubsan/nonnull-2.c: New test. * c-c++-common/ubsan/nonnull-3.c: New test. * c-c++-common/ubsan/nonnull-4.c: New test. * c-c++-common/ubsan/nonnull-5.c: New test. libsanitizer/ * ubsan/ubsan_handlers.cc, ubsan/ubsan_handlers.h: Cherry pick upstream r215485, r217389, r217391 and r217400. OK. jeff
Re: [C++ Patch/RFC] PR 62219
Hi, On 09/19/2014 07:32 PM, Jason Merrill wrote: On 09/19/2014 11:40 AM, Paolo Carlini wrote: if (TREE_CODE (CP_DECL_CONTEXT (decl)) == FUNCTION_DECL - || (TREE_CODE (decl) == FUNCTION_DECL DECL_LOCAL_FUNCTION_P (decl))) + || (TREE_CODE (decl) == FUNCTION_DECL + (DECL_LOCAL_FUNCTION_P (decl) || LAMBDA_FUNCTION_P (decl /* You can't have a function template declaration in a local scope, nor you can you define a member of a class template in a local scope. */ Rather than adding it to this condition, let's group the new test with the other lambda test, immediately below. Yeah, I was unsure about that. By the way, we fixed c++/60605 even in 4.8, what about 4.8 and 4.9? Thanks, Paolo. /cp 2014-09-19 Paolo Carlini paolo.carl...@oracle.com PR c++/62219 * pt.c (check_default_tmpl_args): Check LAMBDA_FUNCTION_P. /testsuite 2014-09-19 Paolo Carlini paolo.carl...@oracle.com PR c++/62219 * g++.dg/cpp0x/lambda/lambda-template14.C: New. Index: cp/pt.c === --- cp/pt.c (revision 215393) +++ cp/pt.c (working copy) @@ -4456,9 +4456,11 @@ check_default_tmpl_args (tree decl, tree parms, bo local scope. */ return true; - if (TREE_CODE (decl) == TYPE_DECL - TREE_TYPE (decl) - LAMBDA_TYPE_P (TREE_TYPE (decl))) + if ((TREE_CODE (decl) == TYPE_DECL +TREE_TYPE (decl) +LAMBDA_TYPE_P (TREE_TYPE (decl))) + || (TREE_CODE (decl) == FUNCTION_DECL + LAMBDA_FUNCTION_P (decl))) /* A lambda doesn't have an explicit declaration; don't complain about the parms of the enclosing class. */ return true; Index: testsuite/g++.dg/cpp0x/lambda/lambda-template14.C === --- testsuite/g++.dg/cpp0x/lambda/lambda-template14.C (revision 0) +++ testsuite/g++.dg/cpp0x/lambda/lambda-template14.C (working copy) @@ -0,0 +1,11 @@ +// PR c++/62219 +// { dg-do compile { target c++11 } } + +template class = void +struct S +{ + friend void foo( S ) + { +[](){}; + } +};
Go patch committed: pass constant arguments directly to thunk functions
When the Go frontend builds a thunk, for a go or defer statement, it sets up a struct to hold the arguments to the function, passes a pointer to that struct to the thunk, and the thunk then calls the function passing the fields of the struct. Something along these lines is necessary when calling a function with variable arguments. However, if the arguments are constant, then it's not necessary to store the constants in the struct; they can be encoded directly in the thunk. This patch from Chris Manghane implements this in the Go frontend. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 3e0952791c91 go/statements.cc --- a/go/statements.cc Fri Sep 05 08:06:32 2014 -0700 +++ b/go/statements.cc Fri Sep 19 11:35:43 2014 -0700 @@ -2178,7 +2178,11 @@ for (Expression_list::const_iterator p = ce-args()-begin(); p != ce-args()-end(); ++p) - vals-push_back(*p); + { + if ((*p)-is_constant()) + continue; + vals-push_back(*p); + } } // Build the struct. @@ -2281,6 +2285,9 @@ p != args-end(); ++p, ++i) { + if ((*p)-is_constant()) + continue; + char buf[50]; this-thunk_field_param(i, buf, sizeof buf); fields-push_back(Struct_field(Typed_identifier(buf, (*p)-type(), @@ -2418,21 +2425,36 @@ ++p; bool is_recover_call = ce-is_recover_call(); Expression* recover_arg = NULL; - for (; p != fields-end(); ++p, ++next_index) + + const Expression_list* args = ce-args(); + if (args != NULL) { - Expression* thunk_param = Expression::make_var_reference(named_parameter, - location); - thunk_param = Expression::make_unary(OPERATOR_MULT, thunk_param, - location); - Expression* param = Expression::make_field_reference(thunk_param, - next_index, - location); - if (!is_recover_call) - call_params-push_back(param); - else + for (Expression_list::const_iterator arg = args-begin(); + arg != args-end(); + ++arg) { - go_assert(call_params-empty()); - recover_arg = param; + Expression* param; + if ((*arg)-is_constant()) + param = *arg; + else + { + Expression* thunk_param = + Expression::make_var_reference(named_parameter, location); + thunk_param = + Expression::make_unary(OPERATOR_MULT, thunk_param, location); + param = Expression::make_field_reference(thunk_param, + next_index, + location); + ++next_index; + } + + if (!is_recover_call) + call_params-push_back(param); + else + { + go_assert(call_params-empty()); + recover_arg = param; + } } }
Fix ICE with ODR mering and variable sized types
Hi, this patch fixes ICE by avoiding mangling of types with variadic size (those are not really supported). Bootstrapped/regtested x86_64-linux, tested with libreoffice, comitted. PR lto/63286 * tree.c (need_assembler_name_p): Do not mangle variadic types. Index: tree.c === --- tree.c (revision 215328) +++ tree.c (working copy) @@ -5003,6 +5003,7 @@ need_assembler_name_p (tree decl) decl == TYPE_NAME (TREE_TYPE (decl)) !is_lang_specific (TREE_TYPE (decl)) AGGREGATE_TYPE_P (TREE_TYPE (decl)) + !variably_modified_type_p (TREE_TYPE (decl), NULL_TREE) !type_in_anonymous_namespace_p (TREE_TYPE (decl))) return !DECL_ASSEMBLER_NAME_SET_P (decl); /* Only FUNCTION_DECLs and VAR_DECLs are considered. */
Re: [patch] libstdc++/29988 Rb_Tree reuse allocated nodes
Still no feedback regarding this proposal ? On 19/08/2014 22:14, François Dumont wrote: Any news regarding this proposal ? Thanks François On 30/07/2014 23:39, François Dumont wrote: Hi Now that patch on testsuite allocator is in I would like to reactivate this one. Here it is again. See previous answer below regarding modification of _M_begin/_M_cbegin. 2014-07-30 François Dumont fdum...@gcc.gnu.org PR libstdc++/29988 * include/bits/stl_tree.h (_Rb_tree_reuse_or_alloc_node): New. (_Rb_tree_alloc_node): New. (_Rb_tree_impl): Remove unused _Is_pod_comparator template parameter. (_Rb_tree::operator=(_Rb_tree)): New. (_Rb_tree::_M_assign_unique): New. (_Rb_tree::_M_assign_equal): New. (_Rb_tree): Adapt to reuse allocated nodes as much as possible. * include/bits/stl_map.h (std::map::operator=(std::map)): Default implementation. (std::map::operator=(initializer_list)): Adapt to use _Rb_tree::_M_assign_unique. * include/bits/stl_multimap.h (std::multimap::operator=(std::multimap)): Default implementation. (std::multimap::operator=(initializer_list)): Adapt to use _Rb_tree::_M_assign_equal. * include/bits/stl_set.h (std::set::operator=(std::set)): Default implementation. (std::set::operator=(initializer_list)): Adapt to use _Rb_tree::_M_assign_unique. * include/bits/stl_multiset.h (std::multiset::operator=(std::multiset)): Default implementation. (std::multiset::operator=(initializer_list)): Adapt to use _Rb_tree::_M_assign_equal. * testsuite/23_containers/map/allocator/copy_assign.cc (test03): New. * testsuite/23_containers/map/allocator/init-list.cc: New. * testsuite/23_containers/map/allocator/move_assign.cc (test03): New. * testsuite/23_containers/multimap/allocator/copy_assign.cc (test03): New. * testsuite/23_containers/multimap/allocator/init-list.cc: New. * testsuite/23_containers/multimap/allocator/move_assign.cc (test03): New. * testsuite/23_containers/multiset/allocator/copy_assign.cc (test03): New. * testsuite/23_containers/multiset/allocator/init-list.cc: New. * testsuite/23_containers/multiset/allocator/move_assign.cc (test03): New. * testsuite/23_containers/set/allocator/copy_assign.cc (test03): New. * testsuite/23_containers/set/allocator/init-list.cc: New. * testsuite/23_containers/set/allocator/move_assign.cc (test03): New. Tested under linux x86_64. Ok to commit ? François On 16/06/2014 22:23, François Dumont wrote: Hi Here is another proposal taking into account your remarks except the one below. In fact I had no problem with lambda, I just needed to store them in a variable, lambda do not need to be made mutable. On 11/06/2014 14:02, Jonathan Wakely wrote: @@ -514,11 +651,11 @@ { return this-_M_impl._M_header._M_right; } _Link_type - _M_begin() _GLIBCXX_NOEXCEPT + _M_begin() const _GLIBCXX_NOEXCEPT { return static_cast_Link_type(this-_M_impl._M_header._M_parent); } What's the purpose of this change? Although it can be 'const' it is consistent with the usual begin()/end() functions that the functions returning a mutable iterator are non-const and the functions returning a constant iterator are const. _Const_Link_type - _M_begin() const _GLIBCXX_NOEXCEPT + _M_cbegin() const _GLIBCXX_NOEXCEPT { return static_cast_Const_Link_type (this-_M_impl._M_header._M_parent); @@ -529,7 +666,7 @@ { return reinterpret_cast_Link_type(this-_M_impl._M_header); } _Const_Link_type - _M_end() const _GLIBCXX_NOEXCEPT + _M_cend() const _GLIBCXX_NOEXCEPT { return reinterpret_cast_Const_Link_type(this-_M_impl._M_header); } static const_reference I'm not very comfortable with this renaming. Having consistent _M_begin() functions allows using them in template code that doesn't care if it's using the const or non-const version. I try to revert this part and so remember why I did it in the first place. I needed to change _M_copy signature to: _Link_type _M_copy(_Link_type __x, _Link_type __p) because I now use this method to also move the elements of the data structure, I cannot move from a _Const_Like_type so I change first parameter to _Link_type. I see that there are some code duplications to deal with _Const_Link_type and _Link_type in 2 different part of the code but I didn't want to duplicate again here and simply made _M_copy more flexible by taking a _Link_type rather than a _Const_Link_type. I don't really see interest of the existing code duplications so I prefer to not do the same and write the code only once. François
[PATCH] Do not remove labels with LABEL_PRESERVE_P
Hi, During my work on enabling pseudo PIC register I've found that cfg cleaunp may remove lables with LABEL_PRESERVE_P set to 1. In my case I generated SET_RIP during expand pass and cfg cleanup removed label it used as an operand. Below is a patch that fixes it. It is not actually required for our latest PIC related patch but still seems to make sense. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- 2014-09-19 Ilya Enkovich ilya.enkov...@intel.com * cfgcleanup.c (try_optimize_cfg): Do not remove label with LABEL_PRESERVE_P flag set. diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c index a008168..9325ea0 100644 --- a/gcc/cfgcleanup.c +++ b/gcc/cfgcleanup.c @@ -2701,6 +2701,7 @@ try_optimize_cfg (int mode) (single_pred_edge (b)-flags EDGE_FALLTHRU) !(single_pred_edge (b)-flags EDGE_COMPLEX) LABEL_P (BB_HEAD (b)) + !LABEL_PRESERVE_P (BB_HEAD (b)) /* If the previous block ends with a branch to this block, we can't delete the label. Normally this is a condjump that is yet to be simplified, but
Re: [C/C++ PATCH] Handle enum bit-fields for -Wswitch (PR c/61405, PR c/53874)
On 09/19/2014 09:29 AM, Marek Polacek wrote: But we also started to warn on CPP_KEYWORD and two others: case value not in enumerated type. Fixed by moving CPP_KEYWORD, CPP_TEMPLATE_ID, and CPP_NESTED_NAME_SPECIFIER into enum cpp_ttype. Not sure if this is going to hurt something else. Wait, -Wswitch warns if any of the case labels don't correspond to enumerators? Apparently so, it complains about enum E { a=1,b=2,c=4 } e; int main() { switch (e) { case a|b: break; default: break; } } That seems bogus. The docs say, 'case' labels outside the enumeration range also provoke warnings when this option is used (even if there is a 'default' label). but 3 is within the range 1-4. If this is working as intended, I'm still uneasy about moving the C++ internal token types into libcpp; instead, I think I'd prefer to work around the warning by suppressing -Wswitch inside affected functions or moving the cases to ifs. Jason
Re: [PATCH, Pointer Bounds Checker 23/x] Function split
On 09/16/14 03:09, Ilya Enkovich wrote: I must be misunderstanding something then. I fundamentally don't see how the return bounds are any different here than the return value. If we have exposed the bounds in the IL, then aren't they going to be handled just like any other object in the IL? They are not handled like any other object in IL because return block and all statements in it are not handled as all other statements we put into split part. Here is a comment from find_return_bb: /* Return basic block containing RETURN statement. We allow basic blocks of the form: retval = tmp_var; return retval but return_bb can not be more complex than this. ... */ Phi nodes also may present in return_bb. Right. I've seen this stuff, but it's still not clear to me what the real issue is. The first thing that jumps out when I look at your dump is we don't have a PHI for __bound_tmp.322 in BB6. Now it may be that we really just wanted __bound_tmp.322_36, but that seems wrong as the return value varies depending on how we reach BB6 and it seems to me the bounds ought to vary in a similar manner. All blocks going to split part are analyzed by visit_bb function. Return basic block is not analyzed in the same way but still may be copied into split part in case return value is defined in it. There is a special code in visit_bb to add args of phi statements of return_bb as uses of split part to have no undefined values in copied block. It was enough when those phi args plus return value were only uses in return_bb. But now we add returned bounds to GIMPLE_RETURN as a new use and this new use is ignored. If split part returns value then return_bb will be copied into it. It means I should check returned bounds are defined there too. If SSA_NAME_DEF_STMT of returned bounds is in split part then it is OK. If SSA_NAME_DEF_STMT of returned bounds is in return_bb then it is also OK because it means it is a result of PHI node whose args were added as additional uses for split part earlier in visit_bb. At least that is how I think this happens :) Maybe you should post the IL for a case where this all matters and walk me through the key issues. I attach a dump I got from Chrome compilation with no additional checks restrictions in split. Original function returns value defined by phi node in return_bb and bounds defined in BB2. Split part contains BB3, BB4 and BB5 and resulting function part has usage of returned bounds but no producer for it. Right, but my question is whether or not the bounds from BB2 were really the correct bounds to be using in the first place! I would have expected a PHI in BB6 to select the bounds based on the path leading to BB6, much like we select a different return value. Jeff
Re: [PATCH] Do not remove labels with LABEL_PRESERVE_P
On 09/19/14 13:36, Ilya Enkovich wrote: Hi, During my work on enabling pseudo PIC register I've found that cfg cleaunp may remove lables with LABEL_PRESERVE_P set to 1. In my case I generated SET_RIP during expand pass and cfg cleanup removed label it used as an operand. Below is a patch that fixes it. It is not actually required for our latest PIC related patch but still seems to make sense. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- 2014-09-19 Ilya Enkovich ilya.enkov...@intel.com * cfgcleanup.c (try_optimize_cfg): Do not remove label with LABEL_PRESERVE_P flag set. OK. Please install. Note for those not following the x86 32 bit PIC register discussion, I asked Ilya to submit this separately. It was something an earlier version of his patch triggered and it stood out as something that ought to be fixed regardless of the final form of the PIC register changes that are in progress. Thanks, Jeff
Re: [PATCH] RTEMS: Update contrib/config-list.mk
On Thu, 2014-09-18 10:27:46 -0500, Joel Sherrill joel.sherr...@oarcorp.com wrote: On 9/18/2014 6:51 AM, Jan-Benedict Glaw wrote: And to tell the whole story, Sebastian approached me with extending the target lists in use by those targets he sent a patch for; I just asked him to go this route, because I guess that'd be beneficial for other folks as well. OK. Thanks for clarifying that. I suspected there was a link. And it is committed. And I will post a follow up patch to add v850-elf and v850-rtems. Highly welcome :) MfG, JBG -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of: Debugging is twice as hard as writing the code in the first place. the second : Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. - Brian W. Kernighan signature.asc Description: Digital signature
Re: [PATCH] RTEMS: Update contrib/config-list.mk
On Thu, 2014-09-18 16:55:35 -0500, Joel Sherrill joel.sherr...@oarcorp.com wrote: I only see one RTEMS target that has been built in the top page. Are more than the powerpc-rtems being built? How can I check? The builders work down their list, and when done will start a new round (with a list that would now contain the three new targets.) Though they're still working down their old list; I've just added the targets there as well, they should show up soonish on one of the *-configlist_mk* builders. MfG, JBG -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of: Ich hatte in letzter Zeit ein bißchen viel Realitycheck. the second : Langsam möchte ich mal wieder weiterträumen können. -- Maximilian Wilhelm (18. Mai 2005, #lug-owl.de) signature.asc Description: Digital signature
Re: [PATCH] Improve prepare_shrink_wrap to sink more instructions
On 09/15/14 08:33, Jiong Wang wrote: Jeff, thanks, I partially understand your meaning here. take the function ira_implicitly_set_insn_hard_regs in ira-lives.c for example, when generating address rtl, gcc will automatically generate const operator to prefix the address expression, like the following. so a simple CONSTANT_P check is enough in case there is no embedded register. (insn 309 310 308 3 (set (reg:DI 44 r15 [orig:94 ivtmp.674 ] [94]) (const:DI (plus:DI (symbol_ref:DI (recog_data) [flags 0x40] var_decl 0x2b2c2ff91510 recog_data) (const_int 480 [0x1e0] -1 but for architecture like aarch64, the following instruction sequences to forming address may be generated (insn 73 14 74 4 (set (reg/f:DI 20 x20 [99]) (high:DI (symbol_ref:DI (global_a) [flags 0xc0] var_decl 0x7ff755a60900 stats))) 35 {*movdi_aarch64} (expr_list:REG_EQUIV (high:DI (symbol_ref:DI (global_a) [flags 0xc0] var_decl 0x7ff755a60900 stats)) (nil))) (insn 17 30 25 5 (set (reg/f:DI 4 x4 [83]) (lo_sum:DI (reg/f:DI 20 x20 [99]) (symbol_ref:DI (global_a) [flags 0xc0] var_decl 0x7ff755a60900 stats))) {add_losym_di} (expr_list:REG_EQUIV (symbol_ref:DI (global_a) [flags 0xc0] var_decl 0x7ff755a60900 stats) (nil))) while CONSTANT_P could not catch the latter lo_sum case, as the RTX_CLASS of lo_sum is RTX_OBJ not RTX_CONST_OBJ, Hmm, it's been ~15 years since I regularly worked on a target that uses HIGH/LO_SUM, I thought we wrapped the LO_SUM expression inside a CONST as well, but reading the docs for CONST, that clearly isn't the case. Sorry for that. Can you (re) send your current patch for this for review? Jeff
Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations
Hi Richard, I also ran the gcc testsuite with RUNTESTFLAGS=--tool_opts=-mcopyrelocs to check for issues. The only test that failed was g++.dg/tsan/default_options.C. It uses -fpie -pie and BFD ld to link. Since BFD ld does not support copy relocations with -pie, it does not link. I linked with gold to make the test pass. Could you please take another look at this patch? Thanks Sri On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam tmsri...@google.com wrote: On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson r...@redhat.com wrote: On 06/20/2014 05:17 PM, Sriraman Tallam wrote: Index: config/i386/i386.c === --- config/i386/i386.c(revision 211826) +++ config/i386/i386.c(working copy) @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp) return true; } else if (!SYMBOL_REF_FAR_ADDR_P (op0) - SYMBOL_REF_LOCAL_P (op0) + (SYMBOL_REF_LOCAL_P (op0) +|| (TARGET_64BIT ix86_copyrelocs flag_pie + !SYMBOL_REF_FUNCTION_P (op0))) ix86_cmodel != CM_LARGE_PIC) return true; break; This is the wrong place to patch. You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified TARGET_BINDS_LOCAL_P. I have done this in the new attached patch, I added a new function i386_binds_local_p which will check for this and call default_binds_local_p otherwise. Note in particular that I believe that you are doing the wrong thing with weak and COMMON symbols, in that you probably ought not force a copy reloc there. I added an extra check to not do this for WEAK symbols. I also added a check for DECL_EXTERNAL so I believe this will also not be called for COMMON symbols. Note the complexity of default_binds_local_p_1, and the fact that all you really want to modify is /* If PIC, then assume that any global name can be overridden by symbols resolved from other modules. */ else if (shlib) local_p = false; near the bottom of that function. I did not understand what you mean here? Were you suggesting an alternative way of doing this? Thanks for reviewing Sri r~
[Patch bfin] Fixup use of constraints in define_split
Hi, As with the earlier patch for sh ( https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01627.html ), this fixes the fallout caused by https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01615.html. These are build failures, and the fixes are obvious, but I don't know my way around the failing ports, so I'd like an explicit maintainer ack. For testing, I've just checked that the build error is resolved. Ok? Thanks, James --- 2014-09-19 James Greenhalgh james.greenha...@arm.com * config/bfin/bfin.md: Fix use of constraints in define_split. diff --git a/gcc/config/bfin/bfin.md b/gcc/config/bfin/bfin.md index f5e64d3ef6914b408fa68b044ad122e676e2d7ff..9d57b9d3d392179effb68c1a511afaf8e0b43462 100644 --- a/gcc/config/bfin/bfin.md +++ b/gcc/config/bfin/bfin.md @@ -1970,15 +1970,15 @@ (define_insn loop_end (define_split [(set (pc) - (if_then_else (ne (match_operand:SI 0 nondp_reg_or_memory_operand ) + (if_then_else (ne (match_operand:SI 0 nondp_reg_or_memory_operand) (const_int 1)) - (label_ref (match_operand 1 )) + (label_ref (match_operand 1 )) (pc))) (set (match_dup 0) (plus (match_dup 0) (const_int -1))) (unspec [(const_int 0)] UNSPEC_LSETUP_END) - (clobber (match_scratch:SI 2 =r))] + (clobber (match_scratch:SI 2))] memory_operand (operands[0], SImode) || splitting_loops [(set (match_dup 2) (match_dup 0)) (set (match_dup 2) (plus:SI (match_dup 2) (const_int -1)))
Re: [PATCH][PING] Keep patch file permissions in mklog
On Fri, Sep 19, 2014 at 6:41 AM, Tom de Vries tom_devr...@mentor.com wrote: So it's a question of predictability (always do the same or do nothing) vs. robustness (do as much as you can given the circumstances). I'm not sure which one is better in this case. I think it's fine the way it is now. Thanks for the patch. Looks fine to me. Diego.
Fix g++.dg/cpp0x/static_assert9.C failure
Hi, this patch makes symbol==NULL comparsion to be resolved during parsing again. I made this foldable only after symbol table is built because during parsing the visibility of symbols may change affecting the outcome of folding. Sadly it seems that C++ FE needs this to be folded early. This patch adds flag to cgraph.h to refuse visibility changes once the folding happen and thus re-introduces the fact that some attributes needs to be set before symbol is used. Sadly I did not find other way around than declaring the testcase invalid. Bootstrapped/regtested x86_64-linux, comitted. Honza PR c++/61825 * c-family/c-common.c (handle_alias_ifunc_attribute): Check that visibility change is possible (handle_weakref_attribute): Likewise. * cgraph.h (symtab_node): Add method get_create and field refuse_visibility_changes. (symtab_node::get_create): New method. * fold-const.c (tree_single_nonzero_warnv_p): Use get_create. * varasm.c (mark_weak): Verify that visibility change is possible. * gcc.dg/tree-ssa/nonzero-1.c: Require error to be output. Index: c-family/c-common.c === --- c-family/c-common.c (revision 215401) +++ c-family/c-common.c (working copy) @@ -7757,6 +7757,19 @@ handle_alias_ifunc_attribute (bool is_al *no_add_attrs = true; } + if (decl_in_symtab_p (*node)) +{ + struct symtab_node *n = symtab_node::get (decl); + if (n n-refuse_visibility_changes) + { + if (is_alias) + error (%+D declared alias after being used, decl); + else + error (%+D declared ifunc after being used, decl); + } +} + + return NULL_TREE; } @@ -7833,6 +7846,13 @@ handle_weakref_attribute (tree *node, tr DECL_WEAK (*node) = 1; } + if (decl_in_symtab_p (*node)) +{ + struct symtab_node *n = symtab_node::get (*node); + if (n n-refuse_visibility_changes) + error (%+D declared weakref after being used, *node); +} + return NULL_TREE; } Index: cgraph.h === --- cgraph.h(revision 215401) +++ cgraph.h(working copy) @@ -346,6 +346,10 @@ public: return decl-decl_with_vis.symtab_node; } + /* Try to find a symtab node for declaration DECL and if it does not + exist or if it corresponds to an inline clone, create a new one. */ + static inline symtab_node * get_create (tree node); + /* Return the cgraph node that has ASMNAME for its DECL_ASSEMBLER_NAME. Return NULL if there's no such node. */ static symtab_node *get_for_asmname (const_tree asmname); @@ -394,7 +398,9 @@ public: unsigned analyzed : 1; /* Set for write-only variables. */ unsigned writeonly : 1; - + /* Visibility of symbol was used for further optimization; do not + permit further changes. */ + unsigned refuse_visibility_changes : 1; /*** Visibility and linkage flags. ***/ @@ -2519,4 +2525,12 @@ cgraph_node::mark_force_output (void) gcc_checking_assert (!global.inlined_to); } +inline symtab_node * symtab_node::get_create (tree node) +{ + if (TREE_CODE (node) == VAR_DECL) +return varpool_node::get_create (node); + else +return cgraph_node::get_create (node); +} + #endif /* GCC_CGRAPH_H */ Index: fold-const.c === --- fold-const.c(revision 215401) +++ fold-const.c(working copy) @@ -15850,7 +15850,7 @@ tree_single_nonzero_warnv_p (tree t, boo { struct symtab_node *symbol; - symbol = symtab_node::get (base); + symbol = symtab_node::get_create (base); if (symbol) return symbol-nonzero_address (); else Index: testsuite/gcc.dg/tree-ssa/nonzero-1.c === --- testsuite/gcc.dg/tree-ssa/nonzero-1.c (revision 215401) +++ testsuite/gcc.dg/tree-ssa/nonzero-1.c (working copy) @@ -1,11 +1,8 @@ /* { dg-do compile } */ -/* { dg-options -O2 -fdump-tree-optimized } */ -extern int a; +/* { dg-options -O2 } */ +extern int a; /* { dg-error declared weak after being used } */ t() { return a!=0; } extern int a __attribute__ ((weak)); - -/* { dg-final { scan-tree-dump-not return 1 optimized} } */ -/* { dg-final { cleanup-tree-dump optimized } } */ Index: varasm.c === --- varasm.c(revision 215401) +++ varasm.c(working copy) @@ -5230,6 +5230,12 @@ output_constructor (tree exp, unsigned H static void mark_weak (tree decl) { + if (DECL_WEAK (decl)) +return; + + struct symtab_node *n = symtab_node::get (decl); + if (n n-refuse_visibility_changes) +error (%+D declared weak after being used, decl); DECL_WEAK (decl) = 1; if (DECL_RTL_SET_P (decl) Index:
C++ PATCH for c++/61392 (ICE mangling member without object)
This code was first allowed by DR 850: as of C++11, in unevaluated context it's OK to refer to a non-static data member without an associated object. The ABI says that this is mangled like an unresolved name, i.e. as the plain identifier. Tested x86_64-pc-linux-gnu, applying to trunk. commit e6a6cf41c76af773e573c00d8e7314c1223b0cff Author: Jason Merrill ja...@redhat.com Date: Fri Sep 19 06:15:02 2014 -0400 PR c++/61392 * mangle.c (write_expression): Use unresolved-name mangling for DR850 case. diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c index 5c201d6..1642da5 100644 --- a/gcc/cp/mangle.c +++ b/gcc/cp/mangle.c @@ -2866,11 +2866,16 @@ write_expression (tree expr) { write_string (operator_name_info[(int)code].mangled_name); ob = TREE_OPERAND (ob, 0); + write_expression (ob); } - else - write_string (dt); + else if (!is_dummy_object (ob)) + { + write_string (dt); + write_expression (ob); + } + /* else, for a non-static data member with no associated object (in + unevaluated context), use the unresolved-name mangling. */ - write_expression (ob); write_member_name (TREE_OPERAND (expr, 1)); return; } diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index bcf161b..1d81028 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -1685,7 +1685,7 @@ finish_non_static_data_member (tree decl, tree object, tree qualifying_scope) if (object == error_mark_node) return error_mark_node; - /* DR 613: Can use non-static data members without an associated + /* DR 613/850: Can use non-static data members without an associated object in sizeof/decltype/alignof. */ if (is_dummy_object (object) cp_unevaluated_operand == 0 (!processing_template_decl || !current_class_ref)) diff --git a/gcc/testsuite/g++.dg/abi/mangle63.C b/gcc/testsuite/g++.dg/abi/mangle63.C new file mode 100644 index 000..d6a58a3 --- /dev/null +++ b/gcc/testsuite/g++.dg/abi/mangle63.C @@ -0,0 +1,11 @@ +// DR 850 makes this valid +// { dg-do compile { target c++11 } } + +templateclass T struct A +{ + int mem; + templateclass U decltype(U()+mem) f(); +}; +int i = Aint().fint(); + +// { dg-final { scan-assembler _ZN1AIiE1fIiEEDTplcvT__E3memEv } }
C++ PATCH for c++/61465 (unused-but-set and init-list)
mark_exp_read doesn't know how to deal with a CONSTRUCTOR. In this case it doesn't need to, since we're about to pull out the single element anyway, so let's do that before calling mark_exp_read. Tested x86_64-pc-linux-gnu, applying to trunk. commit cd87c48098daa3f1f31a7877d6fb0bf799b10d7b Author: Jason Merrill ja...@redhat.com Date: Thu Sep 18 14:05:55 2014 -0400 PR c++/61465 * call.c (convert_like_real) [ck_identity]: Call mark_rvalue_use after pulling out an element from a CONSTRUCTOR. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 161235b..8f1b91a 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -6206,7 +6206,6 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, return expr; } case ck_identity: - expr = mark_rvalue_use (expr); if (BRACE_ENCLOSED_INITIALIZER_P (expr)) { int nelts = CONSTRUCTOR_NELTS (expr); @@ -6217,6 +6216,7 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, else gcc_unreachable (); } + expr = mark_rvalue_use (expr); if (type_unknown_p (expr)) expr = instantiate_type (totype, expr, complain); diff --git a/gcc/testsuite/g++.dg/warn/Wunused-parm-6.C b/gcc/testsuite/g++.dg/warn/Wunused-parm-6.C new file mode 100644 index 000..95fb7e2 --- /dev/null +++ b/gcc/testsuite/g++.dg/warn/Wunused-parm-6.C @@ -0,0 +1,8 @@ +// PR c++/61465 +// { dg-do compile { target c++11 } } +// { dg-options -Wunused-but-set-parameter } + +struct Foo { + Foo(void* x) : y{static_castchar*(x)} {} + char* y; +};
Fix ipareference-1 testcase
Hi, this patch fixes ordering issue that prevents ipa-reference folding with LTO Bootstrapped/regtested x86_64-linux, comitted. Honza PR tree-optimization/63255 * ipa.c (symbol_table::remove_unreachable_nodes): Fix ordering issue in setting body_removed flag. Index: ipa.c === --- ipa.c (revision 215409) +++ ipa.c (working copy) @@ -538,6 +538,11 @@ fprintf (file, %s, vnode-name ()); changed = true; } + /* Keep body if it may be useful for constant folding. */ + if ((init = ctor_for_folding (vnode-decl)) == error_mark_node) + vnode-remove_initializer (); + else + DECL_INITIAL (vnode-decl) = init; vnode-body_removed = true; vnode-definition = false; vnode-analyzed = false; @@ -545,11 +550,6 @@ vnode-remove_from_same_comdat_group (); - /* Keep body if it may be useful for constant folding. */ - if ((init = ctor_for_folding (vnode-decl)) == error_mark_node) - vnode-remove_initializer (); - else - DECL_INITIAL (vnode-decl) = init; vnode-remove_all_references (); } else
Avoid privatization of TLS variables
Hi, libreoffice fails to build with TLS because of Cannot load any more object with static TLS. Iant pointed out to me the difference that the initial exec TLS model is also used by static TLS variables. This patch prevents turning TLS variables into static that lets me to finish libreoffice build. Can the conditional be strenghtened somewhat? decl_default_tls_model has: if (!flag_shlib) { if (is_local) kind = TLS_MODEL_LOCAL_EXEC; else kind = TLS_MODEL_INITIAL_EXEC; } /* Local dynamic is inefficient when we're not combining the parts of the address. */ else if (optimize is_local) kind = TLS_MODEL_LOCAL_DYNAMIC; else kind = TLS_MODEL_GLOBAL_DYNAMIC; perhaps we should have fake TLS_MODEL_EXEC and TLS_MODE_DYNAMIC modes that get more specified later once the visibility is finalized instead of deciding it at compile time? Bootstrapped/regtested x86_64-linux, comitted for now until we get better solution (if it exists). Honza Index: ChangeLog === --- ChangeLog (revision 215416) +++ ChangeLog (working copy) @@ -1,5 +1,10 @@ 2014-09-19 Jan Hubicka hubi...@ucw.cz + * ipa-visibility.c (varpool_node::externally_visible_p): Do not + privatize dynamic TLS variables. + +2014-09-19 Jan Hubicka hubi...@ucw.cz + * diagnostic.c (warning_n): New function. * diagnostic-core.h (warning_n): Declare. * ipa-devirt.c (ipa_devirt): Handle singulars correctly; Index: ipa-visibility.c === --- ipa-visibility.c(revision 215415) +++ ipa-visibility.c(working copy) @@ -277,6 +277,13 @@ varpool_node::externally_visible_p (void if (used_from_object_file_p ()) return true; + /* Bringing TLS variables local may cause dynamic linker failures + on limits of static TLS vars. */ + if (DECL_THREAD_LOCAL_P (decl) + (DECL_TLS_MODEL (decl) != TLS_MODEL_EMULATED + DECL_TLS_MODEL (decl) != TLS_MODEL_INITIAL_EXEC)) +return true; + if (DECL_HARD_REGISTER (decl)) return true; if (DECL_PRESERVE_P (decl))
Speedup int_bit_from_pos
Hi, int_bit_position is used by ipa-devirt's type walking code. It is currently a bottleneck since I introduced speculation into contextes (I plan to solve this by changing the way i cache results). But this patch seems to make sense anyway: we do not need to go through folding: tree bit_from_pos (tree offset, tree bitpos) { if (TREE_CODE (offset) == PLUS_EXPR) offset = size_binop (PLUS_EXPR, fold_convert (bitsizetype, TREE_OPERAND (offset, 0)), fold_convert (bitsizetype, TREE_OPERAND (offset, 1))); else offset = fold_convert (bitsizetype, offset); return size_binop (PLUS_EXPR, bitpos, size_binop (MULT_EXPR, offset, bitsize_unit_node)); } Because all the code cares only about constant offsets, we do not need to go through fold_convert, because all the codes go via int_bit_position that already expects result to be host wide int, it seems to make sense to implement quick path for that. Bootstrap/regtest x86_64 in progress, OK? Honza * stor-layout.c (int_bit_from_pos): New function. * stor-layout.h (int_bit_from_pos): Declare. * tree.c (int_bit_from_pos): Use it. Index: stor-layout.c === --- stor-layout.c (revision 215409) +++ stor-layout.c (working copy) @@ -858,6 +858,20 @@ size_binop (MULT_EXPR, offset, bitsize_unit_node)); } +/* Like int_bit_from_pos, but return the result as HOST_WIDE_INT. + OFFSET and BITPOS must be constant. */ + +HOST_WIDE_INT +int_bit_from_pos (tree offset, tree bitpos) +{ + HOST_WIDE_INT off; + if (TREE_CODE (offset) == PLUS_EXPR) +off = tree_to_shwi (TREE_OPERAND (offset, 0)) + tree_to_shwi (TREE_OPERAND (offset, 1)); + else +off = tree_to_shwi (offset); + return off * BITS_PER_UNIT + tree_to_shwi (bitpos); +} + /* Return the combined truncated byte position for the byte offset OFFSET and the bit position BITPOS. */ Index: stor-layout.h === --- stor-layout.h (revision 215409) +++ stor-layout.h (working copy) @@ -27,6 +27,7 @@ unsigned int); extern record_layout_info start_record_layout (tree); extern tree bit_from_pos (tree, tree); +extern HOST_WIDE_INT int_bit_from_pos (tree, tree); extern tree byte_from_pos (tree, tree); extern void pos_from_bit (tree *, tree *, unsigned int, tree); extern void normalize_offset (tree *, tree *, unsigned int); Index: tree.c === --- tree.c (revision 215409) +++ tree.c (working copy) @@ -2839,7 +2839,8 @@ HOST_WIDE_INT int_bit_position (const_tree field) { - return tree_to_shwi (bit_position (field)); + return int_bit_from_pos (DECL_FIELD_OFFSET (field), + DECL_FIELD_BIT_OFFSET (field)); } /* Return the byte position of FIELD, in bytes from the start of the record.