Re: [PATCH] PR 78534 Change character length from int to size_t
On Thu, Jan 4, 2018 at 4:21 AM, Jerry DeLislewrote: > On 01/03/2018 03:37 AM, Janne Blomqvist wrote: >> On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLisle >> wrote: >>> On 12/30/2017 12:35 PM, Janne Blomqvist wrote: On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig wrote: >>> ---snip--- I can provide that stuff as a separate patch, or merge it into the original megapatch and resubmit that, whichever way you prefer. >>> >>> I would prefer we split into two patches. This will make review of the >>> library >>> I/O changes easier. The int len is used in a lot of places also where it >>> really >>> happens to also be the kind (which is a length in our implementation). >> >> Hi, >> >> attached is a patch that makes the two attached testcases work. It >> applies on top of the charlen->size_t patch. In the formatted I/O >> stuff, I have mostly used ptrdiff_t to avoid having to deal with >> signed/unsigned issues, as the previous code was using int. >> >> >> > > Well I have started reviewing. One concern I have is that in many places you > are changing formatted field widths, like w in read_l, to ptrdiff_t. I don't > think this is necessary. If someone is printing a value that has a field width > that big, it will never finish. > > I did not check the definitions of format data structures that use these > values > all over. So I think in the least, the patch goes too far. I don't expect to support format widths larger than INT_MAX. The reason why ptrdiff_t is used is that read_block_form_* and many similar functions take a pointer to a ptrdiff_t (to handle the one case where a scalar variable can be wider than INT_MAX, namely a character), and we can't pass a pointer to an incompatible type. Now, you could argue that the internal API should be refactored so that we pass integers by value rather than as pointers, and then only the character case would need to care to use a larger type and the rest could keep using plain int. And I would agree that would be good, but I think such a refactor would be stage 1 material, not stage 3. > Another issue I have is that the readability of the code just sucks to me. The > type ptrdiff_t is so not intuitive in a very many places. If this is really > necessary lets hide it behind a macro or something so I don't have to cringe > every time I look at this code. (such as gfc_int) > > Sometimes we can get too pedantic about things like this. A lot of these > variables have nothing to do with pointers, though they may be used as > modifiers > to offsets in large memory spaces. Well, ptrdiff_t is one of the two "pointer-sized integer" typedefs introduced in C89, the other being size_t. So I would argue that if one is doing stuff like pointer arithmetic, or calculating array indices etc., and the variable might for some reason be negative, ptrdiff_t is the natural type to use. I do agree that ptrdiff_t is a bit of a mouthful, but I don't have any really good alternatives either; I'm not convinced that a libgfortran-specific typedef would make things clearer. Or we can blame Microsoft which made win64 LLP64 and not LP64 like the rest of the world, and thus spoiled the use of long for portable code. :) FWIW, we have "index_type" which is a typedef for ptrdiff_t, but that's used mostly in code that deals with array descriptors, so I thought it would be cleaner to not use it here. -- Janne Blomqvist
Re: [PATCH] PR 78534 Change character length from int to size_t
On 01/03/2018 03:37 AM, Janne Blomqvist wrote: > On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLislewrote: >> On 12/30/2017 12:35 PM, Janne Blomqvist wrote: >>> On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig >>> wrote: >> ---snip--- >>> >>> I can provide that stuff as a separate patch, or merge it into the >>> original megapatch and resubmit that, whichever way you prefer. >> >> I would prefer we split into two patches. This will make review of the >> library >> I/O changes easier. The int len is used in a lot of places also where it >> really >> happens to also be the kind (which is a length in our implementation). > > Hi, > > attached is a patch that makes the two attached testcases work. It > applies on top of the charlen->size_t patch. In the formatted I/O > stuff, I have mostly used ptrdiff_t to avoid having to deal with > signed/unsigned issues, as the previous code was using int. > > > Well I have started reviewing. One concern I have is that in many places you are changing formatted field widths, like w in read_l, to ptrdiff_t. I don't think this is necessary. If someone is printing a value that has a field width that big, it will never finish. I did not check the definitions of format data structures that use these values all over. So I think in the least, the patch goes too far. Another issue I have is that the readability of the code just sucks to me. The type ptrdiff_t is so not intuitive in a very many places. If this is really necessary lets hide it behind a macro or something so I don't have to cringe every time I look at this code. (such as gfc_int) Sometimes we can get too pedantic about things like this. A lot of these variables have nothing to do with pointers, though they may be used as modifiers to offsets in large memory spaces. I need a day or two to go through all of this. Sorry if I am just a downer about it. Regards, Jerry
Re: [PATCH] avoid using %lli et al.
On 01/03/2018 04:47 PM, Jakub Jelinek wrote: On Tue, Jan 02, 2018 at 04:40:50PM -0700, Jeff Law wrote: Attached is the patch with the casts removed (still bootstrapping). Martin gcc-printf-lli.diff gcc/ChangeLog: * gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): Use offset_int::from instead of wide_int::to_shwi. (maybe_diag_overlap): Remove assertion. Use HOST_WIDE_INT_PRINT_DEC instead of %lli. * gimple-ssa-sprintf.c (format_directive): Same. (parse_directive): Same. (sprintf_dom_walker::compute_format_length): Same. (try_substitute_return_value): Same. gcc/testsuite/ChangeLog: * gcc.dg/Wrestrict-3.c: New test. OK. This broke bootstrap on i686-linux, dir.len is size_t, which isn't always appropriate for HOST_WIDE_INT_PRINT_DEC format. Fixed thusly, committed to trunk as obvious after i686-linux bootstrap went past the previous failure point. Thanks. This is an example where having a solution for bug 78014 would be helpful. I.e., a -Wformat checker to help enforce the use of %zu instead of %u or %lu (or an explicit cast from size_t) even on targets size_t is unsigned or unsigned long, respectively. Martin 2018-01-04 Jakub Jelinek* gimple-ssa-sprintf.c (parse_directive): Cast second dir.len to uhwi. --- gcc/gimple-ssa-sprintf.c.jj 2018-01-03 22:24:36.0 +0100 +++ gcc/gimple-ssa-sprintf.c2018-01-04 00:15:55.950179583 +0100 @@ -3040,7 +3040,7 @@ parse_directive (sprintf_dom_walker::cal "length = " HOST_WIDE_INT_PRINT_UNSIGNED "\n", dir.dirno, (unsigned HOST_WIDE_INT)(size_t)(dir.beg - info.fmtstr), - (int)dir.len, dir.beg, dir.len); + (int)dir.len, dir.beg, (unsigned HOST_WIDE_INT) dir.len); } return len - !*str; Jakub
Re: [PATCH] handle invalid calls to built-ins with no prototype (PR 83603)
On 01/02/2018 04:29 PM, Jeff Law wrote: On 01/01/2018 05:58 PM, Martin Sebor wrote: The -Wrestrict code assumes that built-ins are called with the correct number of arguments. When this isn't so it crashes. The attached patch avoids the ICE due to this error. There are outstanding assumptions that the type of actual arguments to built-ins declared without a prototype matches the expected type. Those will be corrected in a subsequent patch. Martin gcc-83603.diff PR tree-optimization/83603 - ICE in builtin_memref at gcc/gimple-ssa-warn-restrict.c:238 gcc/ChangeLog: PR tree-optimization/83603 * calls.c (maybe_warn_nonstring_arg): Avoid accessing function arguments past the endof the argument list in functions declared without a prototype. * gimple-ssa-warn-restrict.c (wrestrict_dom_walker::check_call): Avoid checking when arguments are null. OK. Presumably we're not getting syntax errors precisely because we're calling these functions without an in-scope prototype? Right. I wouldn't at all be surprised to find other passes mis-handling that case. The idioms you're using to get the arguments are similar to code I've seen elsewhere -- it's probably just harder to trigger the failures because analysis of the mem* or str* call is conditional on surrounding context. Other passes may be relying on gimple_call_builtin_p to determine whether a call is to a built-in. The function returns false for calls with incompatible arguments (not just when passing a pointer where an integer is expected, for example, but also when passing an int where a size_t is expected, such as in a call to memcpy), and so it exempts many otherwise correct calls from checking (and also from the benefits of optimization, which also implies less thorough checking). Martin
Re: [PATCH] avoid using %lli et al.
On Tue, Jan 02, 2018 at 04:40:50PM -0700, Jeff Law wrote: > > Attached is the patch with the casts removed (still bootstrapping). > > > > Martin > > > > gcc-printf-lli.diff > > > > > > gcc/ChangeLog: > > > > * gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): Use > > offset_int::from instead of wide_int::to_shwi. > > (maybe_diag_overlap): Remove assertion. > > Use HOST_WIDE_INT_PRINT_DEC instead of %lli. > > * gimple-ssa-sprintf.c (format_directive): Same. > > (parse_directive): Same. > > (sprintf_dom_walker::compute_format_length): Same. > > (try_substitute_return_value): Same. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/Wrestrict-3.c: New test. > OK. This broke bootstrap on i686-linux, dir.len is size_t, which isn't always appropriate for HOST_WIDE_INT_PRINT_DEC format. Fixed thusly, committed to trunk as obvious after i686-linux bootstrap went past the previous failure point. 2018-01-04 Jakub Jelinek* gimple-ssa-sprintf.c (parse_directive): Cast second dir.len to uhwi. --- gcc/gimple-ssa-sprintf.c.jj 2018-01-03 22:24:36.0 +0100 +++ gcc/gimple-ssa-sprintf.c2018-01-04 00:15:55.950179583 +0100 @@ -3040,7 +3040,7 @@ parse_directive (sprintf_dom_walker::cal "length = " HOST_WIDE_INT_PRINT_UNSIGNED "\n", dir.dirno, (unsigned HOST_WIDE_INT)(size_t)(dir.beg - info.fmtstr), - (int)dir.len, dir.beg, dir.len); + (int)dir.len, dir.beg, (unsigned HOST_WIDE_INT) dir.len); } return len - !*str; Jakub
[committed] Allow the target to set MAX_BITSIZE_MODE_ANY_MODE
The default value of MAX_BITSIZE_MODE_ANY_MODE is calculated from the initial mode sizes specified in the modes.def file. The target needs to be able to override it if ADJUST_BYTESIZE & co. can choose a bigger size. Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. Also tested by comparing the before and after assembly output for at least one target per CPU directory. Applied (as a gen* patch). Richard 2018-01-03 Richard Sandifordgcc/ * doc/rtl.texi (MAX_BITSIZE_MODE_ANY_MODE): Describe how the default is calculated and how it can be overridden. * genmodes.c (max_bitsize_mode_any_mode): New variable. (create_modes): Initialize it from MAX_BITSIZE_MODE_ANY_MODE, if defined. (emit_max_int): Use it to set the output MAX_BITSIZE_MODE_ANY_MODE, if nonzero. Index: gcc/doc/rtl.texi === --- gcc/doc/rtl.texi2018-01-03 09:44:10.691921247 + +++ gcc/doc/rtl.texi2018-01-03 09:44:10.854914708 + @@ -1509,7 +1509,12 @@ compute integer values. @findex MAX_BITSIZE_MODE_ANY_MODE @item MAX_BITSIZE_MODE_ANY_MODE -The bitsize of the largest mode on the target. +The bitsize of the largest mode on the target. The default value is +the largest mode size given in the mode definition file, which is +always correct for targets whose modes have a fixed size. Targets +that might increase the size of a mode beyond this default should define +@code{MAX_BITSIZE_MODE_ANY_MODE} to the actual upper limit in +@file{@var{machine}-modes.def}. @end table @findex byte_mode Index: gcc/genmodes.c === --- gcc/genmodes.c 2018-01-03 09:44:10.691921247 + +++ gcc/genmodes.c 2018-01-03 09:44:10.854914708 + @@ -792,6 +792,7 @@ #define ADJUST_FBIT(M, X) _ADD_ADJUST ( static int bits_per_unit; static int max_bitsize_mode_any_int; +static int max_bitsize_mode_any_mode; static void create_modes (void) @@ -811,6 +812,12 @@ create_modes (void) #else max_bitsize_mode_any_int = 0; #endif + +#ifdef MAX_BITSIZE_MODE_ANY_MODE + max_bitsize_mode_any_mode = MAX_BITSIZE_MODE_ANY_MODE; +#else + max_bitsize_mode_any_mode = 0; +#endif } #ifndef NUM_POLY_INT_COEFFS @@ -989,12 +996,18 @@ emit_max_int (void) else printf ("#define MAX_BITSIZE_MODE_ANY_INT %d\n", max_bitsize_mode_any_int); - mmax = 0; - for (j = 0; j < MAX_MODE_CLASS; j++) -for (i = modes[j]; i; i = i->next) - if (mmax < i->bytesize) - mmax = i->bytesize; - printf ("#define MAX_BITSIZE_MODE_ANY_MODE (%d*BITS_PER_UNIT)\n", mmax); + if (max_bitsize_mode_any_mode == 0) +{ + mmax = 0; + for (j = 0; j < MAX_MODE_CLASS; j++) + for (i = modes[j]; i; i = i->next) + if (mmax < i->bytesize) + mmax = i->bytesize; + printf ("#define MAX_BITSIZE_MODE_ANY_MODE (%d*BITS_PER_UNIT)\n", mmax); +} + else +printf ("#define MAX_BITSIZE_MODE_ANY_MODE %d\n", + max_bitsize_mode_any_mode); } /* Emit mode_size_inline routine into insn-modes.h header. */
[committed] Use partial_subreg_p in curr_insn_transform
Use partial_subreg_p in code that was added since the initial patch that introduced this function. Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. Also tested by comparing the before and after assembly output for at least one target per CPU directory. Committed as obvious. Richard 2018-01-03 Richard Sandifordgcc/ * lra-constraints.c (curr_insn_transform): Use partial_subreg_p. Index: gcc/lra-constraints.c === --- gcc/lra-constraints.c 2018-01-03 07:18:30.797822724 + +++ gcc/lra-constraints.c 2018-01-03 09:34:11.747070247 + @@ -4243,8 +4243,7 @@ curr_insn_transform (bool check_only_p) || (simplify_subreg_regno (ira_class_hard_regs[goal_alt[i]][0], GET_MODE (reg), byte, mode) >= 0))) - || (GET_MODE_PRECISION (mode) - < GET_MODE_PRECISION (GET_MODE (reg)) + || (partial_subreg_p (mode, GET_MODE (reg)) && GET_MODE_SIZE (GET_MODE (reg)) <= UNITS_PER_WORD && WORD_REGISTER_OPERATIONS))) {
Re: [PATCH] Fix gcc.dg/vect-opt-info-1.c testcase
Jakub Jelinekwrites: > On Mon, Oct 23, 2017 at 06:26:12PM +0100, Richard Sandiford wrote: >> 2017-10-23 Richard Sandiford >> Alan Hayward >> David Sherwood > ... > >> --- /dev/null2017-10-21 08:51:42.385141415 +0100 >> +++ gcc/testsuite/gcc.dg/vect-opt-info-1.c 2017-10-23 >> 17:22:26.571498977 +0100 >> @@ -0,0 +1,11 @@ >> +/* { dg-options "-std=c99 -fopt-info -O3" } */ >> + >> +void >> +vadd (int *dst, int *op1, int *op2, int count) >> +{ >> + for (int i = 0; i < count; ++i) >> +dst[i] = op1[i] + op2[i]; >> +} >> + >> +/* { dg-message "loop vectorized" "" { target *-*-* } 6 } */ >> +/* { dg-message "loop versioned for vectorization because of possible >> aliasing" "" { target *-*-* } 6 } */ > > This testcase fails e.g. on i686-linux. The problem is > 1) it really should be at least guarded with > /* { dg-do compile { target vect_int } } */ > because on targets that can't vectorize even simple int operations > this will obviously fail Hmm, yeah. > 2) that won't help for i686 though, because we need -msse2 added > to options for it to work; that is normally added by > check_vect_support_and_set_flags > only when in vect.exp. If it was just that target, we could add > dg-additional-options, but I'm afraid many other targets add some options. > > The following works for me, calling it nodump-* ensures that > -fdump-tree-* isn't added, which I believe is essential for the testcase; Yeah, that's right, the bug was using dump_file when dump_enabled_p (), which would segfault when -fopt-info was passed and -fdump-tree-vect* wasn't. > tested on x86_64-linux with > RUNTESTFLAGS='--target_board=unix\{-m32,-m32/-mno-sse,-m64\} vect.exp=nodump*' > ok for trunk? > > Sadly I don't have your broken development version of the patch, so can't > verify it fails with the broken patch. Me neither any more, but it looks good to me FWIW. Thanks, Richard
[PATCH] minor tweak to complete strlen fix for PR83501
Prathamesh's fix restores the optimization for the test case reported in the bug (thanks!) but it isn't sufficient to bring GCC 8 completely up to par with 7. Prior GCC versions are able to compute the string length in the test case below but GCC 8 cannot. char d[8]; const char s[] = "1234567"; void f (void) { __builtin_strcpy (d, s); if (__builtin_strlen (d) != sizeof s - 1) __builtin_abort (); } The attached patch slightly tweaks Prathamesh's patch to also handle the case above. There still are outstanding cases where folding into MEM_REF defeats the strlen pass (I've raised a couple of new bugs to track those I noticed today) but, AFAICT, those aren't recent regressions so they can be dealt with later and separately. Martin PS I've renamed the function because I had initially thought it also needed to handle PHIs that refer to strings of the same length (and there isn't necessarily one "right" string to return) but then realized that the lack of that handling isn't part of the regression so I didn't include it but kept the new return type and name. When the pass is enhanced to handle PHIs and some of the other cases it doesn't handle the function won't need to change. PR tree-optimization/83501 - strlen(a) not folded after strcpy(a, ...) gcc/ChangeLog: PR tree-optimization/83501 * tree-ssa-strlen.c (get_string_cst): Rename... (get_strlen): ...to this. Handle global constants. (handle_char_store): Adjust. gcc/testsuite/ChangeLog: PR tree-optimization/83501 * gcc.dg/strlenopt-39.c: New test. Index: gcc/testsuite/gcc.dg/strlenopt-39.c === --- gcc/testsuite/gcc.dg/strlenopt-39.c (nonexistent) +++ gcc/testsuite/gcc.dg/strlenopt-39.c (working copy) @@ -0,0 +1,66 @@ +/* PR tree-optimization/83444 + { dg-do compile } + { dg-options "-O2 -fdump-tree-optimized" } */ + +#include "strlenopt.h" + +#define STR "1234567" + +const char str[] = STR; + +char dst[10]; + +void copy_from_global_str (void) +{ + strcpy (dst, str); + + if (strlen (dst) != sizeof str - 1) +abort (); +} + +void copy_from_local_str (void) +{ + const char s[] = STR; + + strcpy (dst, s); + + if (strlen (dst) != sizeof s - 1) +abort (); +} + +void copy_from_local_memstr (void) +{ + struct { +char s[sizeof STR]; + } x = { STR }; + + strcpy (dst, x.s); + + if (strlen (dst) != sizeof x.s - 1) +abort (); +} + +void copy_to_local_str (void) +{ + char d[sizeof STR]; + + strcpy (d, str); + + if (strlen (d) != sizeof str - 1) +abort (); +} + +void copy_to_local_memstr (void) +{ + struct { +char d[sizeof STR]; + } x; + + strcpy (x.d, str); + + if (strlen (x.d) != sizeof str- 1) +abort (); +} + +/* Verify that all calls to strlen have been eliminated. + { dg-final { scan-tree-dump-not "(abort|strlen) \\(\\)" "optimized" } } */ Index: gcc/tree-ssa-strlen.c === --- gcc/tree-ssa-strlen.c (revision 256180) +++ gcc/tree-ssa-strlen.c (working copy) @@ -2772,9 +2772,11 @@ handle_pointer_plus (gimple_stmt_iterator *gsi) } } -/* Check if RHS is string_cst possibly wrapped by mem_ref. */ -static tree -get_string_cst (tree rhs) +/* If RHS, either directly or indirectly, refers to a string of constant + length, return it. Otherwise return a negative value. */ + +static int +get_strlen (tree rhs) { if (TREE_CODE (rhs) == MEM_REF && integer_zerop (TREE_OPERAND (rhs, 1))) @@ -2784,7 +2786,17 @@ handle_pointer_plus (gimple_stmt_iterator *gsi) rhs = TREE_OPERAND (rhs, 0); } - return (TREE_CODE (rhs) == STRING_CST) ? rhs : NULL_TREE; + if (TREE_CODE (rhs) == VAR_DECL + && TREE_READONLY (rhs)) +rhs = DECL_INITIAL (rhs); + + if (rhs && TREE_CODE (rhs) == STRING_CST) +{ + unsigned HOST_WIDE_INT ilen = strlen (TREE_STRING_POINTER (rhs)); + return ilen <= INT_MAX ? ilen : -1; +} + + return -1; } /* Handle a single character store. */ @@ -2799,6 +2811,9 @@ handle_char_store (gimple_stmt_iterator *gsi) tree rhs = gimple_assign_rhs1 (stmt); unsigned HOST_WIDE_INT offset = 0; + /* Set to the length of the string being assigned if known. */ + int rhslen; + if (TREE_CODE (lhs) == MEM_REF && TREE_CODE (TREE_OPERAND (lhs, 0)) == SSA_NAME) { @@ -2942,19 +2957,18 @@ handle_char_store (gimple_stmt_iterator *gsi) } } else if (idx == 0 - && (rhs = get_string_cst (gimple_assign_rhs1 (stmt))) + && (rhslen = get_strlen (gimple_assign_rhs1 (stmt))) >= 0 && ssaname == NULL_TREE && TREE_CODE (TREE_TYPE (lhs)) == ARRAY_TYPE) { - size_t l = strlen (TREE_STRING_POINTER (rhs)); HOST_WIDE_INT a = int_size_in_bytes (TREE_TYPE (lhs)); - if (a > 0 && (unsigned HOST_WIDE_INT) a > l) + if (a > 0 && (unsigned HOST_WIDE_INT) a > (unsigned HOST_WIDE_INT) rhslen) { int idx = new_addr_stridx (lhs); if (idx != 0) { si =
Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64
On 01/03/2018 01:49 PM, Jeff Law wrote: > On 01/03/2018 01:36 PM, Jakub Jelinek wrote: >> On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote: Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? 2018-01-03 Jakub JelinekPR target/83641 * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp and add REG_CFA_ADJUST_CFA notes in that case to both insns. >>> I still question how useful this part is, but not enough to object given >> >> Reduces bloat in .eh_frame and when unwinding less work is needed. The >> patch isn't that large after all. >> >>> you've done the work. I'll go ahead and commit both as a single unit. >> >> BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and >> my patch, the only regression caused by your patch is: >> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi >> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi >> on i686-linux, quite understandably, that is exactly the case you are >> changing, there is %edi clobbered in the function and thus needs to be saved >> and restored and so no push/pop of %esi is needed. >> So, this testcase should be tweaked. > I think the ASM can just be dropped. It doesn't contribute anything to > what Florian was trying to show. Once we drop the ASM we should see > those probes return. I'll have to check that obviously. Here's the final patch -- only changes are smashing Jakub's and my work together, adding a comment and the stack-check-12 tweak mentioned above. Committing to the trunk. jeff diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8df8f232d80..7aa0920fc65 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,17 @@ +2017-01-03 Jakub Jelinek + Jeff Law + + PR target/83641 + * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For + noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, + only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp + and add REG_CFA_ADJUST_CFA notes in that case to both insns. + + PR target/83641 + * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): Do not + explicitly probe *sp in a noreturn function if there were any callee + register saves or frame pointer is needed. + 2018-01-03 Jakub Jelinek PR debug/83621 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 56baaa7e5e8..c363de93e02 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -12217,21 +12217,39 @@ ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size) pointer could be anywhere in the guard page. The safe thing to do is emit a probe now. + The probe can be avoided if we have already emitted any callee + register saves into the stack or have a frame pointer (which will + have been saved as well). Those saves will function as implicit + probes. + ?!? This should be revamped to work like aarch64 and s390 where we track the offset from the most recent probe. Normally that offset would be zero. For a noreturn function we would reset it to PROBE_INTERVAL - (STACK_BOUNDARY / BITS_PER_UNIT). Then we just probe when we cross PROBE_INTERVAL. */ - if (TREE_THIS_VOLATILE (cfun->decl)) + if (TREE_THIS_VOLATILE (cfun->decl) + && !(m->frame.nregs || m->frame.nsseregs || frame_pointer_needed)) { /* We can safely use any register here since we're just going to push its value and immediately pop it back. But we do try and avoid argument passing registers so as not to introduce dependencies in the pipeline. For 32 bit we use %esi and for 64 bit we use %rax. */ rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG); - rtx_insn *insn = emit_insn (gen_push (dummy_reg)); - RTX_FRAME_RELATED_P (insn) = 1; - ix86_emit_restore_reg_using_pop (dummy_reg); + rtx_insn *insn_push = emit_insn (gen_push (dummy_reg)); + rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg)); + m->fs.sp_offset -= UNITS_PER_WORD; + if (m->fs.cfa_reg == stack_pointer_rtx) + { + m->fs.cfa_offset -= UNITS_PER_WORD; + rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD); + x = gen_rtx_SET (stack_pointer_rtx, x); + add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x); + RTX_FRAME_RELATED_P (insn_push) = 1; + x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD); + x = gen_rtx_SET (stack_pointer_rtx, x); + add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x); + RTX_FRAME_RELATED_P (insn_pop) = 1; + }
[PATCH] Fix gcc.dg/vect-opt-info-1.c testcase
On Mon, Oct 23, 2017 at 06:26:12PM +0100, Richard Sandiford wrote: > 2017-10-23 Richard Sandiford> Alan Hayward > David Sherwood ... > --- /dev/null 2017-10-21 08:51:42.385141415 +0100 > +++ gcc/testsuite/gcc.dg/vect-opt-info-1.c2017-10-23 17:22:26.571498977 > +0100 > @@ -0,0 +1,11 @@ > +/* { dg-options "-std=c99 -fopt-info -O3" } */ > + > +void > +vadd (int *dst, int *op1, int *op2, int count) > +{ > + for (int i = 0; i < count; ++i) > +dst[i] = op1[i] + op2[i]; > +} > + > +/* { dg-message "loop vectorized" "" { target *-*-* } 6 } */ > +/* { dg-message "loop versioned for vectorization because of possible > aliasing" "" { target *-*-* } 6 } */ This testcase fails e.g. on i686-linux. The problem is 1) it really should be at least guarded with /* { dg-do compile { target vect_int } } */ because on targets that can't vectorize even simple int operations this will obviously fail 2) that won't help for i686 though, because we need -msse2 added to options for it to work; that is normally added by check_vect_support_and_set_flags only when in vect.exp. If it was just that target, we could add dg-additional-options, but I'm afraid many other targets add some options. The following works for me, calling it nodump-* ensures that -fdump-tree-* isn't added, which I believe is essential for the testcase; tested on x86_64-linux with RUNTESTFLAGS='--target_board=unix\{-m32,-m32/-mno-sse,-m64\} vect.exp=nodump*' ok for trunk? Sadly I don't have your broken development version of the patch, so can't verify it fails with the broken patch. 2018-01-03 Jakub Jelinek * gcc.dg/vect-opt-info-1.c: Moved to ... * gcc.dg/vect/nodump-vect-opt-info-1.c: ... here. Only run on vect_int targets, use dg-additional-options instead of dg-options and use relative line numbers instead of absolute. --- gcc/testsuite/gcc.dg/vect-opt-info-1.c.jj 2018-01-03 10:04:47.568412808 +0100 +++ gcc/testsuite/gcc.dg/vect-opt-info-1.c 2018-01-03 22:14:44.082848915 +0100 @@ -1,11 +0,0 @@ -/* { dg-options "-std=c99 -fopt-info -O3" } */ - -void -vadd (int *dst, int *op1, int *op2, int count) -{ - for (int i = 0; i < count; ++i) -dst[i] = op1[i] + op2[i]; -} - -/* { dg-message "loop vectorized" "" { target *-*-* } 6 } */ -/* { dg-message "loop versioned for vectorization because of possible aliasing" "" { target *-*-* } 6 } */ --- gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c.jj 2018-01-03 22:14:49.387852927 +0100 +++ gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c 2018-01-03 22:17:30.437974412 +0100 @@ -0,0 +1,11 @@ +/* { dg-do compile { target vect_int } } */ +/* { dg-additional-options "-std=c99 -fopt-info -O3" } */ + +void +vadd (int *dst, int *op1, int *op2, int count) +{ +/* { dg-message "loop vectorized" "" { target *-*-* } .+2 } */ +/* { dg-message "loop versioned for vectorization because of possible aliasing" "" { target *-*-* } .+1 } */ + for (int i = 0; i < count; ++i) +dst[i] = op1[i] + op2[i]; +} Jakub
[PATCH] Update crtl->has_bb_partition if NOTE_INSN_SWITCH_SECTIONS isn't emitted (PR debug/83585)
Hi! My recent dwarf2out.c, dbxout.c and rs6000.c uses of crtl->has_bb_partition all assume that if it is true then the function actually has 2 partitions and NOTE_INSN_SWITCH_SECTIONS is present, but as the following testcase shows, that isn't always the case, crtl->has_bb_partition is set during bbpart pass if there are 2 partitions, but there are many passes in between and all bbs of one of the partitions could be optimized away. Are there any uses of this flag that would be interested in whether the function had any partitions in the past (I guess that also means BB_PARTITION is not unspecified), rather than whether the function has 2 partitions right now? If yes, instead of this patch we could introduce a new flag in cfun set by insert_section_boundary_note. If not, the following patch should be sufficient. Bootstrapped/regtested on x86_64-linux and i686-linux. 2018-01-03 Jakub JelinekPR debug/83585 * bb-reorder.c (insert_section_boundary_note): Set has_bb_partition to switched_sections. * gcc.dg/pr83585.c: New test. --- gcc/bb-reorder.c.jj 2018-01-03 10:19:55.303533981 +0100 +++ gcc/bb-reorder.c2018-01-03 18:16:32.470555769 +0100 @@ -2523,6 +2523,11 @@ insert_section_boundary_note (void) current_partition = BB_PARTITION (bb); } } + + /* Make sure crtl->has_bb_partition matches reality even if bbpart finds + some hot and some cold basic blocks, but later one of those kinds is + optimized away. */ + crtl->has_bb_partition = switched_sections; } namespace { --- gcc/testsuite/gcc.dg/pr83585.c.jj 2018-01-03 18:20:33.685641817 +0100 +++ gcc/testsuite/gcc.dg/pr83585.c 2018-01-03 18:19:29.922619064 +0100 @@ -0,0 +1,18 @@ +/* PR debug/83585 */ +/* { dg-do assemble } */ +/* { dg-options "-std=gnu89 -O2 -g -fno-tree-dce -fno-guess-branch-probability" } */ + +int +foo (int x) +{ + int a, b; + for (a = 0; a < 2; ++a) +if (x != 0) + { +for (b = 0; b < 2; ++b) + ; +return 0; + } + + return; +} Jakub
Re: [PATCH] Fix ICE with statement frontiers (PR debug/83645)
On January 3, 2018 9:42:01 PM GMT+01:00, Jakub Jelinekwrote: >Hi! > >When var-tracking pass isn't done (e.g. with -O2 -g -fno-var-tracking >or -O2 -gstatement-frontiers), rest_of_handle_final calls >variable_tracking_main in order to perform delete_vta_debug_insns >- replace the debug marker insns with notes. The problem with that is >that final pass is after freeing cfg, so using >FOR_EACH_BB/FOR_BB_INSNS_SAFE >is incorrect as the attached patch shows, e.g. the barriers pass swaps >a >barrier with the statement notes and nothing updates the bb boundaries. > >Fixed by doing the replacements on the raw insn stream when called from >final.c. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for >trunk? OK. Richard. >2018-01-03 Jakub Jelinek > > PR debug/83645 > * var-tracking.c (delete_vta_debug_insn): New inline function. > (delete_vta_debug_insns): Add USE_CFG argument, if true, walk just > insns from get_insns () to NULL instead of each bb separately. > Use delete_vta_debug_insn. No longer static. > (vt_debug_insns_local, variable_tracking_main_1): Adjust > delete_vta_debug_insns callers. > * rtl.h (delete_vta_debug_insns): Declare. > * final.c (rest_of_handle_final): Call delete_vta_debug_insns > instead of variable_tracking_main. > > * gcc.dg/pr83645.c: New test. > >--- gcc/var-tracking.c.jj 2018-01-03 10:19:54.385533834 +0100 >+++ gcc/var-tracking.c 2018-01-03 16:01:48.608217538 +0100 >@@ -10271,11 +10271,40 @@ vt_initialize (void) > > static int debug_label_num = 1; > >+/* Remove from the insn stream a single debug insn used for >+ variable tracking at assignments. */ >+ >+static inline void >+delete_vta_debug_insn (rtx_insn *insn) >+{ >+ if (DEBUG_MARKER_INSN_P (insn)) >+{ >+ reemit_marker_as_note (insn); >+ return; >+} >+ >+ tree decl = INSN_VAR_LOCATION_DECL (insn); >+ if (TREE_CODE (decl) == LABEL_DECL >+ && DECL_NAME (decl) >+ && !DECL_RTL_SET_P (decl)) >+{ >+ PUT_CODE (insn, NOTE); >+ NOTE_KIND (insn) = NOTE_INSN_DELETED_DEBUG_LABEL; >+ NOTE_DELETED_LABEL_NAME (insn) >+ = IDENTIFIER_POINTER (DECL_NAME (decl)); >+ SET_DECL_RTL (decl, insn); >+ CODE_LABEL_NUMBER (insn) = debug_label_num++; >+} >+ else >+delete_insn (insn); >+} >+ > /* Remove from the insn stream all debug insns used for variable >- tracking at assignments. */ >+ tracking at assignments. USE_CFG should be false if the cfg is no >+ longer usable. */ > >-static void >-delete_vta_debug_insns (void) >+void >+delete_vta_debug_insns (bool use_cfg) > { > basic_block bb; > rtx_insn *insn, *next; >@@ -10283,33 +10312,20 @@ delete_vta_debug_insns (void) > if (!MAY_HAVE_DEBUG_INSNS) > return; > >- FOR_EACH_BB_FN (bb, cfun) >-{ >- FOR_BB_INSNS_SAFE (bb, insn, next) >+ if (use_cfg) >+FOR_EACH_BB_FN (bb, cfun) >+ { >+ FOR_BB_INSNS_SAFE (bb, insn, next) >+if (DEBUG_INSN_P (insn)) >+ delete_vta_debug_insn (insn); >+ } >+ else >+for (insn = get_insns (); insn; insn = next) >+ { >+ next = NEXT_INSN (insn); > if (DEBUG_INSN_P (insn)) >-{ >- if (DEBUG_MARKER_INSN_P (insn)) >-{ >- reemit_marker_as_note (insn); >- continue; >-} >- >- tree decl = INSN_VAR_LOCATION_DECL (insn); >- if (TREE_CODE (decl) == LABEL_DECL >- && DECL_NAME (decl) >- && !DECL_RTL_SET_P (decl)) >-{ >- PUT_CODE (insn, NOTE); >- NOTE_KIND (insn) = NOTE_INSN_DELETED_DEBUG_LABEL; >- NOTE_DELETED_LABEL_NAME (insn) >-= IDENTIFIER_POINTER (DECL_NAME (decl)); >- SET_DECL_RTL (decl, insn); >- CODE_LABEL_NUMBER (insn) = debug_label_num++; >-} >- else >-delete_insn (insn); >-} >-} >+delete_vta_debug_insn (insn); >+ } > } > > /* Run a fast, BB-local only version of var tracking, to take care of >@@ -10322,7 +10338,7 @@ static void > vt_debug_insns_local (bool skipped ATTRIBUTE_UNUSED) > { > /* ??? Just skip it all for now. */ >- delete_vta_debug_insns (); >+ delete_vta_debug_insns (true); > } > > /* Free the data structures needed for variable tracking. */ >@@ -10395,7 +10411,7 @@ variable_tracking_main_1 (void) >any pseudos at this point. */ > || targetm.no_register_allocation) > { >- delete_vta_debug_insns (); >+ delete_vta_debug_insns (true); > return 0; > } > >@@ -10423,7 +10439,7 @@ variable_tracking_main_1 (void) > { > vt_finalize (); > >- delete_vta_debug_insns (); >+ delete_vta_debug_insns (true); > > /* This is later restored by our caller. */ > flag_var_tracking_assignments = 0; >--- gcc/rtl.h.jj 2018-01-03 10:19:55.184533962 +0100 >+++ gcc/rtl.h
Re: [PATCH] Fix ICE during expand_debug_expr (PR debug/83621)
On January 3, 2018 9:46:14 PM GMT+01:00, Jakub Jelinekwrote: >Hi! > >The ICE here is when calling multiple_p with GET_MODE_BITSIZE >(BLKmode), >so SIGFPE because it is 0. BLKmode can leak into the debug stmts >through >generic vectors without HW support, but doing say (plus:BLK ...) etc. >just doesn't look like a valid RTL, so instead of just not trying to >subreg >it and not call multiple_p, this patch punts when unary/binary/ternary >ops have BLKmode. > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Richard. >2018-01-03 Jakub Jelinek > > PR debug/83621 > * cfgexpand.c (expand_debug_expr): Return NULL if mode is > BLKmode for ternary, binary or unary expressions. > > * gcc.dg/pr83621.c: New test. > >--- gcc/cfgexpand.c.jj 2018-01-03 10:19:54.0 +0100 >+++ gcc/cfgexpand.c2018-01-03 16:56:28.375179714 +0100 >@@ -4208,6 +4208,8 @@ expand_debug_expr (tree exp) > > binary: > case tcc_binary: >+ if (mode == BLKmode) >+ return NULL_RTX; > op1 = expand_debug_expr (TREE_OPERAND (exp, 1)); > if (!op1) > return NULL_RTX; >@@ -4232,6 +4234,8 @@ expand_debug_expr (tree exp) > > unary: > case tcc_unary: >+ if (mode == BLKmode) >+ return NULL_RTX; > inner_mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0))); > op0 = expand_debug_expr (TREE_OPERAND (exp, 0)); > if (!op0) >--- gcc/testsuite/gcc.dg/pr83621.c.jj 2018-01-03 17:01:21.244249712 >+0100 >+++ gcc/testsuite/gcc.dg/pr83621.c 2018-01-03 17:00:56.414243769 +0100 >@@ -0,0 +1,12 @@ >+/* PR debug/83621 */ >+/* { dg-do compile } */ >+/* { dg-options "-O -g" } */ >+ >+typedef int __attribute__ ((__vector_size__ (64))) V; >+V v; >+ >+void >+foo () >+{ >+ V u = v >> 1; >+} > > Jakub
Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64
On 01/03/2018 01:36 PM, Jakub Jelinek wrote: > On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote: >>> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? >>> >>> 2018-01-03 Jakub Jelinek>>> >>> PR target/83641 >>> * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For >>> noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, >>> only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp >>> and add REG_CFA_ADJUST_CFA notes in that case to both insns. >> I still question how useful this part is, but not enough to object given > > Reduces bloat in .eh_frame and when unwinding less work is needed. The > patch isn't that large after all. > >> you've done the work. I'll go ahead and commit both as a single unit. > > BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and > my patch, the only regression caused by your patch is: > +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi > +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi > on i686-linux, quite understandably, that is exactly the case you are > changing, there is %edi clobbered in the function and thus needs to be saved > and restored and so no push/pop of %esi is needed. > So, this testcase should be tweaked. I think the ASM can just be dropped. It doesn't contribute anything to what Florian was trying to show. Once we drop the ASM we should see those probes return. I'll have to check that obviously. jeff
[PATCH] Fix ICE during expand_debug_expr (PR debug/83621)
Hi! The ICE here is when calling multiple_p with GET_MODE_BITSIZE (BLKmode), so SIGFPE because it is 0. BLKmode can leak into the debug stmts through generic vectors without HW support, but doing say (plus:BLK ...) etc. just doesn't look like a valid RTL, so instead of just not trying to subreg it and not call multiple_p, this patch punts when unary/binary/ternary ops have BLKmode. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2018-01-03 Jakub JelinekPR debug/83621 * cfgexpand.c (expand_debug_expr): Return NULL if mode is BLKmode for ternary, binary or unary expressions. * gcc.dg/pr83621.c: New test. --- gcc/cfgexpand.c.jj 2018-01-03 10:19:54.0 +0100 +++ gcc/cfgexpand.c 2018-01-03 16:56:28.375179714 +0100 @@ -4208,6 +4208,8 @@ expand_debug_expr (tree exp) binary: case tcc_binary: + if (mode == BLKmode) + return NULL_RTX; op1 = expand_debug_expr (TREE_OPERAND (exp, 1)); if (!op1) return NULL_RTX; @@ -4232,6 +4234,8 @@ expand_debug_expr (tree exp) unary: case tcc_unary: + if (mode == BLKmode) + return NULL_RTX; inner_mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0))); op0 = expand_debug_expr (TREE_OPERAND (exp, 0)); if (!op0) --- gcc/testsuite/gcc.dg/pr83621.c.jj 2018-01-03 17:01:21.244249712 +0100 +++ gcc/testsuite/gcc.dg/pr83621.c 2018-01-03 17:00:56.414243769 +0100 @@ -0,0 +1,12 @@ +/* PR debug/83621 */ +/* { dg-do compile } */ +/* { dg-options "-O -g" } */ + +typedef int __attribute__ ((__vector_size__ (64))) V; +V v; + +void +foo () +{ + V u = v >> 1; +} Jakub
[PATCH] Fix ICE with statement frontiers (PR debug/83645)
Hi! When var-tracking pass isn't done (e.g. with -O2 -g -fno-var-tracking or -O2 -gstatement-frontiers), rest_of_handle_final calls variable_tracking_main in order to perform delete_vta_debug_insns - replace the debug marker insns with notes. The problem with that is that final pass is after freeing cfg, so using FOR_EACH_BB/FOR_BB_INSNS_SAFE is incorrect as the attached patch shows, e.g. the barriers pass swaps a barrier with the statement notes and nothing updates the bb boundaries. Fixed by doing the replacements on the raw insn stream when called from final.c. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2018-01-03 Jakub JelinekPR debug/83645 * var-tracking.c (delete_vta_debug_insn): New inline function. (delete_vta_debug_insns): Add USE_CFG argument, if true, walk just insns from get_insns () to NULL instead of each bb separately. Use delete_vta_debug_insn. No longer static. (vt_debug_insns_local, variable_tracking_main_1): Adjust delete_vta_debug_insns callers. * rtl.h (delete_vta_debug_insns): Declare. * final.c (rest_of_handle_final): Call delete_vta_debug_insns instead of variable_tracking_main. * gcc.dg/pr83645.c: New test. --- gcc/var-tracking.c.jj 2018-01-03 10:19:54.385533834 +0100 +++ gcc/var-tracking.c 2018-01-03 16:01:48.608217538 +0100 @@ -10271,11 +10271,40 @@ vt_initialize (void) static int debug_label_num = 1; +/* Remove from the insn stream a single debug insn used for + variable tracking at assignments. */ + +static inline void +delete_vta_debug_insn (rtx_insn *insn) +{ + if (DEBUG_MARKER_INSN_P (insn)) +{ + reemit_marker_as_note (insn); + return; +} + + tree decl = INSN_VAR_LOCATION_DECL (insn); + if (TREE_CODE (decl) == LABEL_DECL + && DECL_NAME (decl) + && !DECL_RTL_SET_P (decl)) +{ + PUT_CODE (insn, NOTE); + NOTE_KIND (insn) = NOTE_INSN_DELETED_DEBUG_LABEL; + NOTE_DELETED_LABEL_NAME (insn) + = IDENTIFIER_POINTER (DECL_NAME (decl)); + SET_DECL_RTL (decl, insn); + CODE_LABEL_NUMBER (insn) = debug_label_num++; +} + else +delete_insn (insn); +} + /* Remove from the insn stream all debug insns used for variable - tracking at assignments. */ + tracking at assignments. USE_CFG should be false if the cfg is no + longer usable. */ -static void -delete_vta_debug_insns (void) +void +delete_vta_debug_insns (bool use_cfg) { basic_block bb; rtx_insn *insn, *next; @@ -10283,33 +10312,20 @@ delete_vta_debug_insns (void) if (!MAY_HAVE_DEBUG_INSNS) return; - FOR_EACH_BB_FN (bb, cfun) -{ - FOR_BB_INSNS_SAFE (bb, insn, next) + if (use_cfg) +FOR_EACH_BB_FN (bb, cfun) + { + FOR_BB_INSNS_SAFE (bb, insn, next) + if (DEBUG_INSN_P (insn)) + delete_vta_debug_insn (insn); + } + else +for (insn = get_insns (); insn; insn = next) + { + next = NEXT_INSN (insn); if (DEBUG_INSN_P (insn)) - { - if (DEBUG_MARKER_INSN_P (insn)) - { - reemit_marker_as_note (insn); - continue; - } - - tree decl = INSN_VAR_LOCATION_DECL (insn); - if (TREE_CODE (decl) == LABEL_DECL - && DECL_NAME (decl) - && !DECL_RTL_SET_P (decl)) - { - PUT_CODE (insn, NOTE); - NOTE_KIND (insn) = NOTE_INSN_DELETED_DEBUG_LABEL; - NOTE_DELETED_LABEL_NAME (insn) - = IDENTIFIER_POINTER (DECL_NAME (decl)); - SET_DECL_RTL (decl, insn); - CODE_LABEL_NUMBER (insn) = debug_label_num++; - } - else - delete_insn (insn); - } -} + delete_vta_debug_insn (insn); + } } /* Run a fast, BB-local only version of var tracking, to take care of @@ -10322,7 +10338,7 @@ static void vt_debug_insns_local (bool skipped ATTRIBUTE_UNUSED) { /* ??? Just skip it all for now. */ - delete_vta_debug_insns (); + delete_vta_debug_insns (true); } /* Free the data structures needed for variable tracking. */ @@ -10395,7 +10411,7 @@ variable_tracking_main_1 (void) any pseudos at this point. */ || targetm.no_register_allocation) { - delete_vta_debug_insns (); + delete_vta_debug_insns (true); return 0; } @@ -10423,7 +10439,7 @@ variable_tracking_main_1 (void) { vt_finalize (); - delete_vta_debug_insns (); + delete_vta_debug_insns (true); /* This is later restored by our caller. */ flag_var_tracking_assignments = 0; --- gcc/rtl.h.jj2018-01-03 10:19:55.184533962 +0100 +++ gcc/rtl.h 2018-01-03 15:56:13.089080504 +0100 @@ -4254,6 +4254,7 @@ extern GTY(()) rtx stack_limit_rtx; /* In var-tracking.c */ extern unsigned int variable_tracking_main (void); +extern void
Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64
On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote: > > Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? > > > > 2018-01-03 Jakub Jelinek> > > > PR target/83641 > > * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For > > noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, > > only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp > > and add REG_CFA_ADJUST_CFA notes in that case to both insns. > I still question how useful this part is, but not enough to object given Reduces bloat in .eh_frame and when unwinding less work is needed. The patch isn't that large after all. > you've done the work. I'll go ahead and commit both as a single unit. BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and my patch, the only regression caused by your patch is: +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi on i686-linux, quite understandably, that is exactly the case you are changing, there is %edi clobbered in the function and thus needs to be saved and restored and so no push/pop of %esi is needed. So, this testcase should be tweaked. Jakub
Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64
On 01/03/2018 04:57 AM, Jakub Jelinek wrote: > On Tue, Jan 02, 2018 at 01:02:26PM -0700, Jeff Law wrote: >> It's fairly obvious that the probe of *sp isn't actually necessary here >> because the register saves in the prologue act as probe points for *sp. >> >> In fact, the only way this can ever cause problems is if %esi is used in >> the body in which case it would have been callee saved in the prologue. >> So if we detect that %esi is already callee saved in the prologue then >> we could eliminate the explicit probe of *sp. >> >> But we can do even better. If any register is saved in the prologue, >> then that callee register save functions as an implicit probe of *sp and >> we do not need to explicitly probe *sp. >> >> While this was reported with -m32, I'm pretty sure we can trigger a >> similar issue on x86_64. >> >> Bootstrapped and regression tested on x86_64. Also verified the >> testcase behavior on -m32. The test uses flags to hopefully ensure >> expected behavior on x86/Solaris, but I did not explicitly test that >> configuration. >> >> OK for the trunk? >> >> Jeff >> > > Missing > PR target/83641 > here. > >> * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not > > adjust as Florian reported. > >> explicitly probe *sp in a noreturn function if there were any callee >> register saves. > > " or frame pointer is needed" ? > >> >> * gcc.target/i386/stack-check-17.c: New test > > Missing full stop after New test > > This is a nice optimization, ok for trunk. > > I think we do not want to put anything into the CFI even if the condition > you've added doesn't trigger, that is a pure space and runtime cycle waste, > except for noting the sp change if it is the current CFA. Even after the > push the register value still holds the caller's value, and after the pop > too. ix86_emit_restore_reg_using_pop does a lot of stuff we IMNSHO don't > want to do, if dummy_reg happened to be a drap reg, we'd emit really weird > stuff, the cfa restore, etc. There are many other spots that gen_pop > themselves and do the appropriate thing at that spot, instead of > all using ix86_emit_restore_reg_using_pop. > > Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? > > 2018-01-03 Jakub Jelinek> > PR target/83641 > * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For > noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, > only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp > and add REG_CFA_ADJUST_CFA notes in that case to both insns. I still question how useful this part is, but not enough to object given you've done the work. I'll go ahead and commit both as a single unit. Jeff
Re: [v3 PATCH] Protect optional's deduction guide with the feature macro
On 03/01/18 16:34 +0200, Ville Voutilainen wrote: Tested partially on Linux-x64, finishing testing the full suite on Linux-PPC64. Ok for trunk? Yes, and gcc-7-branch please. Thanks.
Re: Fix Bug 83566 - cyl_bessel_j returns wrong result for x>1000 for high orders
Hi. On 01/02/2018 05:43 PM, Michele Pezzutti wrote: On 01/02/2018 02:28 AM, Ed Smith-Rowland wrote: I like the patch. I have a similar one in the tr29124 branch. Anyway, I got held up and I think it's good to have new folks looking into this. It looks good except that you need to uglify k. I looked at the GSL implementation, based on same reference, and their loop is cleaner. What about porting that implementation here? Possible? My implementation is also using one more term for P than for Q, which is discouraged in GSL, according to their comments. Ed, do you have any comment about this point? Regarding the 2nd line, after my observations, usually term stops contributing to P, Q before k > nu/2, so actually, an offset by one is most likely without consequences. GSL implementation is nevertheless more elegant. Also, keep in mind that these series are asymptotic - they'll eventually blow up. You should watch the magnitude of sequential terms and bail out of the loop if either term starts growing. Ed Yes, you are right. As nu increases, the multiplying '__term' gets larger, and the result will lose accuracy. This cannot be used for large nu. I have not been able to figure out how to detect when it starts to lose accuracy though, other than empirically. GSL has no checks on terms and uses this method only under certain conditions, on nu and x. Otherwise, they use other methods.
Re: [PATCH] PR 78534 Change character length from int to size_t
On Wed, Jan 3, 2018 at 8:34 PM, Bob Deenwrote: > On 12/29/17 5:31 AM, Janne Blomqvist wrote: >> >> In order to handle large character lengths on (L)LP64 targets, switch >> the GFortran character length from an int to a size_t. >> >> This is an ABI change, as procedures with character arguments take >> hidden arguments with the character length. > > > Did this change not make it into gcc 7 then? No, it caused regressions on big endian targets, and it was so late in the gcc 7 cycle that there was no time to fix them (at the time I didn't have access to a big endian target to test on, and getting said access took time), so I had to revert it. Those regressions have now been fixed, and the ABI has been broken anyway due to other changes, so I'm trying again for gcc 8. > I am one of those who still > use these hidden arguments for Fortran <-> C interfaces. Based on > discussions a year ago, I added this to my code: > > #if defined(__GNUC__) && (__GNUC__ > 6) > #include > #define FORSTR_STDARG_TYPE size_t > #else > #define FORSTR_STDARG_TYPE int > #endif > > I infer from this thread that I should change this to __GNUC__ > 7 now. Is > this still the correct/best way to determine the hidden argument size? Yes, I would say so. > (note that we're still using 4.x... ugh, don't ask... so the >6 check hasn't > actually been used yet, I just want to future-proof things as much as > possible without having to rewrite the entire Fortran <-> C interface.) > > Thanks... > > -Bob > > Bob Deen @ NASA-JPL Multmission Image Processing Lab > bob.d...@jpl.nasa.gov > -- Janne Blomqvist
[committed] Fix warning in gcc.dg/plugin/expensive_selftests_plugin.c with !CHECKING_P
On Tue, 2018-01-02 at 21:25 +0100, Andreas Schwab wrote: > /daten/gcc/gcc- > 20180101/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c:175 > :1: warning: no return statement in function returning non-void [- > Wreturn-type] > > Andreas. Thanks. I forgot to handle the --enable-checking=release case here; sorry. I've committed the following patch to trunk to fix this (as r256183), under the "obvious fixes" rule, borrowing the note from toplev::run_self_tests to give the dg-regexp something to look for for the !CHECKING_P case. Tested with and without --enable-checking=release. gcc/testsuite/ChangeLog: * gcc.dg/plugin/expensive-selftests-1.c: Update regexp to handle the !CHECKING_P case by expecting a note. * gcc.dg/plugin/expensive_selftests_plugin.c (plugin_init): Issue a note for the !CHECKING_P case, and move the return statement outside of #if CHECKING_P guard. --- gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c | 2 +- gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c b/gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c index e464117..64f168d 100644 --- a/gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c +++ b/gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c @@ -1,3 +1,3 @@ int not_empty; -/* { dg-regexp "expensive_selftests_plugin: .* pass\\(es\\) in .* seconds" } */ +/* { dg-regexp "expensive_selftests_plugin: .* pass\\(es\\) in .* seconds|not enabled in this build" } */ diff --git a/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c b/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c index 9470764..a7c6728 100644 --- a/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c +++ b/gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c @@ -170,6 +170,8 @@ plugin_init (struct plugin_name_args *plugin_info, PLUGIN_FINISH, selftest::expensive_tests, NULL); /* void *user_data */ - return 0; +#else + inform (UNKNOWN_LOCATION, "self-tests are not enabled in this build"); #endif /* #if CHECKING_P */ + return 0; } -- 1.8.5.3
Re: PR83648
On 01/02/2018 11:03 PM, Prathamesh Kulkarni wrote: > Hi, > malloc_candidate_p() in ipa-pure-const misses detecting that a > function is malloc-like if the return value is result of PHI and one > of the arguments of PHI is 0. > For example: > > void g(unsigned n) > { > return (n) ? __builtin_malloc (n) : 0; > } > > The reason is that the following check: > if (TREE_CODE (arg) != SSA_NAME) > DUMP_AND_RETURN ("phi arg is not SSA_NAME.") > > fails for arg with constant value 0 and malloc_candidate_p returns false. > The patch simply skips the arg if it equals null_pointer_node. > Does it look OK ? > > One concern I have is that with the patch, malloc_candidate_p will > return true if all the args to PHI are NULL: > retval = PHI<0, 0> > return retval > > However I expect that PHI with all 0 args would be constant folded to > 0 earlier, so this case shouldn't occur in practice ? > > Bootstrapped+tested on x86_64-unknown-linux-gnu. > Cross-testing on arm*-*-* and aarch64*-*-* in progress. A degenerate PHI should be folded/propagated, but you can't absolutely depend on it -- ie, you must handle it gracefully. I think the way to do that here is verify that at least one argument is an SSA_NAME. jeff
Re: [PATCH] PR 78534 Change character length from int to size_t
On 12/29/17 5:31 AM, Janne Blomqvist wrote: In order to handle large character lengths on (L)LP64 targets, switch the GFortran character length from an int to a size_t. This is an ABI change, as procedures with character arguments take hidden arguments with the character length. Did this change not make it into gcc 7 then? I am one of those who still use these hidden arguments for Fortran <-> C interfaces. Based on discussions a year ago, I added this to my code: #if defined(__GNUC__) && (__GNUC__ > 6) #include #define FORSTR_STDARG_TYPE size_t #else #define FORSTR_STDARG_TYPE int #endif I infer from this thread that I should change this to __GNUC__ > 7 now. Is this still the correct/best way to determine the hidden argument size? (note that we're still using 4.x... ugh, don't ask... so the >6 check hasn't actually been used yet, I just want to future-proof things as much as possible without having to rewrite the entire Fortran <-> C interface.) Thanks... -Bob Bob Deen @ NASA-JPL Multmission Image Processing Lab bob.d...@jpl.nasa.gov
[PATCH][committed][ PR middle-end/83654] Fix bogus probe in dynamic space when there are no residuals
There's a of sense of deja-vu on this BZ. There's a lot of similarities with an issue that came up earlier on aarch64 and its special handling of the outgoing argument area. Basically the residual of dynamic allocation may be a compile time constant or a runtime variant and anything in between. In the case where it is not a compile time constant, but has the value zero at runtime we must not probe the non-existent residual. So the way to address that is to emit the compare/branch around the probe when the residual is not a compile time constant (just like we do for the outgoing argument area on aarch64). I took the opportunity to make the residual handling and aarch64 outgoing argument space handling have the same core style. There's two tests. One is the original from Florian. It's a bit interesting in that at expansion time we do not see the allocation size or residual as a constant. So we emit a probing loop and residual handling during expansion. We check for that. Later optimizers come along and prove the residual handling isn't needed which we verify as well by checking the assembly output. In theory we should realize the loop iterates precisely once and remove the looping structure, but don't (I haven't investigated that missed-optimization). I've added a derived test where the size is not compile-time determinable. We verify there'll be loop and residual probing during expansion. We then verify the existence of both probe instructions *and* verify there's 3 conditionals (one guarding entrance to the probing loop, one for the probing loop backedge and one guarding the residual probe) in the assembly output. Bootstrapped and regression tested on x86_64. Verified the test works on both x86 and x86_64. Installing on the trunk. Jeff diff --git a/gcc/ChangeLog b/gcc/ChangeLog index d95c297e96a..f570eb4e606 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,11 @@ +2017-01-03 Jeff Law+ + PR middle-end/83654 + * explow.c (anti_adjust_stack_and_probe_stack_clash): Test a + non-constant residual for zero at runtime and avoid probing in + that case. Reorganize code for trailing problem to mirror handling + of the residual. + 2018-01-03 Prathamesh Kulkarni PR tree-optimization/83501 diff --git a/gcc/explow.c b/gcc/explow.c index b6c56602152..042e71904ec 100644 --- a/gcc/explow.c +++ b/gcc/explow.c @@ -1997,11 +1997,27 @@ anti_adjust_stack_and_probe_stack_clash (rtx size) if (residual != CONST0_RTX (Pmode)) { + rtx label = NULL_RTX; + /* RESIDUAL could be zero at runtime and in that case *sp could +hold live data. Furthermore, we do not want to probe into the +red zone. + +Go ahead and just guard the probe at *sp on RESIDUAL != 0 at +runtime if RESIDUAL is not a compile time constant. */ + if (!CONST_INT_P (residual)) + { + label = gen_label_rtx (); + emit_cmp_and_jump_insns (residual, CONST0_RTX (GET_MODE (residual)), + EQ, NULL_RTX, Pmode, 1, label); + } + rtx x = force_reg (Pmode, plus_constant (Pmode, residual, -GET_MODE_SIZE (word_mode))); anti_adjust_stack (residual); emit_stack_probe (gen_rtx_PLUS (Pmode, stack_pointer_rtx, x)); emit_insn (gen_blockage ()); + if (!CONST_INT_P (residual)) + emit_label (label); } /* Some targets make optimistic assumptions in their prologues about @@ -2014,28 +2030,20 @@ anti_adjust_stack_and_probe_stack_clash (rtx size) live data. Furthermore, we don't want to probe into the red zone. -Go ahead and just guard a probe at *sp on SIZE != 0 at runtime +Go ahead and just guard the probe at *sp on SIZE != 0 at runtime if SIZE is not a compile time constant. */ - - /* Ideally we would just probe at *sp. However, if SIZE is not -a compile-time constant, but is zero at runtime, then *sp -might hold live data. So probe at *sp if we know that -an allocation was made, otherwise probe into the red zone -which is obviously undesirable. */ - if (CONST_INT_P (size)) - { - emit_stack_probe (stack_pointer_rtx); - emit_insn (gen_blockage ()); - } - else + rtx label = NULL_RTX; + if (!CONST_INT_P (size)) { - rtx label = gen_label_rtx (); + label = gen_label_rtx (); emit_cmp_and_jump_insns (size, CONST0_RTX (GET_MODE (size)), EQ, NULL_RTX, Pmode, 1, label); - emit_stack_probe (stack_pointer_rtx); - emit_insn (gen_blockage ()); - emit_label (label); } + + emit_stack_probe (stack_pointer_rtx); + emit_insn (gen_blockage ()); + if (!CONST_INT_P (size)) + emit_label (label); } } diff --git
Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)
On Wed, Jan 03, 2018 at 11:45:55AM -0500, Nathan Sidwell wrote: > On 01/03/2018 11:31 AM, Marek Polacek wrote: > > Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR > > with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI. > > > > The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit. > > But that code now also uses poly_uint64 and I'm not sure if any of the > > constexpr.c code should use it, too. But this patch fixes the ICE. > > This doesn't look right to me, but it doesn't help that the test case > invokes UB. Hmm, like I said, I just followed fold_indirect_ref_1 so I was pretty confident that this is not a totally wrong thing to do ;). > > +typedef int V __attribute__ ((__vector_size__ (16))); > > +V a; > > + > > +int > > +main () > > +{ > > + reinterpret_cast ()[-1] += 1; > > +} > > one could make this code well formed with (I think) > > typedef int V __attribute__ ((__vector_size__ (16))); > V a[2]; > > int main () > { >return reinterpret_cast ([1])[-1]; > } > > That should access the final element of the a[0] vector. Unfortunately, this doesn't trigger the ICE. Marek
Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)
On Wed, Jan 03, 2018 at 05:40:32PM +, Richard Sandiford wrote: > Marek Polacekwrites: > > Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR > > with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI. > > > > The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit. > > But that code now also uses poly_uint64 and I'm not sure if any of the > > constexpr.c code should use it, too. > > The frontend isn't supposed to see any poly_ints (yet), so it's OK to > keep on using INTEGER_CST accessors like tree_to_shwi and tree_to_uhwi. Thanks, that's good to know. Marek
Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)
Marek Polacekwrites: > Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR > with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI. > > The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit. > But that code now also uses poly_uint64 and I'm not sure if any of the > constexpr.c code should use it, too. The frontend isn't supposed to see any poly_ints (yet), so it's OK to keep on using INTEGER_CST accessors like tree_to_shwi and tree_to_uhwi. Thanks, Richard
[PATCH 5/5][AArch64] fp16fml support
Hi All, This patch adds support for the FP16 multiply add/subtract instructions in Armv8.4-a. Support for the new instructions is in the form of new ACLE intrinsics. A new command line feature modifier, +fp16fml, is added to enable the support. Enabling +fp16fml automatically enables +fp16. Test cases were added to verify that the ACLE Intrinsics generate the appropriate FP16 multiply add/subtract assembly instructions. Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all instructions assembly correctly. Okay for trunk? 2017-11-10 Michael Collison* config/aarch64/aarch64-modes.def (V2HF): New VECTOR_MODE. * config/aarch64/aarch64-option-extension.def: Add AARCH64_OPT_EXTENSION of 'fp16fml'. * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): (__ARM_FEATURE_FP16_FML): Define if TARGET_F16FML is true. * config/aarch64/predicates.md (aarch64_lane_imm3): New predicate. * config/aarch64/constraints.md (Ui7): New constraint. * config/aarch64/iterators.md (VFMLA_W): New mode iterator. (VFMLA_SEL_W): Ditto. (f16quad): Ditto. (f16mac1): Ditto. (VFMLA16_LOW): New int iterator. (VFMLA16_HIGH): Ditto. (UNSPEC_FMLAL): New unspec. (UNSPEC_FMLSL): Ditto. (UNSPEC_FMLAL2): Ditto. (UNSPEC_FMLSL2): Ditto. (f16mac): New code attribute. * config/aarch64/aarch64-simd-builtins.def (aarch64_fmlal_lowv2sf): Ditto. (aarch64_fmlsl_lowv2sf): Ditto. (aarch64_fmlalq_lowv4sf): Ditto. (aarch64_fmlslq_lowv4sf): Ditto. (aarch64_fmlal_highv2sf): Ditto. (aarch64_fmlsl_highv2sf): Ditto. (aarch64_fmlalq_highv4sf): Ditto. (aarch64_fmlslq_highv4sf): Ditto. (aarch64_fmlal_lane_lowv2sf): Ditto. (aarch64_fmlsl_lane_lowv2sf): Ditto. (aarch64_fmlal_laneq_lowv2sf): Ditto. (aarch64_fmlsl_laneq_lowv2sf): Ditto. (aarch64_fmlalq_lane_lowv4sf): Ditto. (aarch64_fmlsl_lane_lowv4sf): Ditto. (aarch64_fmlalq_laneq_lowv4sf): Ditto. (aarch64_fmlsl_laneq_lowv4sf): Ditto. (aarch64_fmlal_lane_highv2sf): Ditto. (aarch64_fmlsl_lane_highv2sf): Ditto. (aarch64_fmlal_laneq_highv2sf): Ditto. (aarch64_fmlsl_laneq_highv2sf): Ditto. (aarch64_fmlalq_lane_highv4sf): Ditto. (aarch64_fmlsl_lane_highv4sf): Ditto. (aarch64_fmlalq_laneq_highv4sf): Ditto. (aarch64_fmlsl_laneq_highv4sf): Ditto. * config/aarch64/aarch64-simd.md: (aarch64_fmll_low): New pattern. (aarch64_fmll_high): Ditto. (aarch64_simd_fmll_low): Ditto. (aarch64_simd_fmll_high): Ditto. (aarch64_fmll_lane_lowv2sf): Ditto. (aarch64_fmll_lane_highv2sf): Ditto. (aarch64_simd_fmll_lane_lowv2sf): Ditto. (aarch64_simd_fmll_lane_highv2sf): Ditto. (aarch64_fmllq_laneq_lowv4sf): Ditto. (aarch64_fmllq_laneq_highv4sf): Ditto. (aarch64_simd_fmllq_laneq_lowv4sf): Ditto. (aarch64_simd_fmllq_laneq_highv4sf): Ditto. (aarch64_fmll_laneq_lowv2sf): Ditto. (aarch64_fmll_laneq_highv2sf): Ditto. (aarch64_simd_fmll_laneq_lowv2sf): Ditto. (aarch64_simd_fmll_laneq_highv2sf): Ditto. (aarch64_fmllq_lane_lowv4sf): Ditto. (aarch64_fmllq_lane_highv4sf): Ditto. (aarch64_simd_fmllq_lane_lowv4sf): Ditto. (aarch64_simd_fmllq_lane_highv4sf): Ditto. * config/aarch64/arm_neon.h (vfmlal_low_u32): New intrinsic. (vfmlsl_low_u32): Ditto. (vfmlalq_low_u32): Ditto. (vfmlslq_low_u32): Ditto. (vfmlal_high_u32): Ditto. (vfmlsl_high_u32): Ditto. (vfmlalq_high_u32): Ditto. (vfmlslq_high_u32): Ditto. (vfmlal_lane_low_u32): Ditto. (vfmlsl_lane_low_u32): Ditto. (vfmlal_laneq_low_u32): Ditto. (vfmlsl_laneq_low_u32): Ditto. (vfmlalq_lane_low_u32): Ditto. (vfmlslq_lane_low_u32): Ditto. (vfmlalq_laneq_low_u32): Ditto. (vfmlslq_laneq_low_u32): Ditto. (vfmlal_lane_high_u32): Ditto. (vfmlsl_lane_high_u32): Ditto. (vfmlal_laneq_high_u32): Ditto. (vfmlsl_laneq_high_u32): Ditto. (vfmlalq_lane_high_u32): Ditto. (vfmlslq_lane_high_u32): Ditto. (vfmlalq_laneq_high_u32): Ditto. (vfmlslq_laneq_high_u32): Ditto. * config/aarch64/aarch64.h (AARCH64_FL_F16SML): New flag. (AARCH64_FL_FOR_ARCH8_4): New. (AARCH64_ISA_F16FML): New ISA flag. (TARGET_F16FML): New feature flag for fp16fml. gcc.target/aarch64/fp16_fmul_high_1.c: New testcase. gcc.target/aarch64/fp16_fmul_high_2.c: New testcase. gcc.target/aarch64/fp16_fmul_high_3.c: New testcase. gcc.target/aarch64/fp16_fmul_high.h: New shared testcase. gcc.target/aarch64/fp16_fmul_lane_high_1.c: New testcase.
[PATCH 4/5][AArch64] Crypto sha512 and sha3
Hi All, This patch adds support for the SHA-512 and SHA-3 instructions added in Armv8.4-a. Support for the new instructions is in the form of new ACLE intrinsics. A new command line feature modifier, +sha3, is added to enable the support. Test cases were added to verify that the ACLE Intrinsics generate the appropriate SHA-512/SHA-3 assembly instructions. Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all instructions assembly correctly. Okay for trunk? 2017-11-10 Michael Collison* config/aarch64/aarch64-builtins.c: (aarch64_types_ternopu_imm_qualifiers, TYPES_TERNOPUI): New. * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): (__ARM_FEATURE_SHA3): Define if TARGET_SHA3 is true. * config/aarch64/aarch64.h (AARCH64_FL_SHA3): New flags. (AARCH64_ISA_SHA3): New ISA flag. (TARGET_SHA3): New feature flag for sha3. * config/aarch64/iterators.md (sha512_op): New int attribute. (CRYPTO_SHA512): New int iterator. (UNSPEC_SHA512H): New unspec. (UNSPEC_SHA512H2): Ditto. (UNSPEC_SHA512SU0): Ditto. (UNSPEC_SHA512SU1): Ditto. * config/aarch64/aarch64-simd-builtins.def (aarch64_crypto_sha512hqv2di): New builtin. (aarch64_crypto_sha512h2qv2di): Ditto. (aarch64_crypto_sha512su0qv2di): Ditto. (aarch64_crypto_sha512su1qv2di): Ditto. (aarch64_eor3qv8hi): Ditto. (aarch64_rax1qv2di): Ditto. (aarch64_xarqv2di): Ditto. (aarch64_bcaxqv8hi): Ditto. * config/aarch64/aarch64-simd.md: (aarch64_crypto_sha512hqv2di): New pattern. (aarch64_crypto_sha512su0qv2di): Ditto. (aarch64_crypto_sha512su1qv2di): Ditto. (aarch64_eor3qv8hi): Ditto. (aarch64_rax1qv2di): Ditto. (aarch64_xarqv2di): Ditto. (aarch64_bcaxqv8hi): Ditto. * config/aarch64/arm_neon.h (vsha512hq_u64): New intrinsic. (vsha512h2q_u64): Ditto. (vsha512su0q_u64): Ditto. (vsha512su1q_u64): Ditto. (veor3q_u16): Ditto. (vrax1q_u64): Ditto. (vxarq_u64): Ditto. (vbcaxq_u16): Ditto. * config/arm/types.md (crypto_sha512): New type attribute. (crypto_sha3): Ditto. (doc/invoke.texi): Document new sha3 option. gcc.target/aarch64/sha2.h: New shared testcase. gcc.target/aarch64/sha2_1.c: New testcase. gcc.target/aarch64/sha2_2.c: New testcase. gcc.target/aarch64/sha2_3.c: New testcase. gcc.target/aarch64/sha3.h: New shared testcase. gcc.target/aarch64/sha3_1.c: New testcase. gcc.target/aarch64/sha3_2.c: New testcase. gcc.target/aarch64/sha3_3.c: New testcase. crypto_sha512.patch Description: crypto_sha512.patch
[PATCH 3/5][AArch64] Crypto SM4 Support
Hi All, This patch adds support for the SM3/SM4 cryptographic instructions added in Armv8.4-a. Support for the new instructions is in the form of new ACLE intrinsics. A new command line feature modifier, +sm4, is added to enable the support. Test cases were added to verify that the ACLE Intrinsics generate the appropriate SM3/SM4 assembly instructions. Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all instructions assembly correctly. Okay for trunk? 2017-11-10 Michael Collison* config/aarch64/aarch64-builtins.c: (aarch64_types_quadopu_imm_qualifiers, TYPES_QUADOPUI): New. * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): (__ARM_FEATURE_SM3): Define if TARGET_SM4 is true. (__ARM_FEATURE_SM4): Define if TARGET_SM4 is true. * config/aarch64/aarch64.h (AARCH64_FL_SM4): New flags. (AARCH64_ISA_SM4): New ISA flag. (TARGET_SM4): New feature flag for sm4. * config/aarch64/aarch64-simd-builtins.def (aarch64_sm3ss1qv4si): Ditto. (aarch64_sm3tt1aq4si): Ditto. (aarch64_sm3tt1bq4si): Ditto. (aarch64_sm3tt2aq4si): Ditto. (aarch64_sm3tt2bq4si): Ditto. (aarch64_sm3partw1qv4si): Ditto. (aarch64_sm3partw2qv4si): Ditto. (aarch64_sm4eqv4si): Ditto. (aarch64_sm4ekeyqv4si): Ditto. * config/aarch64/aarch64-simd.md: (aarch64_sm3ss1qv4si): Ditto. (aarch64_sm3ttqv4si): Ditto. (aarch64_sm3partwqv4si): Ditto. (aarch64_sm4eqv4si): Ditto. (aarch64_sm4ekeyqv4si): Ditto. * config/aarch64/iterators.md (sm3tt_op): New int iterator. (sm3part_op): Ditto. (CRYPTO_SM3TT): Ditto. (CRYPTO_SM3PART): Ditto. (UNSPEC_SM3SS1): New unspec. (UNSPEC_SM3TT1A): Ditto. (UNSPEC_SM3TT1B): Ditto. (UNSPEC_SM3TT2A): Ditto. (UNSPEC_SM3TT2B): Ditto. (UNSPEC_SM3PARTW1): Ditto. (UNSPEC_SM3PARTW2): Ditto. (UNSPEC_SM4E): Ditto. (UNSPEC_SM4EKEY): Ditto. * config/aarch64/constraints.md (Ui2): New constraint. * config/aarch64/predicates.md (aarch64_imm2): New predicate. * config/arm/types.md (crypto_sm3): New type attribute. (crypto_sm4): Ditto. * config/aarch64/arm_neon.h (vsm3ss1q_u32): New intrinsic. (vsm3tt1aq_u32): Ditto. (vsm3tt1bq_u32): Ditto. (vsm3tt2aq_u32): Ditto. (vsm3tt2bq_u32): Ditto. (vsm3partw1q_u32): Ditto. (vsm3partw2q_u32): Ditto. (vsm4eq_u32): Ditto. (vsm4ekeyq_u32): Ditto. (doc/invoke.texi): Document new sm4 option. gcc.target/aarch64/sm3_sm4.c: New testcase. crypto_sm4.patch Description: crypto_sm4.patch
[PATCH 2/5][AArch64] Add v8.4 architecture
Hi all, This patch adds support for the Arm architecture v8.4. A new command line option, -march=armv8.4-a, is added as well as documentation. Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all instructions assembly correctly. 2017-11-10 Michael Collison* config/aarch64/aarch64-arches.def (armv8.4-a): New architecture. * config/aarch64/aarch64.h (AARCH64_ISA_V8_4): New ISA flag. (AARCH64_FL_FOR_ARCH8_4): New. (AARCH64_FL_V8_4): New flag. (doc/invoke.texi): Document new armv8.4-a option. v8_4_architecture.patch Description: v8_4_architecture.patch
[PATCH 1/5][AArch64] Crypto command line split
Hi all, This patch adds two new command line options for the legacy cryptographic extensions AES (+aes) and SHA-1/SHA-2 (+sha2). Backward compatibility is retained by modifying the +crypto feature modifier to enable +aes and +sha2. Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all instructions assembly correctly. 2017-11-10 Michael Collison* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): (__ARM_FEATURE_AES): Define if TARGET_AES is true. (__ARM_FEATURE_SHA2): Define if TARGET_SHA2 is true. * config/aarch64/aarch64-option-extension.def: Add AARCH64_OPT_EXTENSION of 'sha2'. (aes): Add AARCH64_OPT_EXTENSION of 'aes'. (crypto): Disable sha2 and aes if crypto disabled. (crypto): Enable aes and sha2 if enabled. (simd): Disable sha2 and aes if simd disabled. * config/aarch64/aarch64.h (AARCH64_FL_AES, AARCH64_FL_SHA2): New flags. (AARCH64_ISA_AES, AARCH64_ISA_SHA2): New ISA flags. (TARGET_SHA2): New feature flag for sha2. (TARGET_AES): New feature flag for aes. * config/aarch64/aarch64-simd.md: (aarch64_crypto_aesv16qi): Make pattern conditional on TARGET_AES. (aarch64_crypto_aesv16qi): Ditto. (aarch64_crypto_sha1hsi): Make pattern conditional on TARGET_SHA2. (aarch64_crypto_sha1hv4si): Ditto. (aarch64_be_crypto_sha1hv4si): Ditto. (aarch64_crypto_sha1su1v4si): Ditto. (aarch64_crypto_sha1v4si): Ditto. (aarch64_crypto_sha1su0v4si): Ditto. (aarch64_crypto_sha256hv4si): Ditto. (aarch64_crypto_sha256su0v4si): Ditto. (aarch64_crypto_sha256su1v4si): Ditto. (doc/invoke.texi): Document new aes and sha2 options. crypto_split.patch Description: crypto_split.patch
[PATCH 0/5][AArch64] ARMv8.4-A support
Hello, The ARMv8.4-A architecture builds on ARMv8.3-A and includes optional cryptographic extensions supporting SHA512, SHA3, SM3 and SM4. New FP16 multiply add/subtract instructions have been added that are mandatory in ARMv8.4-A and optional from ARMv8.2-A onward. Although the new cryptographic instructions are introduced in ARMv8.4-A, they may be optionally supported in any architecture implementation from ARMv8.2-A onward. This patch set adds support to GCC for the ARMv8.4-A architecture and adds new command line options to individually select the existing cryptographic SHA-1/SHA-256 and AES extensions. The existing +crypto option is retained for backward compatibility. New command line options are added for SHA512/SHA3, SM3/SM4 and the FP16 multiply add/subtract extensions. The cryptographic and FP16 multiply add/subtract instructions are exposed as new ACLE intrinsics. The patches in this series are: - Add new command line options for SHA-1/SHA-256 and AES including documentation - Add support for ARMv8.4-A - Add support for the cryptographic SM3/SM4 extension - Add support for the cryptographic SHA-512/SHA-3 extension - Add support for the FP16 multiply add/subtract extension
Re: C++ PATCH to fix ICE with vector expr folding (PR c++/83659)
On 01/03/2018 11:31 AM, Marek Polacek wrote: Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI. The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit. But that code now also uses poly_uint64 and I'm not sure if any of the constexpr.c code should use it, too. But this patch fixes the ICE. This doesn't look right to me, but it doesn't help that the test case invokes UB. +typedef int V __attribute__ ((__vector_size__ (16))); +V a; + +int +main () +{ + reinterpret_cast ()[-1] += 1; +} one could make this code well formed with (I think) typedef int V __attribute__ ((__vector_size__ (16))); V a[2]; int main () { return reinterpret_cast ([1])[-1]; } That should access the final element of the a[0] vector. nathan -- Nathan Sidwell
C++ PATCH to fix ICE with vector expr folding (PR c++/83659)
Here we are crashing because cxx_fold_indirect_ref got a POINTER_PLUS_EXPR with offset > signed HOST_WIDE_INT and we tried to convert it to sHWI. The matching code in fold_indirect_ref_1 uses uHWIs so I've followed suit. But that code now also uses poly_uint64 and I'm not sure if any of the constexpr.c code should use it, too. But this patch fixes the ICE. Bootstrapped/regtested on x86_64-linux, ok for trunk/7? 2018-01-03 Marek PolacekPR c++/83659 * constexpr.c (cxx_fold_indirect_ref): Use unsigned HOST_WIDE_INT when computing offsets. * g++.dg/torture/pr83659.C: New test. diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c index 1aeacd51810..cf7c994b381 100644 --- gcc/cp/constexpr.c +++ gcc/cp/constexpr.c @@ -3109,9 +3109,10 @@ cxx_fold_indirect_ref (location_t loc, tree type, tree op0, bool *empty_base) && (same_type_ignoring_top_level_qualifiers_p (type, TREE_TYPE (op00type { - HOST_WIDE_INT offset = tree_to_shwi (op01); + unsigned HOST_WIDE_INT offset = tree_to_uhwi (op01); tree part_width = TYPE_SIZE (type); - unsigned HOST_WIDE_INT part_widthi = tree_to_shwi (part_width)/BITS_PER_UNIT; + unsigned HOST_WIDE_INT part_widthi + = tree_to_uhwi (part_width) / BITS_PER_UNIT; unsigned HOST_WIDE_INT indexi = offset * BITS_PER_UNIT; tree index = bitsize_int (indexi); diff --git gcc/testsuite/g++.dg/torture/pr83659.C gcc/testsuite/g++.dg/torture/pr83659.C index e69de29bb2d..d9f709bb520 100644 --- gcc/testsuite/g++.dg/torture/pr83659.C +++ gcc/testsuite/g++.dg/torture/pr83659.C @@ -0,0 +1,11 @@ +// PR c++/83659 +// { dg-do compile } + +typedef int V __attribute__ ((__vector_size__ (16))); +V a; + +int +main () +{ + reinterpret_cast ()[-1] += 1; +} Marek
Re: update config.guess/sub
On 01/02/2018 09:24 PM, Ben Elliston wrote: > It's a new year, time to update these scripts. > > Ben > > > 2018-01-03 Ben Elliston> > * config.guess: Import latest version. > * config.sub: Likewise. Seems reasonable. I did a quick looksie and nothing looked unreasonable. jeff
Re: PR83501
On 2 January 2018 at 19:29, Richard Bienerwrote: > On Thu, Dec 28, 2017 at 8:42 AM, Prathamesh Kulkarni > wrote: >> On 21 December 2017 at 12:53, Prathamesh Kulkarni >> wrote: >>> Hi Jakub, >>> Based on your suggestions in PR83501, I have updated the patch to >>> check for integer_zerop for 2nd operand of mem_ref. >>> >>> With the patch, Warray-bounds started warning for the following test >>> in Warray-bounds-3.c in line 362 and thus I removed xfail on it: >>> TM (a5, "0123", ma.a5 + i, ma.a5); >>> >>> Does it look OK ? >>> Validation in progress. >> ping https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01415.html > > Ok. Thanks, I removed the Warray-bounds-3.c hunk since that was added in interim by Martin Sebor. Committed the attached version in r256180 after verifying bootstrap+test passes on x86_64-unknown-linux-gnu. Thanks, Prathamesh > > RIchard. > >> Thanks, >> Prathamesh >>> >>> Thanks, >>> Prathamesh diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c b/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c new file mode 100644 index 000..d8d3bf6039a --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +char a[4]; + +void f (void) +{ + __builtin_strcpy (a, "abc"); + + if (__builtin_strlen (a) != 3) +__builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not "__builtin_strlen" "strlen" } } */ diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index 8971c7df4f3..f1d4f6ef059 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -2769,6 +2769,21 @@ handle_pointer_plus (gimple_stmt_iterator *gsi) } } +/* Check if RHS is string_cst possibly wrapped by mem_ref. */ +static tree +get_string_cst (tree rhs) +{ + if (TREE_CODE (rhs) == MEM_REF + && integer_zerop (TREE_OPERAND (rhs, 1))) +{ + rhs = TREE_OPERAND (rhs, 0); + if (TREE_CODE (rhs) == ADDR_EXPR) + rhs = TREE_OPERAND (rhs, 0); +} + + return (TREE_CODE (rhs) == STRING_CST) ? rhs : NULL_TREE; +} + /* Handle a single character store. */ static bool @@ -2924,11 +2939,11 @@ handle_char_store (gimple_stmt_iterator *gsi) } } else if (idx == 0 - && TREE_CODE (gimple_assign_rhs1 (stmt)) == STRING_CST + && (rhs = get_string_cst (gimple_assign_rhs1 (stmt))) && ssaname == NULL_TREE && TREE_CODE (TREE_TYPE (lhs)) == ARRAY_TYPE) { - size_t l = strlen (TREE_STRING_POINTER (gimple_assign_rhs1 (stmt))); + size_t l = strlen (TREE_STRING_POINTER (rhs)); HOST_WIDE_INT a = int_size_in_bytes (TREE_TYPE (lhs)); if (a > 0 && (unsigned HOST_WIDE_INT) a > l) {
[AARCH64]Fix ldr_got_small and ldr_got_small_28k patterns to only allow DImode address.
Hi all, The only allowed addressing mode for aarch64 is DImode, AKA Pmode. ptr_mode could be SImode or DImode depending on the ABI used. This patch here fixes the addressing mode of two patterns as DImode. If any other mode is ever used, somewhere in the compiler might go wrong. aarch64-none-elf regression test Okay without regressions. Okay to commit? Regards, Renlin gcc/ChangeLog: 2018-01-03 Renlin Li* config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Make sure address is Pmode. * config/aarch64/aarch64.md (ldr_got_small): Fix address mode as DImode. (ldr_got_small_28k): Likewise. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 1da313f..9e9d2ea 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -1406,10 +1406,6 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm, rtx s = gen_rtx_SYMBOL_REF (Pmode, "_GLOBAL_OFFSET_TABLE_"); crtl->uses_pic_offset_table = 1; emit_move_insn (gp_rtx, gen_rtx_HIGH (Pmode, s)); - - if (mode != GET_MODE (gp_rtx)) - gp_rtx = gen_lowpart (mode, gp_rtx); - } if (mode == ptr_mode) @@ -1451,13 +1447,19 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm, rtx insn; rtx mem; - rtx tmp_reg = dest; + rtx tmp_reg; machine_mode mode = GET_MODE (dest); if (can_create_pseudo_p ()) - tmp_reg = gen_reg_rtx (mode); + tmp_reg = gen_reg_rtx (Pmode); + else + { + gcc_assert (HARD_REGISTER_P (dest)); + if (mode != Pmode) + tmp_reg = gen_rtx_REG (Pmode, REGNO (dest)); + } - emit_move_insn (tmp_reg, gen_rtx_HIGH (mode, imm)); + emit_move_insn (tmp_reg, gen_rtx_HIGH (Pmode, imm)); if (mode == ptr_mode) { if (mode == DImode) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index f1e2a07..a1f2d2d 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -5407,9 +5407,9 @@ (define_insn "ldr_got_small_" [(set (match_operand:PTR 0 "register_operand" "=r") - (unspec:PTR [(mem:PTR (lo_sum:PTR - (match_operand:PTR 1 "register_operand" "r") - (match_operand:PTR 2 "aarch64_valid_symref" "S")))] + (unspec:PTR [(mem:PTR (lo_sum:DI + (match_operand:DI 1 "register_operand" "r") + (match_operand:DI 2 "aarch64_valid_symref" "S")))] UNSPEC_GOTSMALLPIC))] "" "ldr\\t%0, [%1, #:got_lo12:%c2]" @@ -5430,9 +5430,9 @@ (define_insn "ldr_got_small_28k_" [(set (match_operand:PTR 0 "register_operand" "=r") - (unspec:PTR [(mem:PTR (lo_sum:PTR - (match_operand:PTR 1 "register_operand" "r") - (match_operand:PTR 2 "aarch64_valid_symref" "S")))] + (unspec:PTR [(mem:PTR (lo_sum:DI + (match_operand:DI 1 "register_operand" "r") + (match_operand:DI 2 "aarch64_valid_symref" "S")))] UNSPEC_GOTSMALLPIC28K))] "" "ldr\\t%0, [%1, #::%c2]"
[PR c++/83667] Fix tree_dump ICE
This fixes a tree dumping ICE involving static thunk fns. Copying the thunked-to fn's context suffices. nathan -- Nathan Sidwell 2018-01-03 Nathan SidwellPR c++/83667 * method.c (make_alias_for): Copy DECL_CONTEXT. PR c++/83667 * g++.dg/ipa/pr83667.C: New. Index: cp/method.c === --- cp/method.c (revision 256175) +++ cp/method.c (working copy) @@ -206,7 +206,7 @@ make_alias_for (tree target, tree newid) TREE_CODE (target), newid, TREE_TYPE (target)); DECL_LANG_SPECIFIC (alias) = DECL_LANG_SPECIFIC (target); cxx_dup_lang_specific_decl (alias); - DECL_CONTEXT (alias) = NULL; + DECL_CONTEXT (alias) = DECL_CONTEXT (target); TREE_READONLY (alias) = TREE_READONLY (target); TREE_THIS_VOLATILE (alias) = TREE_THIS_VOLATILE (target); TREE_PUBLIC (alias) = 0; Index: testsuite/g++.dg/ipa/pr83667.C === --- testsuite/g++.dg/ipa/pr83667.C (revision 0) +++ testsuite/g++.dg/ipa/pr83667.C (working copy) @@ -0,0 +1,23 @@ +/* { dg-options "-fdump-ipa-inline" } */ +// c++/83667 ICE dumping a static thunk + +struct a +{ + virtual ~a (); +}; + +struct b +{ + virtual void d (...); +}; + +struct c : a, b +{ + void d (...) + { + } +}; + +c c; + +// { dg-final { scan-ipa-dump "summary for void c::\\*.LTHUNK0" "inline" } }
Re: [PATCH] Add version to intermediate gcov file (PR gcov-profile/83669).
On 01/03/2018 08:25 AM, Martin Liška wrote: This is small enhancement reported by users of gcov tool. I'm aware of current stage of GCC, but it's really small change in code. Apart from that, a small fix to documentation is included. yeah, this is useful, thanks. nathan -- Nathan Sidwell
Revert DECL_USER_ALIGN part of r241959
r241959 included code to stop us increasing the alignment of a "user-aligned" variable. This wasn't the main purpose of the patch, and I think it was just there to make the testcase work. The documentation for the aligned attribute says: This attribute specifies a minimum alignment for the variable or structure field, measured in bytes. The DECL_USER_ALIGN code seemed to be treating as a sort of maximum instead, but there's not really such a thing as a maximum here: the variable might still end up at the start of a section that has a higher alignment, or might end up by chance on a "very aligned" boundary at link or load time. I think people who add alignment attributes want to ensure that accesses to that variable are fast, so it seems counter-intuitive for it to make the access slower. The vect-align-4.c test is an example of this: for targets with 128-bit vectors, we get better code without the aligned attribute than we do with it. Tested on aarch64-linux-gnu so far, will test more widely if OK. Thanks, Richard 2018-01-03 Richard Sandifordgcc/ * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Don't punt for user-aligned variables. gcc/testsuite/ * gcc.dg/vect/vect-align-4.c: New test. * gcc.dg/vect/vect-nb-iter-ub-2.c (cc): Remove alignment attribute and redefine as a structure with an unaligned member "b". (foo): Update accordingly. Index: gcc/tree-vect-data-refs.c === --- gcc/tree-vect-data-refs.c 2018-01-03 15:03:14.301330558 + +++ gcc/tree-vect-data-refs.c 2018-01-03 15:03:14.454324422 + @@ -920,19 +920,6 @@ vect_compute_data_ref_alignment (struct return true; } - if (DECL_USER_ALIGN (base)) - { - if (dump_enabled_p ()) - { - dump_printf_loc (MSG_NOTE, vect_location, - "not forcing alignment of user-aligned " - "variable: "); - dump_generic_expr (MSG_NOTE, TDF_SLIM, base); - dump_printf (MSG_NOTE, "\n"); - } - return true; - } - /* Force the alignment of the decl. NOTE: This is the only change to the code we make during the analysis phase, before deciding to vectorize the loop. */ Index: gcc/testsuite/gcc.dg/vect/vect-align-4.c === --- /dev/null 2018-01-03 08:32:43.873058927 + +++ gcc/testsuite/gcc.dg/vect/vect-align-4.c2018-01-03 15:03:14.453324462 + @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ + +__attribute__((aligned (8))) int a[2048] = {}; + +void +f1 (void) +{ + for (int i = 0; i < 2048; i++) +a[i]++; +} + +/* { dg-final { scan-tree-dump-not "Vectorizing an unaligned access" "vect" } } */ +/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */ Index: gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c === --- gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 2018-01-03 15:03:14.301330558 + +++ gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 2018-01-03 15:03:14.454324422 + @@ -3,18 +3,19 @@ #include "tree-vect.h" int ii[32]; -char cc[66] __attribute__((aligned(1))) = +struct { char a; char b[66]; } cc = { 0, { 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, -30, 0, 31, 0 }; +30, 0, 31, 0 } +}; void __attribute__((noinline,noclone)) foo (int s) { int i; for (i = 0; i < s; i++) - ii[i] = (int) cc[i*2]; + ii[i] = (int) cc.b[i*2]; } int main (int argc, const char **argv)
Re: [PATCH PR82439][simplify-rtx] Simplify (x | y) == x -> (y & ~x) == 0
Hi On 03/01/18 14:38, Segher Boessenkool wrote: Hi! On Wed, Jan 03, 2018 at 01:57:38PM +, Sudakshina Das wrote: This patch add support for the missing transformation of (x | y) == x -> (y & ~x) == 0. Testing done: Checked for regressions on bootstrapped aarch64-none-linux-gnu and arm-none-linux-gnueabihf and added new test cases. Is this ok for trunk? You forgot to include the patch :-) (facepalm) This is the second time this has happened to me! Sorry about this. Attaching now. Sudi Segher diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index e5cfd3d..17536d0 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -5032,34 +5032,38 @@ simplify_relational_operation_1 (enum rtx_code code, machine_mode mode, simplify_gen_binary (XOR, cmp_mode, XEXP (op0, 1), op1)); - /* (eq/ne (and x y) x) simplifies to (eq/ne (and (not y) x) 0), which - can be implemented with a BICS instruction on some targets, or - constant-folded if y is a constant. */ + /* Simplify eq/ne (and/ior x y) x/y) for targets with a BICS instruction or + constant folding if x/y is a constant. */ if ((code == EQ || code == NE) - && op0code == AND - && rtx_equal_p (XEXP (op0, 0), op1) + && (op0code == AND || op0code == IOR) && !side_effects_p (op1) && op1 != CONST0_RTX (cmp_mode)) { - rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1), cmp_mode); - rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0)); + /* Both (eq/ne (and x y) x) and (eq/ne (ior x y) y) simplify to + (eq/ne (and (not y) x) 0). */ + if ((op0code == AND && rtx_equal_p (XEXP (op0, 0), op1)) + || (op0code == IOR && rtx_equal_p (XEXP (op0, 1), op1))) + { + rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1), + cmp_mode); + rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0)); - return simplify_gen_relational (code, mode, cmp_mode, lhs, - CONST0_RTX (cmp_mode)); -} + return simplify_gen_relational (code, mode, cmp_mode, lhs, + CONST0_RTX (cmp_mode)); + } - /* Likewise for (eq/ne (and x y) y). */ - if ((code == EQ || code == NE) - && op0code == AND - && rtx_equal_p (XEXP (op0, 1), op1) - && !side_effects_p (op1) - && op1 != CONST0_RTX (cmp_mode)) -{ - rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0), cmp_mode); - rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1)); + /* Both (eq/ne (and x y) y) and (eq/ne (ior x y) x) simplify to + (eq/ne (and (not x) y) 0). */ + if ((op0code == AND && rtx_equal_p (XEXP (op0, 1), op1)) + || (op0code == IOR && rtx_equal_p (XEXP (op0, 0), op1))) + { + rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0), + cmp_mode); + rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1)); - return simplify_gen_relational (code, mode, cmp_mode, lhs, - CONST0_RTX (cmp_mode)); + return simplify_gen_relational (code, mode, cmp_mode, lhs, + CONST0_RTX (cmp_mode)); + } } /* (eq/ne (bswap x) C1) simplifies to (eq/ne x C2) with C2 swapped. */ diff --git a/gcc/testsuite/gcc.target/aarch64/bics_5.c b/gcc/testsuite/gcc.target/aarch64/bics_5.c new file mode 100644 index 000..b9c2c40 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/bics_5.c @@ -0,0 +1,86 @@ +/* { dg-do run } */ +/* { dg-options "-O2 --save-temps -fno-inline" } */ + +extern void abort (void); + +int +bics_si_test1 (int a, int b, int c) +{ + if ((a | b) == a) +return a; + else +return c; +} + +int +bics_si_test2 (int a, int b, int c) +{ + if ((a | b) == b) +return b; + else +return c; +} + +typedef long long s64; + +s64 +bics_di_test1 (s64 a, s64 b, s64 c) +{ + if ((a | b) == a) +return a; + else +return c; +} + +s64 +bics_di_test2 (s64 a, s64 b, s64 c) +{ + if ((a | b) == b) +return b; + else +return c; +} + +int +main () +{ + int x; + s64 y; + + x = bics_si_test1 (0xf00d, 0xf11f, 0); + if (x != 0) +abort (); + + x = bics_si_test1 (0xf11f, 0xf00d, 0); + if (x != 0xf11f) +abort (); + + x = bics_si_test2 (0xf00d, 0xf11f, 0); + if (x != 0xf11f) +abort (); + + x = bics_si_test2 (0xf11f, 0xf00d, 0); + if (x != 0) +abort (); + + y = bics_di_test1 (0x10001000f00dll, 0x12341000f00dll, 0ll); + if (y != 0) +abort (); + + y = bics_di_test1 (0x12341000f00dll, 0x10001000f00dll, 0ll); + if (y != 0x12341000f00dll) +abort (); + + y = bics_di_test2 (0x10001000f00dll, 0x12341000f00dll, 0ll); + if (y != 0x12341000f00dll) +abort (); + + y = bics_di_test2 (0x12341000f00dll, 0x10001000f00dll, 0ll); + if (y != 0) +abort (); + + return 0; +} + +/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+" 2 } } */ +/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/bics_5.c
Re: [PATCH PR82439][simplify-rtx] Simplify (x | y) == x -> (y & ~x) == 0
Hi! On Wed, Jan 03, 2018 at 01:57:38PM +, Sudakshina Das wrote: > This patch add support for the missing transformation of (x | y) == x -> > (y & ~x) == 0. > Testing done: Checked for regressions on bootstrapped > aarch64-none-linux-gnu and arm-none-linux-gnueabihf and added new test > cases. > Is this ok for trunk? You forgot to include the patch :-) Segher
[v3 PATCH] Protect optional's deduction guide with the feature macro
Tested partially on Linux-x64, finishing testing the full suite on Linux-PPC64. Ok for trunk? 2018-01-03 Ville VoutilainenProtect optional's deduction guide with the feature macro * include/std/optional: Use the feature macro. diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional index e017eed..f7c72b5 100644 --- a/libstdc++-v3/include/std/optional +++ b/libstdc++-v3/include/std/optional @@ -1038,7 +1038,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// @} +#if __cpp_deduction_guides >= 201606 template optional(_Tp) -> optional<_Tp>; +#endif _GLIBCXX_END_NAMESPACE_VERSION } // namespace std
Re: [C++ PATCH] Improve code generation for static_cast of pointers to reference types (PR c++/83555)
On Wed, Jan 03, 2018 at 02:02:16PM +0100, Jakub Jelinek wrote: > Though, the above snippet reminds me I should probably also replace: > + expr = build_base_path (MINUS_EXPR, expr, base, > + /*nonnull=*/!sanitize_null_p, complain); > with: > + expr = build_base_path (MINUS_EXPR, expr, base, > + /*nonnull=*/(!sanitize_null_p > +&& flag_delete_null_pointer_checks), > + complain); > > Instead of checking for generated asm without sanitization, I think it will > be easier/more portable to verify no conditionals in *.optimized dump. > Let me repost the updated patch. Here is an updated patch (which just uses flag_delete_null_pointer_checks, because -fsanitize=null disables that option) and with a testcase that makes sure there is no conditional testing b != NULL. Queued for bootstrap/regtest. 2018-01-03 Jakub JelinekPR c++/83555 * typeck.c (build_static_cast_1): For static casts to reference types, call build_base_path with flag_delete_null_pointer_checks as nonnull instead of always false. When -fsanitize=null, call ubsan_maybe_instrument_reference on the NULL reference INTEGER_CST. * cp-gimplify.c (cp_genericize_r): Don't walk subtrees of UBSAN_NULL call if the first argument is INTEGER_CST with REFERENCE_TYPE. * g++.dg/opt/pr83555.C: New test. * g++.dg/ubsan/pr83555.C: New test. --- gcc/cp/typeck.c.jj 2018-01-03 10:20:17.457537524 +0100 +++ gcc/cp/typeck.c 2018-01-03 14:28:46.189830697 +0100 @@ -6943,8 +6943,11 @@ build_static_cast_1 (tree type, tree exp } /* Convert from "B*" to "D*". This function will check that "B" -is not a virtual base of "D". */ - expr = build_base_path (MINUS_EXPR, expr, base, /*nonnull=*/false, +is not a virtual base of "D". Even if we don't have a guarantee +that expr is NULL, if the static_cast is to a reference type, +it is UB if it would be NULL, so omit the non-NULL check. */ + expr = build_base_path (MINUS_EXPR, expr, base, + /*nonnull=*/flag_delete_null_pointer_checks, complain); /* Convert the pointer to a reference -- but then remember that @@ -6955,7 +6958,18 @@ build_static_cast_1 (tree type, tree exp is a variable with the same type, the conversion would get folded away, leaving just the variable and causing lvalue_kind to give the wrong answer. */ - return convert_from_reference (rvalue (cp_fold_convert (type, expr))); + expr = cp_fold_convert (type, expr); + + /* When -fsanitize=null, make sure to diagnose reference binding to +NULL even when the reference is converted to pointer later on. */ + if (sanitize_flags_p (SANITIZE_NULL) + && TREE_CODE (expr) == COND_EXPR + && TREE_OPERAND (expr, 2) + && TREE_CODE (TREE_OPERAND (expr, 2)) == INTEGER_CST + && TREE_TYPE (TREE_OPERAND (expr, 2)) == type) + ubsan_maybe_instrument_reference (_OPERAND (expr, 2)); + + return convert_from_reference (rvalue (expr)); } /* "A glvalue of type cv1 T1 can be cast to type rvalue reference to --- gcc/cp/cp-gimplify.c.jj 2018-01-03 13:15:42.799817380 +0100 +++ gcc/cp/cp-gimplify.c2018-01-03 14:27:39.383797996 +0100 @@ -1506,6 +1506,12 @@ cp_genericize_r (tree *stmt_p, int *walk if (sanitize_flags_p (SANITIZE_VPTR) && !is_ctor) cp_ubsan_maybe_instrument_member_call (stmt); } + else if (fn == NULL_TREE + && CALL_EXPR_IFN (stmt) == IFN_UBSAN_NULL + && TREE_CODE (CALL_EXPR_ARG (stmt, 0)) == INTEGER_CST + && (TREE_CODE (TREE_TYPE (CALL_EXPR_ARG (stmt, 0))) + == REFERENCE_TYPE)) + *walk_subtrees = 0; } break; --- gcc/testsuite/g++.dg/opt/pr83555.C.jj 2018-01-03 15:17:56.294130008 +0100 +++ gcc/testsuite/g++.dg/opt/pr83555.C 2018-01-03 15:19:06.764162730 +0100 @@ -0,0 +1,15 @@ +// PR c++/83555 +// { dg-do compile } +// { dg-options "-O2 -fdump-tree-optimized -fdelete-null-pointer-checks" } + +struct A { int a; }; +struct B { int b; }; +struct C : A, B { int c; }; + +C * +foo (B *b) +{ + return _cast(*b); +} + +// { dg-final { scan-tree-dump-not "if \\(b_\[0-9]*\\(D\\) .= 0" "optimized" } } --- gcc/testsuite/g++.dg/ubsan/pr83555.C.jj 2018-01-03 14:27:39.383797996 +0100 +++ gcc/testsuite/g++.dg/ubsan/pr83555.C2018-01-03 14:27:39.383797996 +0100 @@ -0,0 +1,40 @@ +// PR c++/83555 +// { dg-do run } +// { dg-options "-fsanitize=null" } +// { dg-output ":25:\[^\n\r]*reference binding to null pointer of type 'struct C'" } + +struct A { int a; }; +struct B { int b; }; +struct C : A, B { int c; }; + +__attribute__((noipa)) C * +foo (B *b) +{ + return
Re: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).
On 01/03/2018 02:40 PM, Jan Hubicka wrote: + if (!original->in_same_comdat_group_p (alias)) +{ + if (dump_file) + fprintf (dump_file, "Not unifying; alias cannot be created; " + "across comdat group boundary\n\n"); + + return false; +} >>> >>> Wasn't we supposed to do the wrapper in this case? >>> >>> Honza >>> >> >> We attempt to do a wrapper, but even with wrapper we cannot introduce such >> call >> crossing the boundary. Proper message should be probably: >> >> "Not unifying; alias nor wrapper cannot be created; across comdat group >> boundary" > > If the symbol we are calling is one exported from comdat, it shoud work fine > as long as we produce gimple thunk and it the symbol comdat exports. > Of course producing call from one comdat to another may close comdat loop > (which we handle elsewhere, so i assume original is comdat and alias is not) > > I see in the PR is the split part of comdat which is comdat local, so perhaps > we want to check that original is not comdat local in addition to your current > check? Ok, you understand problematic of comdats more than me. I will test and install patch with the check added. Thanks, Martin > > Honza >> >> Martin >From 707e88333778b306f66c1bf683838b1a7bb4f737 Mon Sep 17 00:00:00 2001 From: marxinDate: Wed, 3 Jan 2018 11:10:18 +0100 Subject: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352). gcc/ChangeLog: 2018-01-03 Martin Liska PR ipa/82352 * ipa-icf.c (sem_function::merge): Do not cross comdat boundary. gcc/testsuite/ChangeLog: 2018-01-03 Martin Liska PR ipa/82352 * g++.dg/ipa/pr82352.C: New test. --- gcc/ipa-icf.c | 11 + gcc/testsuite/g++.dg/ipa/pr82352.C | 93 ++ 2 files changed, 104 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ipa/pr82352.C diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c index a8d3b800318..e17c4292392 100644 --- a/gcc/ipa-icf.c +++ b/gcc/ipa-icf.c @@ -1113,6 +1113,17 @@ sem_function::merge (sem_item *alias_item) return false; } + if (!original->in_same_comdat_group_p (alias) + || original->comdat_local_p ()) +{ + if (dump_file) + fprintf (dump_file, + "Not unifying; alias nor wrapper cannot be created; " + "across comdat group boundary\n\n"); + + return false; +} + /* See if original is in a section that can be discarded if the main symbol is not used. */ diff --git a/gcc/testsuite/g++.dg/ipa/pr82352.C b/gcc/testsuite/g++.dg/ipa/pr82352.C new file mode 100644 index 000..c044345a486 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr82352.C @@ -0,0 +1,93 @@ +// PR ipa/82352 +// { dg-do compile } +// { dg-options "-O2" } + +typedef long unsigned int size_t; + +class A +{ +public : + typedef enum { Zero = 0, One = 1 } tA; + A(tA a) { m_a = a; } + +private : + tA m_a; +}; + +class B +{ +public : + void *operator new(size_t t) { return (void*)(42); }; +}; + +class C +{ +public: + virtual void () = 0; +}; + +class D +{ + public : + virtual void g() = 0; + virtual void h() = 0; +}; + +template class : public T, public D +{ +public: + void () + { + if (!m_i2) throw A(A::One); + }; + + void h() + { + if (m_i2) throw A(A::Zero); + } + +protected: + virtual void g() + { + if (m_i1 !=0) throw A(A::Zero); + }; + +private : + int m_i1; + void *m_i2; +}; + +class E +{ +private: +size_t m_e; +static const size_t Max; + +public: +E& i(size_t a, size_t b, size_t c) +{ +if ((a > Max) || (c > Max)) throw A(A::Zero ); +if (a + b > m_e) throw A(A::One ); +return (*this); +} + + inline E& j(const E ) +{ + return i(0,0,s.m_e); +} +}; + +class F : public C { }; +class G : public C { }; +class : public B, public F, public G { }; + +void k() +{ +new (); +} + +void l() +{ + E e1, e2; + e1.j(e2); +} -- 2.14.3
Re: [PATCH] Do not inline variadic thunks (PR ipa/83549).
On 01/03/2018 02:41 PM, Jan Hubicka wrote: >> Hi. >> >> As mentioned in the PR, we should bail out inlining of thunks with variadic >> arguments. It's problematic for cgraph_node::expand_thunk function that >> does not support variadic functions. >> >> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. >> >> Ready to be installed? >> Martin >> >> gcc/ChangeLog: >> >> 2018-01-03 Martin Liska>> >> PR ipa/83549 >> * ipa-fnsummary.c (compute_fn_summary): Do not inline variadic >> thunks. >> >> gcc/testsuite/ChangeLog: >> >> 2018-01-03 Martin Liska >> >> PR ipa/83549 >> * g++.dg/ipa/pr83549.C: New test. > > OK, but please introduce new CIF_CODE for this case. MISMATCHED arguments > is a kitchen sink for various issues and it is very hard to analyze what > happened when it triggers. Fully agree, I'm attaching patch that I'll commit soon. Martin > > Thanks, > Honza >> --- >> gcc/ipa-fnsummary.c| 5 + >> gcc/testsuite/g++.dg/ipa/pr83549.C | 8 >> 2 files changed, 13 insertions(+) >> create mode 100644 gcc/testsuite/g++.dg/ipa/pr83549.C >> >> > >> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c >> index 94150312105..274bd8c6758 100644 >> --- a/gcc/ipa-fnsummary.c >> +++ b/gcc/ipa-fnsummary.c >> @@ -2422,6 +2422,11 @@ compute_fn_summary (struct cgraph_node *node, bool >> early) >>info->inlinable = false; >>node->callees->inline_failed = CIF_CHKP; >> } >> + else if (stdarg_p (TREE_TYPE (node->decl))) >> +{ >> + info->inlinable = false; >> + node->callees->inline_failed = CIF_MISMATCHED_ARGUMENTS; >> +} >>else >> info->inlinable = true; >> } >> diff --git a/gcc/testsuite/g++.dg/ipa/pr83549.C >> b/gcc/testsuite/g++.dg/ipa/pr83549.C >> new file mode 100644 >> index 000..90cf8fe7e0d >> --- /dev/null >> +++ b/gcc/testsuite/g++.dg/ipa/pr83549.C >> @@ -0,0 +1,8 @@ >> +// PR ipa/83549 >> +// { dg-do compile } >> +// { dg-options "-O2" } >> + >> +struct A { virtual ~A (); }; >> +struct B { virtual void foo (...); }; >> +struct C : A, B { void foo (...) {} }; >> +C c; >> > >From d890dbc95bdc57d13f26759888ef48f4b492f53e Mon Sep 17 00:00:00 2001 From: marxin Date: Wed, 3 Jan 2018 13:23:46 +0100 Subject: [PATCH] Do not inline variadic thunks (PR ipa/83549). gcc/ChangeLog: 2018-01-03 Martin Liska PR ipa/83549 * cif-code.def (VARIADIC_THUNK): New enum value. * ipa-fnsummary.c (compute_fn_summary): Do not inline variadic thunks. gcc/testsuite/ChangeLog: 2018-01-03 Martin Liska PR ipa/83549 * g++.dg/ipa/pr83549.C: New test. --- gcc/cif-code.def | 4 gcc/ipa-fnsummary.c| 5 + gcc/testsuite/g++.dg/ipa/pr83549.C | 8 3 files changed, 17 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ipa/pr83549.C diff --git a/gcc/cif-code.def b/gcc/cif-code.def index 92d81d30a49..a587e712fdf 100644 --- a/gcc/cif-code.def +++ b/gcc/cif-code.def @@ -95,6 +95,10 @@ DEFCIFCODE(MISMATCHED_ARGUMENTS, CIF_FINAL_ERROR, DEFCIFCODE(LTO_MISMATCHED_DECLARATIONS, CIF_FINAL_ERROR, N_("mismatched declarations during linktime optimization")) +/* Caller is variadic thunk. */ +DEFCIFCODE(VARIADIC_THUNK, CIF_FINAL_ERROR, + N_("variadic thunk call")) + /* Call was originally indirect. */ DEFCIFCODE(ORIGINALLY_INDIRECT_CALL, CIF_FINAL_NORMAL, N_("originally indirect function call not considered for inlining")) diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 94150312105..9b1b7daca2e 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -2422,6 +2422,11 @@ compute_fn_summary (struct cgraph_node *node, bool early) info->inlinable = false; node->callees->inline_failed = CIF_CHKP; } + else if (stdarg_p (TREE_TYPE (node->decl))) + { + info->inlinable = false; + node->callees->inline_failed = CIF_VARIADIC_THUNK; + } else info->inlinable = true; } diff --git a/gcc/testsuite/g++.dg/ipa/pr83549.C b/gcc/testsuite/g++.dg/ipa/pr83549.C new file mode 100644 index 000..90cf8fe7e0d --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr83549.C @@ -0,0 +1,8 @@ +// PR ipa/83549 +// { dg-do compile } +// { dg-options "-O2" } + +struct A { virtual ~A (); }; +struct B { virtual void foo (...); }; +struct C : A, B { void foo (...) {} }; +C c; -- 2.14.3
[PATCH PR82439][simplify-rtx] Simplify (x | y) == x -> (y & ~x) == 0
Hi This patch add support for the missing transformation of (x | y) == x -> (y & ~x) == 0. The transformation for (x & y) == x case already exists in simplify-rtx.c since 2014 as of r218503 and this patch only adds a couple of extra patterns for the IOR case. Citing the example given in PR82439 : Simple testcase (f1 should generate the same as f2): int f1(int x, int y) { return (x | y) == x; } int f2(int x, int y) { return (y & ~x) == 0; } f1: orr w1, w0, w1 cmp w1, w0 csetw0, eq ret f2: bicswzr, w1, w0 csetw0, eq ret This benefits targets that have the BICS instruction to generate better code. Wilco helped me in showing that even in targets that do not have the BICS instruction, this is no worse and gives out 2 instructions. For example in x86: : 0: 09 fe or %edi,%esi 2: 31 c0 xor%eax,%eax 4: 39 fe cmp%edi,%esi 6: 0f 94 c0sete %al 9: c3 retq 0010 : 10: f7 d7 not%edi 12: 31 c0 xor%eax,%eax 14: 85 f7 test %esi,%edi 16: 0f 94 c0sete %al 19: c3 retq Testing done: Checked for regressions on bootstrapped aarch64-none-linux-gnu and arm-none-linux-gnueabihf and added new test cases. Is this ok for trunk? Sudi ChangeLog Entries: *** gcc/ChangeLog *** 2017-01-03 Sudakshina DasPR target/82439 * simplify-rtx.c (simplify_relational_operation_1): Add simplifications of (x|y) == x for BICS pattern. *** gcc/testsuite/ChangeLog *** 2017-01-03 Sudakshina Das PR target/82439 * gcc.target/aarch64/bics_5.c: New test. * gcc.target/arm/bics_5.c: Likewise.
Re: [PATCH] Do not inline variadic thunks (PR ipa/83549).
> Hi. > > As mentioned in the PR, we should bail out inlining of thunks with variadic > arguments. It's problematic for cgraph_node::expand_thunk function that > does not support variadic functions. > > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > Ready to be installed? > Martin > > gcc/ChangeLog: > > 2018-01-03 Martin Liska> > PR ipa/83549 > * ipa-fnsummary.c (compute_fn_summary): Do not inline variadic > thunks. > > gcc/testsuite/ChangeLog: > > 2018-01-03 Martin Liska > > PR ipa/83549 > * g++.dg/ipa/pr83549.C: New test. OK, but please introduce new CIF_CODE for this case. MISMATCHED arguments is a kitchen sink for various issues and it is very hard to analyze what happened when it triggers. Thanks, Honza > --- > gcc/ipa-fnsummary.c| 5 + > gcc/testsuite/g++.dg/ipa/pr83549.C | 8 > 2 files changed, 13 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/ipa/pr83549.C > > > diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c > index 94150312105..274bd8c6758 100644 > --- a/gcc/ipa-fnsummary.c > +++ b/gcc/ipa-fnsummary.c > @@ -2422,6 +2422,11 @@ compute_fn_summary (struct cgraph_node *node, bool > early) >info->inlinable = false; >node->callees->inline_failed = CIF_CHKP; > } > + else if (stdarg_p (TREE_TYPE (node->decl))) > + { > + info->inlinable = false; > + node->callees->inline_failed = CIF_MISMATCHED_ARGUMENTS; > + } >else > info->inlinable = true; > } > diff --git a/gcc/testsuite/g++.dg/ipa/pr83549.C > b/gcc/testsuite/g++.dg/ipa/pr83549.C > new file mode 100644 > index 000..90cf8fe7e0d > --- /dev/null > +++ b/gcc/testsuite/g++.dg/ipa/pr83549.C > @@ -0,0 +1,8 @@ > +// PR ipa/83549 > +// { dg-do compile } > +// { dg-options "-O2" } > + > +struct A { virtual ~A (); }; > +struct B { virtual void foo (...); }; > +struct C : A, B { void foo (...) {} }; > +C c; >
Re: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).
> >> + if (!original->in_same_comdat_group_p (alias)) > >> +{ > >> + if (dump_file) > >> + fprintf (dump_file, "Not unifying; alias cannot be created; " > >> + "across comdat group boundary\n\n"); > >> + > >> + return false; > >> +} > > > > Wasn't we supposed to do the wrapper in this case? > > > > Honza > > > > We attempt to do a wrapper, but even with wrapper we cannot introduce such > call > crossing the boundary. Proper message should be probably: > > "Not unifying; alias nor wrapper cannot be created; across comdat group > boundary" If the symbol we are calling is one exported from comdat, it shoud work fine as long as we produce gimple thunk and it the symbol comdat exports. Of course producing call from one comdat to another may close comdat loop (which we handle elsewhere, so i assume original is comdat and alias is not) I see in the PR is the split part of comdat which is comdat local, so perhaps we want to check that original is not comdat local in addition to your current check? Honza > > Martin
[PATCH] Do not inline variadic thunks (PR ipa/83549).
Hi. As mentioned in the PR, we should bail out inlining of thunks with variadic arguments. It's problematic for cgraph_node::expand_thunk function that does not support variadic functions. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? Martin gcc/ChangeLog: 2018-01-03 Martin LiskaPR ipa/83549 * ipa-fnsummary.c (compute_fn_summary): Do not inline variadic thunks. gcc/testsuite/ChangeLog: 2018-01-03 Martin Liska PR ipa/83549 * g++.dg/ipa/pr83549.C: New test. --- gcc/ipa-fnsummary.c| 5 + gcc/testsuite/g++.dg/ipa/pr83549.C | 8 2 files changed, 13 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ipa/pr83549.C diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 94150312105..274bd8c6758 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -2422,6 +2422,11 @@ compute_fn_summary (struct cgraph_node *node, bool early) info->inlinable = false; node->callees->inline_failed = CIF_CHKP; } + else if (stdarg_p (TREE_TYPE (node->decl))) + { + info->inlinable = false; + node->callees->inline_failed = CIF_MISMATCHED_ARGUMENTS; + } else info->inlinable = true; } diff --git a/gcc/testsuite/g++.dg/ipa/pr83549.C b/gcc/testsuite/g++.dg/ipa/pr83549.C new file mode 100644 index 000..90cf8fe7e0d --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr83549.C @@ -0,0 +1,8 @@ +// PR ipa/83549 +// { dg-do compile } +// { dg-options "-O2" } + +struct A { virtual ~A (); }; +struct B { virtual void foo (...); }; +struct C : A, B { void foo (...) {} }; +C c;
Re: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).
On 01/03/2018 02:24 PM, Jan Hubicka wrote: >> Hi. >> >> This patch is follow-up of r246848. This time ICF creates an edge between 2 >> functions, >> where one is inside a comdat group and second is not. I've got patch that is >> conservative >> about the comdat groups (in_same_comdat_group_p). >> >> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. >> >> Ready to be installed? >> Martin >> >> gcc/ChangeLog: >> >> 2018-01-03 Martin Liska>> >> PR ipa/82352 >> * ipa-icf.c (sem_function::merge): Do not cross comdat boundary. >> >> gcc/testsuite/ChangeLog: >> >> 2018-01-03 Martin Liska >> >> PR ipa/82352 >> * g++.dg/ipa/pr82352.C: New test. >> --- >> gcc/ipa-icf.c | 9 >> gcc/testsuite/g++.dg/ipa/pr82352.C | 93 >> ++ >> 2 files changed, 102 insertions(+) >> create mode 100644 gcc/testsuite/g++.dg/ipa/pr82352.C >> >> > >> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c >> index a8d3b800318..a56c7306201 100644 >> --- a/gcc/ipa-icf.c >> +++ b/gcc/ipa-icf.c >> @@ -1113,6 +1113,15 @@ sem_function::merge (sem_item *alias_item) >>return false; >> } >> >> + if (!original->in_same_comdat_group_p (alias)) >> +{ >> + if (dump_file) >> +fprintf (dump_file, "Not unifying; alias cannot be created; " >> + "across comdat group boundary\n\n"); >> + >> + return false; >> +} > > Wasn't we supposed to do the wrapper in this case? > > Honza > We attempt to do a wrapper, but even with wrapper we cannot introduce such call crossing the boundary. Proper message should be probably: "Not unifying; alias nor wrapper cannot be created; across comdat group boundary" Martin
Re: PR83648
> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c > index 09ca3590039..0406d5588d2 100644 > --- a/gcc/ipa-pure-const.c > +++ b/gcc/ipa-pure-const.c > @@ -910,7 +910,8 @@ malloc_candidate_p (function *fun, bool ipa) > #define DUMP_AND_RETURN(reason) \ > { \ >if (dump_file && (dump_flags & TDF_DETAILS)) \ > -fprintf (dump_file, "%s", (reason)); \ > +fprintf (dump_file, "\n%s is not a malloc candidate, reason: %s\n", \ > + (node->name()), (reason)); \ >return false; \ > } > > @@ -961,11 +962,14 @@ malloc_candidate_p (function *fun, bool ipa) > for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i) > { > tree arg = gimple_phi_arg_def (phi, i); > + if (arg == null_pointer_node) > + continue; I think you want operand_equal_p here and also check for flag_delete_null_pointer_checks because otherwise 0 can be legal memory block addrss. > + > if (TREE_CODE (arg) != SSA_NAME) > - DUMP_AND_RETURN("phi arg is not SSA_NAME.") > - if (!(arg == null_pointer_node || check_retval_uses (arg, phi))) > - DUMP_AND_RETURN("phi arg has uses outside phi" > - " and comparisons against 0.") > + DUMP_AND_RETURN ("phi arg is not SSA_NAME."); > + if (!check_retval_uses (arg, phi)) > + DUMP_AND_RETURN ("phi arg has uses outside phi" > +" and comparisons against 0.") So the case of phi(0,0) is not really correctness issue just missed optimization? I would still add code to handle it (it is easy). Do you also handle a case where parameter of phi is another phi? Honza > > gimple *arg_def = SSA_NAME_DEF_STMT (arg); > gcall *call_stmt = dyn_cast (arg_def); > diff --git a/gcc/testsuite/gcc.dg/ipa/pr83648.c > b/gcc/testsuite/gcc.dg/ipa/pr83648.c > new file mode 100644 > index 000..03b45de671b > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/ipa/pr83648.c > @@ -0,0 +1,9 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-local-pure-const-details" } */ > + > +void *g(unsigned n) > +{ > + return n ? __builtin_malloc (n) : 0; > +} > + > +/* { dg-final { scan-tree-dump "Function found to be malloc: g" > "local-pure-const1" } } */
[PATCH] Add version to intermediate gcov file (PR gcov-profile/83669).
Hi. This is small enhancement reported by users of gcov tool. I'm aware of current stage of GCC, but it's really small change in code. Apart from that, a small fix to documentation is included. Survives gcov.exp, may I install it? Thanks, Martin gcc/ChangeLog: 2018-01-03 Martin Liska <mli...@suse.cz> PR gcov-profile/83669 * gcov.c (output_intermediate_file): Add version to intermediate gcov file. * doc/gcov.texi: Document new field 'version' in intermediate file format. Fix location of '-k' option of gcov command. --- gcc/doc/gcov.texi | 10 ++ gcc/gcov.c| 1 + 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi index 8bf422e58d8..be7364d9da9 100644 --- a/gcc/doc/gcov.texi +++ b/gcc/doc/gcov.texi @@ -187,11 +187,8 @@ be used by @command{lcov} or other tools. The output is a single The format of the intermediate @file{.gcov} file is plain text with one entry per line -@item -j -@itemx --human-readable -Write counts in human readable format (like 24k). - @smallexample +version:@var{gcc_version} file:@var{source_file_name} function:@var{start_line_number},@var{end_line_number},@var{execution_count},@var{function_name} lcount:@var{line number},@var{execution_count},@var{has_unexecuted_block} @@ -212,6 +209,7 @@ times. Here is a sample when @option{-i} is used in conjunction with @option{-b} option: @smallexample +version: 8.1.0 20180103 file:tmp.cpp function:7,7,0,_ZN3FooIcEC2Ev function:7,7,1,_ZN3FooIiEC2Ev @@ -252,6 +250,10 @@ branch:35,nottaken lcount:36,1,0 @end smallexample +@item -j +@itemx --human-readable +Write counts in human readable format (like 24k). + @item -k @itemx --use-colors diff --git a/gcc/gcov.c b/gcc/gcov.c index 24e6da09fcf..3c7881b13fa 100644 --- a/gcc/gcov.c +++ b/gcc/gcov.c @@ -1035,6 +1035,7 @@ file 'foo.cc.gcov' similar to the above example. */ static void output_intermediate_file (FILE *gcov_file, source_info *src) { + fprintf (gcov_file, "version:%s\n", version_string); fprintf (gcov_file, "file:%s\n", src->name);/* source file name */ std::sort (src->functions.begin (), src->functions.end (),
Re: [PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).
> Hi. > > This patch is follow-up of r246848. This time ICF creates an edge between 2 > functions, > where one is inside a comdat group and second is not. I've got patch that is > conservative > about the comdat groups (in_same_comdat_group_p). > > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > Ready to be installed? > Martin > > gcc/ChangeLog: > > 2018-01-03 Martin Liska> > PR ipa/82352 > * ipa-icf.c (sem_function::merge): Do not cross comdat boundary. > > gcc/testsuite/ChangeLog: > > 2018-01-03 Martin Liska > > PR ipa/82352 > * g++.dg/ipa/pr82352.C: New test. > --- > gcc/ipa-icf.c | 9 > gcc/testsuite/g++.dg/ipa/pr82352.C | 93 > ++ > 2 files changed, 102 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/ipa/pr82352.C > > > diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c > index a8d3b800318..a56c7306201 100644 > --- a/gcc/ipa-icf.c > +++ b/gcc/ipa-icf.c > @@ -1113,6 +1113,15 @@ sem_function::merge (sem_item *alias_item) >return false; > } > > + if (!original->in_same_comdat_group_p (alias)) > +{ > + if (dump_file) > + fprintf (dump_file, "Not unifying; alias cannot be created; " > + "across comdat group boundary\n\n"); > + > + return false; > +} Wasn't we supposed to do the wrapper in this case? Honza
Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).
On Wed, Jan 03, 2018 at 02:07:36PM +0100, Martin Liška wrote: > 2018-01-03 Martin Liska> > PR tree-optimization/83593 > * tree-ssa-strlen.c (strlen_check_and_optimize_stmt): Clean-up > EH gimple statements. > (strlen_dom_walker::before_dom_children): Call > gimple_purge_dead_eh_edges. > (pass_strlen::execute): Return TODO_cleanup_cfg if needed. The ChangeLog entry doesn't contain all the changes, like: > @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see > #include "gimple-iterator.h" > #include "gimplify-me.h" > #include "expr.h" > +#include "tree-cfg.h" > #include "tree-dfa.h" > #include "domwalk.h" > #include "tree-ssa-alias.h" the above one. > static bool > -strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi) > +strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh) This one (i.e. addition of a new argument). > @@ -3318,10 +3337,16 @@ do_invalidate (basic_block dombb, gimple *phi, bitmap > visited, int *count) > class strlen_dom_walker : public dom_walker > { > public: > - strlen_dom_walker (cdi_direction direction) : dom_walker (direction) {} > + strlen_dom_walker (cdi_direction direction) > +: dom_walker (direction), m_cleanup_cfg (false) > + {} This one. > >virtual edge before_dom_children (basic_block); >virtual void after_dom_children (basic_block); > + > + /* Flag that will trigger TODO_cleanup_cfg to be returned in strlen > + execute function. */ > + bool m_cleanup_cfg; This one too. > + bool cleanup_eh = false; > + >/* Attempt to optimize individual statements. */ >for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); ) > -if (strlen_check_and_optimize_stmt ()) > +if (strlen_check_and_optimize_stmt (, _eh)) >gsi_next (); And the fact that strlen_check_and_optimize_stmt caller has been adjusted. > > + if (cleanup_eh) > +{ > + gimple_purge_dead_eh_edges (bb); > + m_cleanup_cfg = true; > +} This should be if (cleanup_eh && gimple_purge_dead_eh_edges (bb)) m_cleanup_cfg = true; Ok with that change and updated ChangeLog entry if it passes bootstrap/regtest. Jakub
Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).
On 01/03/2018 01:50 PM, Jakub Jelinek wrote: > If gimple_purge_dead_eh_edges returns true, you want to arrange for the > pass to return TODO_cleanup_cfg (probably needs to use some global static > variable to propagate that). > > Jakub Hi. Sending v2. I'm suggesting to propagate that in strlen_dom_walker::m_cleanup_cfg. I've just triggered tests. Ready to install after it finishes? >From 217982bae6969865051f12798e4da444dd4aaa3f Mon Sep 17 00:00:00 2001 From: marxinDate: Wed, 3 Jan 2018 12:15:40 +0100 Subject: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593). gcc/ChangeLog: 2018-01-03 Martin Liska PR tree-optimization/83593 * tree-ssa-strlen.c (strlen_check_and_optimize_stmt): Clean-up EH gimple statements. (strlen_dom_walker::before_dom_children): Call gimple_purge_dead_eh_edges. (pass_strlen::execute): Return TODO_cleanup_cfg if needed. gcc/testsuite/ChangeLog: 2018-01-03 Martin Liska PR tree-optimization/83593 * gcc.dg/pr83593.c: New test. --- gcc/testsuite/gcc.dg/pr83593.c | 15 + gcc/tree-ssa-strlen.c | 48 -- 2 files changed, 56 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/pr83593.c diff --git a/gcc/testsuite/gcc.dg/pr83593.c b/gcc/testsuite/gcc.dg/pr83593.c new file mode 100644 index 000..eddecc0606a --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr83593.c @@ -0,0 +1,15 @@ +/* PR tree-optimization/83593 */ +/* { dg-options "-O2 -fno-tree-dominator-opts -fnon-call-exceptions -fno-tree-pre -fexceptions -fno-code-hoisting -fno-tree-fre" } */ + +void +hr (int *ed, signed char *ju) +{ + int kc; +{ + int xj; + int *q2 = (*ed == 0) ? : + + *ju = 0; + kc = *ju; +} +} diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index be6ab9f1e1b..7ad6ff8228c 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-iterator.h" #include "gimplify-me.h" #include "expr.h" +#include "tree-cfg.h" #include "tree-dfa.h" #include "domwalk.h" #include "tree-ssa-alias.h" @@ -3051,10 +3052,12 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt) } /* Attempt to check for validity of the performed access a single statement - at *GSI using string length knowledge, and to optimize it. */ + at *GSI using string length knowledge, and to optimize it. + If the given basic block needs clean-up of EH, CLEANUP_EH is set to + true. */ static bool -strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi) +strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh) { gimple *stmt = gsi_stmt (*gsi); @@ -3201,11 +3204,27 @@ strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi) if (w1 == w2 && si->full_string_p) { + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + /* Reading the final '\0' character. */ tree zero = build_int_cst (TREE_TYPE (lhs), 0); gimple_set_vuse (stmt, NULL_TREE); gimple_assign_set_rhs_from_tree (gsi, zero); - update_stmt (gsi_stmt (*gsi)); + *cleanup_eh + |= maybe_clean_or_replace_eh_stmt (stmt, + gsi_stmt (*gsi)); + stmt = gsi_stmt (*gsi); + update_stmt (stmt); + + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } } else if (w1 > w2) { @@ -3318,10 +3337,16 @@ do_invalidate (basic_block dombb, gimple *phi, bitmap visited, int *count) class strlen_dom_walker : public dom_walker { public: - strlen_dom_walker (cdi_direction direction) : dom_walker (direction) {} + strlen_dom_walker (cdi_direction direction) +: dom_walker (direction), m_cleanup_cfg (false) + {} virtual edge before_dom_children (basic_block); virtual void after_dom_children (basic_block); + + /* Flag that will trigger TODO_cleanup_cfg to be returned in strlen + execute function. */ + bool m_cleanup_cfg; }; /* Callback for walk_dominator_tree. Attempt to optimize various @@ -3399,11 +3424,19 @@ strlen_dom_walker::before_dom_children (basic_block bb) } } + bool cleanup_eh = false; + /* Attempt to optimize individual statements. */ for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); ) -if (strlen_check_and_optimize_stmt ()) +if (strlen_check_and_optimize_stmt (, _eh)) gsi_next (); + if (cleanup_eh) +{ + gimple_purge_dead_eh_edges (bb); + m_cleanup_cfg = true; +} + bb->aux = stridx_to_strinfo; if (vec_safe_length (stridx_to_strinfo) && !strinfo_shared ()) (*stridx_to_strinfo)[0] = (strinfo *) bb; @@ -3477,7 +3510,8 @@ pass_strlen::execute (function
Re: [C++ PATCH] Improve code generation for static_cast of pointers to reference types (PR c++/83555)
On Wed, Jan 03, 2018 at 01:49:11PM +0100, Richard Biener wrote: > On January 3, 2018 1:21:40 PM GMT+01:00, Nathan Sidwell> wrote: > >On 01/02/2018 04:12 PM, Jakub Jelinek wrote: > >> Hi! > >> > >> This patch improves code generated for: > >> struct A { int a; }; > >> struct B { int b; }; > >> struct C : A, B { int c; }; > >> C *bar (B *b) { return _cast(*b); } > >> Unlike return static_cast(b); where b can be validly NULL, the > >> reference shouldn't bind to NULL, but we still emit > >> b ? b - 4 : 0. The following patch omits the non-NULL check except > >when > >> -fsanitize=null (or undefined) and when sanitizing makes sure such > >bugs are > >> diagnosed. > > > >It's sad the optimizers don't know REFERENCE_TYPE (x) means x != NULL. > >(or perhaps that's just a C++ semantic of REFERENCE_TYPE?). > > Given we treat reference and pointer types as interchangeable we indeed don't > know that. Yeah, within functions that is quickly lost, sometimes even during folding as in this case. In some cases the optimizers can infer that, e.g. nonnull_arg_p has: /* Values passed by reference are always non-NULL. */ if (TREE_CODE (TREE_TYPE (arg)) == REFERENCE_TYPE && flag_delete_null_pointer_checks) return true; because we don't change randomly types of function arguments... Though, the above snippet reminds me I should probably also replace: + expr = build_base_path (MINUS_EXPR, expr, base, + /*nonnull=*/!sanitize_null_p, complain); with: + expr = build_base_path (MINUS_EXPR, expr, base, + /*nonnull=*/(!sanitize_null_p + && flag_delete_null_pointer_checks), + complain); Instead of checking for generated asm without sanitization, I think it will be easier/more portable to verify no conditionals in *.optimized dump. Let me repost the updated patch. Jakub
Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).
On 01/03/2018 01:49 PM, Marc Glisse wrote: > On Wed, 3 Jan 2018, Martin Liška wrote: > > + *cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt, > + stmt); > > Do you mean *cleanup_eh |= ... ? > Yes. Thanks!
Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).
On Wed, Jan 03, 2018 at 01:27:01PM +0100, Martin Liška wrote: > /* Reading the final '\0' character. */ > tree zero = build_int_cst (TREE_TYPE (lhs), 0); > gimple_set_vuse (stmt, NULL_TREE); > gimple_assign_set_rhs_from_tree (gsi, zero); > + *cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt, > + stmt); The second stmt should be gsi_stmt (*gsi) just in case gimple_assign_set_rhs_from_tree can't modify in-place, and probably you need *cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt, gsi_stmt (*gsi)); then to get formatting right. > update_stmt (gsi_stmt (*gsi)); > + > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "into: "); > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); Again, should be gsi_stmt (*gsi);. Or do: stmt = gsi_stmt (*gsi); above update_stmt. As stmt is used several times later, I think that is my preference (though, in maybe_clean_or_replace_eh_stmt call you IMHO still want gsi_stmt repeated). > + } > } > else if (w1 > w2) > { > @@ -3399,11 +3416,16 @@ strlen_dom_walker::before_dom_children (basic_block > bb) > } > } > > + bool cleanup_eh = false; > + >/* Attempt to optimize individual statements. */ >for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); ) > -if (strlen_check_and_optimize_stmt ()) > +if (strlen_check_and_optimize_stmt (, _eh)) >gsi_next (); > > + if (cleanup_eh) > +gimple_purge_dead_eh_edges (bb); If gimple_purge_dead_eh_edges returns true, you want to arrange for the pass to return TODO_cleanup_cfg (probably needs to use some global static variable to propagate that). Jakub
Re: [PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).
On Wed, 3 Jan 2018, Martin Liška wrote: + *cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt, + stmt); Do you mean *cleanup_eh |= ... ? -- Marc Glisse
Re: [C++ PATCH] Improve code generation for static_cast of pointers to reference types (PR c++/83555)
On January 3, 2018 1:21:40 PM GMT+01:00, Nathan Sidwellwrote: >On 01/02/2018 04:12 PM, Jakub Jelinek wrote: >> Hi! >> >> This patch improves code generated for: >> struct A { int a; }; >> struct B { int b; }; >> struct C : A, B { int c; }; >> C *bar (B *b) { return _cast(*b); } >> Unlike return static_cast(b); where b can be validly NULL, the >> reference shouldn't bind to NULL, but we still emit >> b ? b - 4 : 0. The following patch omits the non-NULL check except >when >> -fsanitize=null (or undefined) and when sanitizing makes sure such >bugs are >> diagnosed. > >It's sad the optimizers don't know REFERENCE_TYPE (x) means x != NULL. >(or perhaps that's just a C++ semantic of REFERENCE_TYPE?). Given we treat reference and pointer types as interchangeable we indeed don't know that. Do we >manage to elide the check if we eventually dereference the pointer? We eventually should via path isolation of the null dereference. But better check that ;) >(Not that that'd be an easy fix, but maybe worth a (new?) bug report.) > >Your patch is fine, but could you add a test case to make sure the null > >check is not there in the output assembly -- it'd be $cpu-of-choice >specific, of course. > >nathan
[PATCH] Clean-up EH after strlen transformation (PR tree-optimization/83593).
Hi. Strlen pass does following transformation: Optimizing: _7 = *ju_5(D); into: _7 = 0; which leads to need of removal of EH for the gimple statement. That's done via maybe_clean_or_replace_eh_stmt and then we need to call gimple_purge_dead_eh_edges. Last question I have is whether it's also needed to return TODO_cleanup_cfg or not? Patch as is can bootstrap on ppc64le-redhat-linux and survives regression tests. Martin gcc/ChangeLog: 2018-01-03 Martin LiskaPR tree-optimization/83593 * tree-ssa-strlen.c (strlen_check_and_optimize_stmt): Clean-up EH gimple statements. (strlen_dom_walker::before_dom_children): Call gimple_purge_dead_eh_edges. gcc/testsuite/ChangeLog: 2018-01-03 Martin Liska PR tree-optimization/83593 * gcc.dg/pr83593.c: New test. --- gcc/testsuite/gcc.dg/pr83593.c | 15 +++ gcc/tree-ssa-strlen.c | 28 +--- 2 files changed, 40 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/pr83593.c diff --git a/gcc/testsuite/gcc.dg/pr83593.c b/gcc/testsuite/gcc.dg/pr83593.c new file mode 100644 index 000..eddecc0606a --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr83593.c @@ -0,0 +1,15 @@ +/* PR tree-optimization/83593 */ +/* { dg-options "-O2 -fno-tree-dominator-opts -fnon-call-exceptions -fno-tree-pre -fexceptions -fno-code-hoisting -fno-tree-fre" } */ + +void +hr (int *ed, signed char *ju) +{ + int kc; +{ + int xj; + int *q2 = (*ed == 0) ? : + + *ju = 0; + kc = *ju; +} +} diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index be6ab9f1e1b..293aeceacce 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-iterator.h" #include "gimplify-me.h" #include "expr.h" +#include "tree-cfg.h" #include "tree-dfa.h" #include "domwalk.h" #include "tree-ssa-alias.h" @@ -3051,10 +3052,12 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt) } /* Attempt to check for validity of the performed access a single statement - at *GSI using string length knowledge, and to optimize it. */ + at *GSI using string length knowledge, and to optimize it. + If the given basic block needs clean-up of EH, CLEANUP_EH is set to + true. */ static bool -strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi) +strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh) { gimple *stmt = gsi_stmt (*gsi); @@ -3201,11 +3204,25 @@ strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi) if (w1 == w2 && si->full_string_p) { + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + /* Reading the final '\0' character. */ tree zero = build_int_cst (TREE_TYPE (lhs), 0); gimple_set_vuse (stmt, NULL_TREE); gimple_assign_set_rhs_from_tree (gsi, zero); + *cleanup_eh = maybe_clean_or_replace_eh_stmt (stmt, + stmt); update_stmt (gsi_stmt (*gsi)); + + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } } else if (w1 > w2) { @@ -3399,11 +3416,16 @@ strlen_dom_walker::before_dom_children (basic_block bb) } } + bool cleanup_eh = false; + /* Attempt to optimize individual statements. */ for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); ) -if (strlen_check_and_optimize_stmt ()) +if (strlen_check_and_optimize_stmt (, _eh)) gsi_next (); + if (cleanup_eh) +gimple_purge_dead_eh_edges (bb); + bb->aux = stridx_to_strinfo; if (vec_safe_length (stridx_to_strinfo) && !strinfo_shared ()) (*stridx_to_strinfo)[0] = (strinfo *) bb;
Re: C++ PATCH to fix -Wparentheses with MVP (PR c++/83592)
On 01/03/2018 06:59 AM, Marek Polacek wrote: Since the -Wparentheses extension regarding Most Vexing Parse we started warning on e.g. "reinterpret_cast()". I don't think the warning was meant to warn in this situation, since in reinterpret_cast there's no decl/expr disambiguation, is it? It seems we should simply disable the warning in TYPENAME context. ok thanks ('' looks weird to me, but if the warning is annoying, so be it.) nathan -- Nathan Sidwell
Re: [PATCH] PR 78534 Change character length from int to size_t
On Wed, Jan 3, 2018 at 2:10 PM, Thomas Koenigwrote: > Hi Janne, > >> attached is a patch that makes the two attached testcases work. It >> applies on top of the charlen->size_t patch. In the formatted I/O >> stuff, I have mostly used ptrdiff_t to avoid having to deal with >> signed/unsigned issues, as the previous code was using int. > > > Did you regression-test? Ah yes, I forgot to mention that. Yes, I did, though only on x86_64-linux-gnu > If yes, I'd say this patch is OK (all the changes look obvious > enough). > > With this, your character length patch is also OK. We can then > open individual PRs for the other issues. > > However, I'd ask you to wait for a day or so with committing > so that other people also have a chance for a (final) look > at this. Thanks! Dominique mentioned on IRC about some incoming comments, so I guess it makes sense to wait a few days. -- Janne Blomqvist
Re: [C++ PATCH] Improve code generation for static_cast of pointers to reference types (PR c++/83555)
On 01/02/2018 04:12 PM, Jakub Jelinek wrote: Hi! This patch improves code generated for: struct A { int a; }; struct B { int b; }; struct C : A, B { int c; }; C *bar (B *b) { return _cast(*b); } Unlike return static_cast(b); where b can be validly NULL, the reference shouldn't bind to NULL, but we still emit b ? b - 4 : 0. The following patch omits the non-NULL check except when -fsanitize=null (or undefined) and when sanitizing makes sure such bugs are diagnosed. It's sad the optimizers don't know REFERENCE_TYPE (x) means x != NULL. (or perhaps that's just a C++ semantic of REFERENCE_TYPE?). Do we manage to elide the check if we eventually dereference the pointer? (Not that that'd be an easy fix, but maybe worth a (new?) bug report.) Your patch is fine, but could you add a test case to make sure the null check is not there in the output assembly -- it'd be $cpu-of-choice specific, of course. nathan -- Nathan Sidwell
Re: [C++ PATCH] Fix ICE in ~macro_use_before_def (PR preprocessor/83602)
On 01/02/2018 04:07 PM, Jakub Jelinek wrote: Hi! If lookup_name_fuzzy finds an exact match with a macro, it later in the dtor uses node->value.macro->line in libcpp. The problem is that for builtin macros node->value.macro contains garbage, we use node->value.builtin union member in those cases instead. It doesn't make any sense to look up location of a builtin macro definition anyway, those are very special entities. So, this patch just ignores those, because we can't do anything useful with them. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2018-01-02 Jakub JelinekPR preprocessor/83602 * name-lookup.c (lookup_name_fuzzy): Don't use macro_use_before_def for builtin macros. * g++.dg/cpp/pr83602.C: New test. ok -- Nathan Sidwell
Re: [C++ PATCH] Avoid NOP_EXPRs with error_mark_node operand (PR c++/83634)
On 01/02/2018 04:03 PM, Jakub Jelinek wrote: Hi! The gimplifier uses in several places STRIP_USELESS_TYPE_CONVERSION and that, being primarily a middle-end predicate, doesn't like error_mark_nodes appearing in conversion operands. I've talked about it with Richi on IRC and he'd prefer not to change it. 2018-01-02 Jakub JelinekPR c++/83634 * cp-gimplify.c (cp_fold) : If the operand folds to error_mark_node, return error_mark_node. * g++.dg/parse/pr83634.C: New test. ok thanks -- Nathan Sidwell
Re: [PATCH] PR 78534 Change character length from int to size_t
Hi Janne, attached is a patch that makes the two attached testcases work. It applies on top of the charlen->size_t patch. In the formatted I/O stuff, I have mostly used ptrdiff_t to avoid having to deal with signed/unsigned issues, as the previous code was using int. Did you regression-test? If yes, I'd say this patch is OK (all the changes look obvious enough). With this, your character length patch is also OK. We can then open individual PRs for the other issues. However, I'd ask you to wait for a day or so with committing so that other people also have a chance for a (final) look at this. Regards Thomas
C++ PATCH to fix -Wparentheses with MVP (PR c++/83592)
Since the -Wparentheses extension regarding Most Vexing Parse we started warning on e.g. "reinterpret_cast()". I don't think the warning was meant to warn in this situation, since in reinterpret_cast there's no decl/expr disambiguation, is it? It seems we should simply disable the warning in TYPENAME context. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2018-01-03 Marek Polacek PR c++/83592 * decl.c (grokdeclarator): Don't warn about MVP in typename context. * g++.dg/warn/mvp2.C: New test. diff --git gcc/cp/decl.c gcc/cp/decl.c index f7b03e13303..b1c50961169 100644 --- gcc/cp/decl.c +++ gcc/cp/decl.c @@ -10866,10 +10866,11 @@ grokdeclarator (const cp_declarator *declarator, inner_declarator = declarator->declarator; - /* We don't want to warn in parmeter context because we don't + /* We don't want to warn in parameter context because we don't yet know if the parse will succeed, and this might turn out to be a constructor call. */ if (decl_context != PARM + && decl_context != TYPENAME && declarator->parenthesized != UNKNOWN_LOCATION /* If the type is class-like and the inner name used a global namespace qualifier, we need the parens. diff --git gcc/testsuite/g++.dg/warn/mvp2.C gcc/testsuite/g++.dg/warn/mvp2.C index e69de29bb2d..9b2793c5622 100644 --- gcc/testsuite/g++.dg/warn/mvp2.C +++ gcc/testsuite/g++.dg/warn/mvp2.C @@ -0,0 +1,24 @@ +// PR c++/83592 +// { dg-do compile } +// { dg-options "-Wparentheses" } + +// Test that -Wparentheses does not give bogus warnings in +// typename context. + +int * +foo (long ) +{ + return reinterpret_cast (); +} + +int * +bar (long ) +{ + return (int (*)) +} + +int * +baz (int ) +{ + return static_cast (); +} Marek
Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64
On Tue, Jan 02, 2018 at 01:02:26PM -0700, Jeff Law wrote: > It's fairly obvious that the probe of *sp isn't actually necessary here > because the register saves in the prologue act as probe points for *sp. > > In fact, the only way this can ever cause problems is if %esi is used in > the body in which case it would have been callee saved in the prologue. > So if we detect that %esi is already callee saved in the prologue then > we could eliminate the explicit probe of *sp. > > But we can do even better. If any register is saved in the prologue, > then that callee register save functions as an implicit probe of *sp and > we do not need to explicitly probe *sp. > > While this was reported with -m32, I'm pretty sure we can trigger a > similar issue on x86_64. > > Bootstrapped and regression tested on x86_64. Also verified the > testcase behavior on -m32. The test uses flags to hopefully ensure > expected behavior on x86/Solaris, but I did not explicitly test that > configuration. > > OK for the trunk? > > Jeff > Missing PR target/83641 here. > * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not adjust as Florian reported. > explicitly probe *sp in a noreturn function if there were any callee > register saves. " or frame pointer is needed" ? > > * gcc.target/i386/stack-check-17.c: New test Missing full stop after New test This is a nice optimization, ok for trunk. I think we do not want to put anything into the CFI even if the condition you've added doesn't trigger, that is a pure space and runtime cycle waste, except for noting the sp change if it is the current CFA. Even after the push the register value still holds the caller's value, and after the pop too. ix86_emit_restore_reg_using_pop does a lot of stuff we IMNSHO don't want to do, if dummy_reg happened to be a drap reg, we'd emit really weird stuff, the cfa restore, etc. There are many other spots that gen_pop themselves and do the appropriate thing at that spot, instead of all using ix86_emit_restore_reg_using_pop. Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux? 2018-01-03 Jakub JelinekPR target/83641 * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop, only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp and add REG_CFA_ADJUST_CFA notes in that case to both insns. --- gcc/config/i386/i386.c.jj 2018-01-03 10:20:06.495535771 +0100 +++ gcc/config/i386/i386.c 2018-01-03 12:32:01.747649506 +0100 @@ -12229,9 +12229,21 @@ ix86_adjust_stack_and_probe_stack_clash argument passing registers so as not to introduce dependencies in the pipeline. For 32 bit we use %esi and for 64 bit we use %rax. */ rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG); - rtx_insn *insn = emit_insn (gen_push (dummy_reg)); - RTX_FRAME_RELATED_P (insn) = 1; - ix86_emit_restore_reg_using_pop (dummy_reg); + rtx_insn *insn_push = emit_insn (gen_push (dummy_reg)); + rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg)); + m->fs.sp_offset -= UNITS_PER_WORD; + if (m->fs.cfa_reg == stack_pointer_rtx) + { + m->fs.cfa_offset -= UNITS_PER_WORD; + rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD); + x = gen_rtx_SET (stack_pointer_rtx, x); + add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x); + RTX_FRAME_RELATED_P (insn_push) = 1; + x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD); + x = gen_rtx_SET (stack_pointer_rtx, x); + add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x); + RTX_FRAME_RELATED_P (insn_pop) = 1; + } emit_insn (gen_blockage ()); } Jakub
Re: [PATCH] PR 78534 Change character length from int to size_t
On Sat, Dec 30, 2017 at 10:58 PM, Jerry DeLislewrote: > On 12/30/2017 12:35 PM, Janne Blomqvist wrote: >> On Sat, Dec 30, 2017 at 7:16 PM, Thomas Koenig wrote: > ---snip--- >> >> I can provide that stuff as a separate patch, or merge it into the >> original megapatch and resubmit that, whichever way you prefer. > > I would prefer we split into two patches. This will make review of the library > I/O changes easier. The int len is used in a lot of places also where it > really > happens to also be the kind (which is a length in our implementation). Hi, attached is a patch that makes the two attached testcases work. It applies on top of the charlen->size_t patch. In the formatted I/O stuff, I have mostly used ptrdiff_t to avoid having to deal with signed/unsigned issues, as the previous code was using int. -- Janne Blomqvist From 72799c5587ee1e830bb424278286cff309d0c4a7 Mon Sep 17 00:00:00 2001 From: Janne Blomqvist Date: Tue, 2 Jan 2018 14:17:57 +0200 Subject: [PATCH] Use ptrdiff_t for formatted I/O sizes --- libgfortran/io/fbuf.c | 44 ++--- libgfortran/io/fbuf.h | 16 +-- libgfortran/io/io.h| 16 +-- libgfortran/io/list_read.c | 15 +- libgfortran/io/read.c | 70 +++--- libgfortran/io/transfer.c | 36 libgfortran/io/unix.c | 20 ++--- libgfortran/io/unix.h | 12 libgfortran/io/write.c | 21 +++--- 9 files changed, 126 insertions(+), 124 deletions(-) diff --git a/libgfortran/io/fbuf.c b/libgfortran/io/fbuf.c index 944469d..d38d003 100644 --- a/libgfortran/io/fbuf.c +++ b/libgfortran/io/fbuf.c @@ -33,7 +33,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see void -fbuf_init (gfc_unit *u, int len) +fbuf_init (gfc_unit *u, ptrdiff_t len) { if (len == 0) len = 512; /* Default size. */ @@ -64,9 +64,9 @@ fbuf_debug (gfc_unit *u, const char *format, ...) va_start(args, format); vfprintf(stderr, format, args); va_end(args); - fprintf (stderr, "fbuf_debug pos: %d, act: %d, buf: ''", - u->fbuf->pos, u->fbuf->act); - for (int ii = 0; ii < u->fbuf->act; ii++) + fprintf (stderr, "fbuf_debug pos: %ld, act: %ld, buf: ''", + (long) u->fbuf->pos, (long) u->fbuf->act); + for (ptrdiff_t ii = 0; ii < u->fbuf->act; ii++) { putc (u->fbuf->buf[ii], stderr); } @@ -84,10 +84,10 @@ fbuf_debug (gfc_unit *u __attribute__ ((unused)), underlying device. Returns how much the physical position was modified. */ -int +ptrdiff_t fbuf_reset (gfc_unit *u) { - int seekval = 0; + ptrdiff_t seekval = 0; if (!u->fbuf) return 0; @@ -99,7 +99,7 @@ fbuf_reset (gfc_unit *u) if (u->mode == READING && u->fbuf->act > u->fbuf->pos) { seekval = - (u->fbuf->act - u->fbuf->pos); - fbuf_debug (u, "fbuf_reset seekval %d, ", seekval); + fbuf_debug (u, "fbuf_reset seekval %ld, ", (long) seekval); } u->fbuf->act = u->fbuf->pos = 0; return seekval; @@ -111,11 +111,11 @@ fbuf_reset (gfc_unit *u) reallocating if necessary. */ char * -fbuf_alloc (gfc_unit *u, int len) +fbuf_alloc (gfc_unit *u, ptrdiff_t len) { - int newlen; + ptrdiff_t newlen; char *dest; - fbuf_debug (u, "fbuf_alloc len %d, ", len); + fbuf_debug (u, "fbuf_alloc len %ld, ", (long) len); if (u->fbuf->pos + len > u->fbuf->len) { /* Round up to nearest multiple of the current buffer length. */ @@ -138,7 +138,7 @@ fbuf_alloc (gfc_unit *u, int len) int fbuf_flush (gfc_unit *u, unit_mode mode) { - int nwritten; + ptrdiff_t nwritten; if (!u->fbuf) return 0; @@ -177,7 +177,7 @@ fbuf_flush (gfc_unit *u, unit_mode mode) int fbuf_flush_list (gfc_unit *u, unit_mode mode) { - int nwritten; + ptrdiff_t nwritten; if (!u->fbuf) return 0; @@ -206,8 +206,8 @@ fbuf_flush_list (gfc_unit *u, unit_mode mode) } -int -fbuf_seek (gfc_unit *u, int off, int whence) +ptrdiff_t +fbuf_seek (gfc_unit *u, ptrdiff_t off, int whence) { if (!u->fbuf) return -1; @@ -226,7 +226,7 @@ fbuf_seek (gfc_unit *u, int off, int whence) return -1; } - fbuf_debug (u, "fbuf_seek, off %d ", off); + fbuf_debug (u, "fbuf_seek, off %ld ", (long) off); /* The start of the buffer is always equal to the left tab limit. Moving to the left past the buffer is illegal in C and would also imply moving past the left tab limit, which is never @@ -248,21 +248,21 @@ fbuf_seek (gfc_unit *u, int off, int whence) of bytes actually processed. */ char * -fbuf_read (gfc_unit *u, int *len) +fbuf_read (gfc_unit *u, ptrdiff_t *len) { char *ptr; - int oldact, oldpos; - int readlen = 0; + ptrdiff_t oldact, oldpos; + ptrdiff_t readlen = 0; - fbuf_debug (u, "fbuf_read, len %d: ", *len); + fbuf_debug (u, "fbuf_read, len %ld: ", (long) *len);
Re: PR83648
> On Wed, Jan 03, 2018 at 10:05:30AM +0100, Richard Biener wrote: > > >One concern I have is that with the patch, malloc_candidate_p will > > >return true if all the args to PHI are NULL: > > >retval = PHI<0, 0> > > >return retval > > > > > >However I expect that PHI with all 0 args would be constant folded to > > >0 earlier, so this case shouldn't occur in practice ? > > > > You may not rely on folding for correctness. .. and at this level i would say even for code quality. Early optimizers are facing a lot of garbage and they are not repeated, so we get code at various intermediate levels of optimizations thorugh the IPA queue. Honza > > Yeah. Will the patch handle (I mean punt on) also unfolded > if (n) ? 0 : __builtin_malloc (n); > and similar? > > Jakub
[PATCH] Be careful about comdat boundary in ICF (PR ipa/82352).
Hi. This patch is follow-up of r246848. This time ICF creates an edge between 2 functions, where one is inside a comdat group and second is not. I've got patch that is conservative about the comdat groups (in_same_comdat_group_p). Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? Martin gcc/ChangeLog: 2018-01-03 Martin LiskaPR ipa/82352 * ipa-icf.c (sem_function::merge): Do not cross comdat boundary. gcc/testsuite/ChangeLog: 2018-01-03 Martin Liska PR ipa/82352 * g++.dg/ipa/pr82352.C: New test. --- gcc/ipa-icf.c | 9 gcc/testsuite/g++.dg/ipa/pr82352.C | 93 ++ 2 files changed, 102 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ipa/pr82352.C diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c index a8d3b800318..a56c7306201 100644 --- a/gcc/ipa-icf.c +++ b/gcc/ipa-icf.c @@ -1113,6 +1113,15 @@ sem_function::merge (sem_item *alias_item) return false; } + if (!original->in_same_comdat_group_p (alias)) +{ + if (dump_file) + fprintf (dump_file, "Not unifying; alias cannot be created; " + "across comdat group boundary\n\n"); + + return false; +} + /* See if original is in a section that can be discarded if the main symbol is not used. */ diff --git a/gcc/testsuite/g++.dg/ipa/pr82352.C b/gcc/testsuite/g++.dg/ipa/pr82352.C new file mode 100644 index 000..c044345a486 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr82352.C @@ -0,0 +1,93 @@ +// PR ipa/82352 +// { dg-do compile } +// { dg-options "-O2" } + +typedef long unsigned int size_t; + +class A +{ +public : + typedef enum { Zero = 0, One = 1 } tA; + A(tA a) { m_a = a; } + +private : + tA m_a; +}; + +class B +{ +public : + void *operator new(size_t t) { return (void*)(42); }; +}; + +class C +{ +public: + virtual void () = 0; +}; + +class D +{ + public : + virtual void g() = 0; + virtual void h() = 0; +}; + +template class : public T, public D +{ +public: + void () + { + if (!m_i2) throw A(A::One); + }; + + void h() + { + if (m_i2) throw A(A::Zero); + } + +protected: + virtual void g() + { + if (m_i1 !=0) throw A(A::Zero); + }; + +private : + int m_i1; + void *m_i2; +}; + +class E +{ +private: +size_t m_e; +static const size_t Max; + +public: +E& i(size_t a, size_t b, size_t c) +{ +if ((a > Max) || (c > Max)) throw A(A::Zero ); +if (a + b > m_e) throw A(A::One ); +return (*this); +} + + inline E& j(const E ) +{ + return i(0,0,s.m_e); +} +}; + +class F : public C { }; +class G : public C { }; +class : public B, public F, public G { }; + +void k() +{ +new (); +} + +void l() +{ + E e1, e2; + e1.j(e2); +}
Re: [patch, fortran] Fix PR 83664, missing check on boundary argument for eoshift
On Wed, Jan 3, 2018 at 1:13 PM, Thomas Koenigwrote: > Hello world, > > the attached patch fixes a missing check for eoshift. > > If you're wondering what the "else" belongs to - it is > > if (boundary != NULL) > > Regression-tested. OK for trunk? > > Regards > > Thomas > > 2018-01-03 Thomas Koenig > > PR fortran/83664 > * check.c (gfc_check_eoshift): Error for missing boundary if array > is not one of the standard types. > > 2018-01-03 Thomas Koenig > > PR fortran/83664 > * gfortran.dg/eoshift_7.f90: New test. Ok for trunk, thanks! -- Janne Blomqvist
[patch, fortran] Fix PR 83664, missing check on boundary argument for eoshift
Hello world, the attached patch fixes a missing check for eoshift. If you're wondering what the "else" belongs to - it is if (boundary != NULL) Regression-tested. OK for trunk? Regards Thomas 2018-01-03 Thomas KoenigPR fortran/83664 * check.c (gfc_check_eoshift): Error for missing boundary if array is not one of the standard types. 2018-01-03 Thomas Koenig PR fortran/83664 * gfortran.dg/eoshift_7.f90: New test. Index: check.c === --- check.c (Revision 255788) +++ check.c (Arbeitskopie) @@ -2270,6 +2270,26 @@ gfc_check_eoshift (gfc_expr *array, gfc_expr *shif return false; } } + else +{ + switch (array->ts.type) + { + case BT_INTEGER: + case BT_LOGICAL: + case BT_REAL: + case BT_COMPLEX: + case BT_CHARACTER: + break; + + default: + gfc_error ("Missing %qs argument to %qs intrinsic at %L for %qs " + "of type %qs", gfc_current_intrinsic_arg[2]->name, + gfc_current_intrinsic, >where, + gfc_current_intrinsic_arg[0]->name, + gfc_typename (>ts)); + return false; + } +} return true; } ! { dg-do compile } program main type t integer :: x end type t type(t), dimension(2) :: a, b a(1)%x = 1 a(2)%x = 2 b = eoshift(a,1) ! { dg-error "Missing 'boundary' argument to 'eoshift' intrinsic" } print *,b%x end program main
Re: [AARCH64] implements neon vld1_*_x2 intrinsics
Hi Kugan, On 15 November 2017 at 12:23, James Greenhalghwrote: > On Wed, Nov 15, 2017 at 09:58:28AM +, Kyrill Tkachov wrote: >> Hi Kugan, >> >> On 07/11/17 04:10, Kugan Vivekanandarajah wrote: >> > Hi, >> > >> > Attached patch implements the vld1_*_x2 intrinsics as defined by the >> > neon document. >> > >> > Bootstrap for the latest patch is ongoing on aarch64-linux-gnu. Is >> > this OK for trunk if no regressions? >> > >> >> This looks mostly ok to me (though I cannot approve) modulo a couple of >> minor type issues below. > > Thanks for the review Kyrill! > > I'm happy to trust Kyrill's knowledge of the back-end here, so the patch > is OK with the changes Kyrill requested. > > Thanks for the patch! > > James > >> > gcc/ChangeLog: >> > >> > 2017-11-06 Kugan Vivekanandarajah >> > >> > * config/aarch64/aarch64-simd.md (aarch64_ld1x2): New. >> > (aarch64_ld1x2): Likewise. >> > (aarch64_simd_ld1_x2): Likewise. >> > (aarch64_simd_ld1_x2): Likewise. >> > * config/aarch64/arm_neon.h (vld1_u8_x2): New. >> > (vld1_s8_x2): Likewise. >> > (vld1_u16_x2): Likewise. >> > (vld1_s16_x2): Likewise. >> > (vld1_u32_x2): Likewise. >> > (vld1_s32_x2): Likewise. >> > (vld1_u64_x2): Likewise. >> > (vld1_s64_x2): Likewise. >> > (vld1_f16_x2): Likewise. >> > (vld1_f32_x2): Likewise. >> > (vld1_f64_x2): Likewise. >> > (vld1_p8_x2): Likewise. >> > (vld1_p16_x2): Likewise. >> > (vld1_p64_x2): Likewise. >> > (vld1q_u8_x2): Likewise. >> > (vld1q_s8_x2): Likewise. >> > (vld1q_u16_x2): Likewise. >> > (vld1q_s16_x2): Likewise. >> > (vld1q_u32_x2): Likewise. >> > (vld1q_s32_x2): Likewise. >> > (vld1q_u64_x2): Likewise. >> > (vld1q_s64_x2): Likewise. >> > (vld1q_f16_x2): Likewise. >> > (vld1q_f32_x2): Likewise. >> > (vld1q_f64_x2): Likewise. >> > (vld1q_p8_x2): Likewise. >> > (vld1q_p16_x2): Likewise. >> > (vld1q_p64_x2): Likewise. >> > >> > gcc/testsuite/ChangeLog: >> > >> > 2017-11-06 Kugan Vivekanandarajah >> > >> > * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: New test. >> Sorry for not seeing this before you committed this patch, but the new test fails to compile on arm targets. Can you add the proper guard, as there is in other tests in the same dir? Other question: why do you force -O3? The harness iterates on O0, O1, Thanks, Christophe >> +__extension__ extern __inline int8x8x2_t >> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) >> +vld1_s8_x2 (const uint8_t *__a) >> >> This should be "const int8_t *" >> >> +{ >> + int8x8x2_t ret; >> + __builtin_aarch64_simd_oi __o; >> + __o = __builtin_aarch64_ld1x2v8qi ((const __builtin_aarch64_simd_qi *) >> __a); >> + ret.val[0] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0); >> + ret.val[1] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1); >> + return ret; >> +} >> >> ... >> >> +__extension__ extern __inline int32x2x2_t >> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) >> +vld1_s32_x2 (const uint32_t *__a) >> >> Likewise, this should be "const int32_t *" >> >> +{ >> + int32x2x2_t ret; >> + __builtin_aarch64_simd_oi __o; >> + __o = __builtin_aarch64_ld1x2v2si ((const __builtin_aarch64_simd_si *) >> __a); >> + ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0); >> + ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1); >> + return ret; >> +} >> + >> >>
[committed] Tweak update-copyright.py so that it doesn't fail on pdt_5.f03
Hi! This testcase contains: ! Third, complete example from the PGInsider article: ! "Object-Oriented Programming in Fortran 2003 Part 3: Parameterized Derived Types" ! by Mark Leair ! ! Copyright (c) 2013, NVIDIA CORPORATION. All rights reserved. ! ! NVIDIA CORPORATION and its licensors retain all intellectual property ! and proprietary rights in and to this software, related documentation ! and any modifications thereto. Any use, reproduction, disclosure or ! distribution of this software and related documentation without an express ! license agreement from NVIDIA CORPORATION is strictly prohibited. ! ! THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT ! WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT ! NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR ! FITNESS FOR A PARTICULAR PURPOSE. ! ! Note that modification had to be made all of which are commented. which makes me wonder if it shouldn't be removed. So that I could update the copyright years, I've committed following workaround for that, but if the testcase is removed or rewritten that can go away as well. 2018-01-03 Jakub Jelinek* update-copyright.py: Skip pdt-5.f03 in gfortran.dg subdir. --- contrib/update-copyright.py (revision 256167) +++ contrib/update-copyright.py (revision 256168) @@ -591,6 +591,8 @@ class TestsuiteFilter (GenericFilter): # Similarly params/README. if filename == 'README' and os.path.basename (dir) == 'params': return True +if filename == 'pdt_5.f03' and os.path.basename (dir) == 'gfortran.dg': + return True return GenericFilter.skip_file (self, dir, filename) class LibCppFilter (GenericFilter): Jakub
Re: [patch AArch64] Do not perform a vector splat for vector initialisation if it is not useful
On 19 December 2017 at 00:36, Jeff Lawwrote: > On 12/11/2017 08:44 AM, James Greenhalgh wrote: >> Hi, >> >> In the testcase in this patch we create an SLP vector with only two >> elements. Our current vector initialisation code will first duplicate >> the first element to both lanes, then overwrite the top lane with a new >> value. >> >> This duplication can be clunky and wasteful. >> >> Better would be to simply use the fact that we will always be overwriting >> the remaining bits, and simply move the first element to the corrcet place >> (implicitly zeroing all other bits). >> >> This reduces the code generation for this case, and can allow more >> efficient addressing modes, and other second order benefits for AArch64 >> code which has been vectorized to V2DI mode. >> >> Note that the change is generic enough to catch the case for any vector >> mode, but is expected to be most useful for 2x64-bit vectorization. >> >> Unfortunately, on its own, this would cause failures in >> gcc.target/aarch64/load_v2vec_lanes_1.c and >> gcc.target/aarch64/store_v2vec_lanes.c , which expect to see many more >> vec_merge and vec_duplicate for their simplifications to apply. To fix this, >> add a special case to the AArch64 code if we are loading from two memory >> addresses, and use the load_pair_lanes patterns directly. >> >> We also need a new pattern in simplify-rtx.c:simplify_ternary_operation , to >> catch: >> >> (vec_merge:OUTER >> (vec_duplicate:OUTER x:INNER) >> (subreg:OUTER y:INNER 0) >> (const_int N)) >> >> And simplify it to: >> >> (vec_concat:OUTER x:INNER y:INNER) or (vec_concat y x) >> >> This is similar to the existing patterns which are tested in this function, >> without requiring the second operand to also be a vec_duplicate. >> >> Bootstrapped and tested on aarch64-none-linux-gnu and tested on >> aarch64-none-elf. >> >> Note that this requires >> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00614.html >> if we don't want to ICE creating broken vector zero extends. >> >> Are the non-AArch64 parts OK? >> >> Thanks, >> James >> >> --- >> 2017-12-11 James Greenhalgh >> >> * config/aarch64/aarch64.c (aarch64_expand_vector_init): Modify code >> generation for cases where splatting a value is not useful. >> * simplify-rtx.c (simplify_ternary_operation): Simplify vec_merge >> across a vec_duplicate and a paradoxical subreg forming a vector >> mode to a vec_concat. >> >> 2017-12-11 James Greenhalgh >> >> * gcc.target/aarch64/vect-slp-dup.c: New. >> >> >> 0001-patch-AArch64-Do-not-perform-a-vector-splat-for-vect.patch >> >> > >> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c >> index 806c309..ed16f70 100644 >> --- a/gcc/simplify-rtx.c >> +++ b/gcc/simplify-rtx.c >> @@ -5785,6 +5785,36 @@ simplify_ternary_operation (enum rtx_code code, >> machine_mode mode, >> return simplify_gen_binary (VEC_CONCAT, mode, newop0, newop1); >> } >> >> + /* Replace: >> + >> + (vec_merge:outer (vec_duplicate:outer x:inner) >> +(subreg:outer y:inner 0) >> +(const_int N)) >> + >> + with (vec_concat:outer x:inner y:inner) if N == 1, >> + or (vec_concat:outer y:inner x:inner) if N == 2. >> + >> + Implicitly, this means we have a paradoxical subreg, but such >> + a check is cheap, so make it anyway. > I'm going to assume that N == 1 and N == 3 are handled elsewhere and do > not show up here in practice. > > > So is it advisable to handle the case where the VEC_DUPLICATE and SUBREG > show up in the opposite order? Or is there some canonicalization that > prevents that? > > simplify-rtx bits are OK as-is if we're certain we're not going to get > the alternate ordering of the VEC_MERGE operands. ALso OK if you either > generalize this chunk of code or duplicate & twiddle it to handle the > alternate order. > > I didn't look at the aarch64 specific bits. > Hi James, Your patch (r255946) introduced regressions on aarch64_be: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255946/report-build-info.html I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83663 to track this. Thanks, Christophe > jeff >
Re: [Patch, fortran] PR83076 - [8 Regression] ICE in gfc_deallocate_scalar_with_status, at fortran/trans.c:1598
Many thanks, Damian. I will commit soonish; probably tomorrow. Paul On 3 January 2018 at 00:22, Damian Rousonwrote: > I have now confirmed that the patch works the same for the 7 branch: it > doesn’t break any previously passing tests. > > Damian > > On January 1, 2018 at 9:44:59 AM, Paul Richard Thomas > (paul.richard.tho...@gmail.com) wrote: > > Committed to trunk as revision 256065. > > Damian, it would be good if you would confirm that there are no issues > with applying the patch to 7-branch. > > Thanks for all the help. > > Paul > > On 28 December 2017 at 19:58, Damian Rouson > wrote: >> I applied the patch the trunk and confirmed that it doesn’t break any >> previously >> passing OpenCoarrays tests. Is that sufficient or should I also try >> applying the >> patch to the 7 branch? >> >> Damian >> >> > > > > -- > "If you can't explain it simply, you don't understand it well enough" > - Albert Einstein -- "If you can't explain it simply, you don't understand it well enough" - Albert Einstein
[committed] Update copyright years
Hi! Happy New Year to everyone! 2018-01-03 Jakub Jelinekgcc/ * gcc.c (process_command): Update copyright notice dates. * gcov-dump.c (print_version): Ditto. * gcov.c (print_version): Ditto. * gcov-tool.c (print_version): Ditto. * gengtype.c (create_file): Ditto. * doc/cpp.texi: Bump @copying's copyright year. * doc/cppinternals.texi: Ditto. * doc/gcc.texi: Ditto. * doc/gccint.texi: Ditto. * doc/gcov.texi: Ditto. * doc/install.texi: Ditto. * doc/invoke.texi: Ditto. gcc/fortran/ * gfortranspec.c (lang_specific_driver): Update copyright notice dates. * gfc-internals.texi: Bump @copying's copyright year. * gfortran.texi: Ditto. * intrinsic.texi: Ditto. * invoke.texi: Ditto. gcc/ada/ * gnat_ugn.texi: Bump @copying's copyright year. * gnat_rm.texi: Likewise. gcc/go/ * gccgo.texi: Bump @copyrights-go year. libitm/ * libitm.texi: Bump @copying's copyright year. libgomp/ * libgomp.texi: Bump @copying's copyright year. libquadmath/ * libquadmath.texi: Bump @copying's copyright year. --- libitm/libitm.texi (revision 243991) +++ libitm/libitm.texi (revision 243992) @@ -7,7 +7,7 @@ @copying -Copyright @copyright{} 2011-2017 Free Software Foundation, Inc. +Copyright @copyright{} 2011-2018 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or --- libgomp/libgomp.texi(revision 243991) +++ libgomp/libgomp.texi(revision 243992) @@ -7,7 +7,7 @@ @copying -Copyright @copyright{} 2006-2017 Free Software Foundation, Inc. +Copyright @copyright{} 2006-2018 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or --- libquadmath/libquadmath.texi(revision 243991) +++ libquadmath/libquadmath.texi(revision 243992) @@ -6,7 +6,7 @@ @c %**end of header @copying -Copyright @copyright{} 2010-2017 Free Software Foundation, Inc. +Copyright @copyright{} 2010-2018 Free Software Foundation, Inc. @quotation Permission is granted to copy, distribute and/or modify this document --- gcc/doc/cpp.texi(revision 243991) +++ gcc/doc/cpp.texi(revision 243992) @@ -10,7 +10,7 @@ @copying @c man begin COPYRIGHT -Copyright @copyright{} 1987-2017 Free Software Foundation, Inc. +Copyright @copyright{} 1987-2018 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or --- gcc/doc/gcc.texi(revision 243991) +++ gcc/doc/gcc.texi(revision 243992) @@ -40,7 +40,7 @@ @c %**end of header @copying -Copyright @copyright{} 1988-2017 Free Software Foundation, Inc. +Copyright @copyright{} 1988-2018 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or --- gcc/doc/cppinternals.texi (revision 243991) +++ gcc/doc/cppinternals.texi (revision 243992) @@ -18,7 +18,7 @@ @ifinfo This file documents the internals of the GNU C Preprocessor. -Copyright (C) 2000-2017 Free Software Foundation, Inc. +Copyright (C) 2000-2018 Free Software Foundation, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice @@ -47,7 +47,7 @@ into another language, under the above c @page @vskip 0pt plus 1filll @c man begin COPYRIGHT -Copyright @copyright{} 2000-2017 Free Software Foundation, Inc. +Copyright @copyright{} 2000-2018 Free Software Foundation, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice --- gcc/doc/gccint.texi (revision 243991) +++ gcc/doc/gccint.texi (revision 243992) @@ -26,7 +26,7 @@ @c %**end of header @copying -Copyright @copyright{} 1988-2017 Free Software Foundation, Inc. +Copyright @copyright{} 1988-2018 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or --- gcc/doc/invoke.texi (revision 243991) +++ gcc/doc/invoke.texi (revision 243992) @@ -8,7 +8,7 @@ @c man end @c man begin COPYRIGHT -Copyright @copyright{} 1988-2017 Free Software Foundation, Inc. +Copyright @copyright{} 1988-2018 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or --- gcc/doc/gcov.texi (revision 243991) +++ gcc/doc/gcov.texi (revision 243992) @@ -4,7 +4,7 @@ @ignore @c man begin COPYRIGHT -Copyright
Re: PR83648
On Wed, Jan 03, 2018 at 10:05:30AM +0100, Richard Biener wrote: > >One concern I have is that with the patch, malloc_candidate_p will > >return true if all the args to PHI are NULL: > >retval = PHI<0, 0> > >return retval > > > >However I expect that PHI with all 0 args would be constant folded to > >0 earlier, so this case shouldn't occur in practice ? > > You may not rely on folding for correctness. Yeah. Will the patch handle (I mean punt on) also unfolded if (n) ? 0 : __builtin_malloc (n); and similar? Jakub
Re: PR83648
On January 3, 2018 7:03:26 AM GMT+01:00, Prathamesh Kulkarniwrote: >Hi, >malloc_candidate_p() in ipa-pure-const misses detecting that a >function is malloc-like if the return value is result of PHI and one >of the arguments of PHI is 0. >For example: > >void g(unsigned n) >{ > return (n) ? __builtin_malloc (n) : 0; >} > >The reason is that the following check: >if (TREE_CODE (arg) != SSA_NAME) > DUMP_AND_RETURN ("phi arg is not SSA_NAME.") > >fails for arg with constant value 0 and malloc_candidate_p returns >false. >The patch simply skips the arg if it equals null_pointer_node. >Does it look OK ? Please use integer_zerop instead of a literal compare. I'm not sure how to handle targets where address zero is valid (flag_no_delete_null_pointer_chekcs) >One concern I have is that with the patch, malloc_candidate_p will >return true if all the args to PHI are NULL: >retval = PHI<0, 0> >return retval > >However I expect that PHI with all 0 args would be constant folded to >0 earlier, so this case shouldn't occur in practice ? You may not rely on folding for correctness. >Bootstrapped+tested on x86_64-unknown-linux-gnu. >Cross-testing on arm*-*-* and aarch64*-*-* in progress. > >Thanks, >Prathamesh
[committed] [100.2/nnn] poly_int: vector_builder element count
This patch changes the number of elements in a vector being built by a vector_builder from unsigned int to poly_uint64. The case in which it isn't a constant is the one that motivated adding the vector encoding in the first place. Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. Also tested by comparing the before and after assembly output for at least one target per CPU directory. Committed as pre-approved by Jeff here: https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01409.html . Thanks, Richard 2018-01-03 Richard Sandifordgcc/ * vector-builder.h (vector_builder::m_full_nelts): Change from unsigned int to poly_uint64. (vector_builder::full_nelts): Update prototype accordingly. (vector_builder::new_vector): Likewise. (vector_builder::encoded_full_vector_p): Handle polynomial full_nelts. (vector_builder::operator ==): Likewise. (vector_builder::finalize): Likewise. * int-vector-builder.h (int_vector_builder::int_vector_builder): Take the number of elements as a poly_uint64 rather than an unsigned int. * vec-perm-indices.h (vec_perm_indices::m_nelts_per_input): Change from unsigned int to poly_uint64. (vec_perm_indices::vec_perm_indices): Update prototype accordingly. (vec_perm_indices::new_vector): Likewise. (vec_perm_indices::length): Likewise. (vec_perm_indices::nelts_per_input): Likewise. (vec_perm_indices::input_nelts): Likewise. * vec-perm-indices.c (vec_perm_indices::new_vector): Take the number of elements per input as a poly_uint64 rather than an unsigned int. Use the original encoding for variable-length vectors, rather than clamping each individual element. For the second and subsequent elements in each pattern, clamp the step and base before clamping their sum. (vec_perm_indices::series_p): Handle polynomial element counts. (vec_perm_indices::all_in_range_p): Likewise. (vec_perm_indices_to_tree): Likewise. (vec_perm_indices_to_rtx): Likewise. * tree-vect-stmts.c (vect_gen_perm_mask_any): Likewise. * tree-vector-builder.c (tree_vector_builder::new_unary_operation) (tree_vector_builder::new_binary_operation): Handle polynomial element counts. Return false if we need to know the number of elements at compile time. * fold-const.c (fold_vec_perm): Punt if the number of elements isn't known at compile time. Index: gcc/vector-builder.h === --- gcc/vector-builder.h2018-01-03 08:46:30.347663443 + +++ gcc/vector-builder.h2018-01-03 08:54:24.909508898 + @@ -90,7 +90,7 @@ #define GCC_VECTOR_BUILDER_H public: vector_builder (); - unsigned int full_nelts () const { return m_full_nelts; } + poly_uint64 full_nelts () const { return m_full_nelts; } unsigned int npatterns () const { return m_npatterns; } unsigned int nelts_per_pattern () const { return m_nelts_per_pattern; } unsigned int encoded_nelts () const; @@ -103,7 +103,7 @@ #define GCC_VECTOR_BUILDER_H void finalize (); protected: - void new_vector (unsigned int, unsigned int, unsigned int); + void new_vector (poly_uint64, unsigned int, unsigned int); void reshape (unsigned int, unsigned int); bool repeating_sequence_p (unsigned int, unsigned int, unsigned int); bool stepped_sequence_p (unsigned int, unsigned int, unsigned int); @@ -115,7 +115,7 @@ #define GCC_VECTOR_BUILDER_H Derived *derived () { return static_cast (this); } const Derived *derived () const; - unsigned int m_full_nelts; + poly_uint64 m_full_nelts; unsigned int m_npatterns; unsigned int m_nelts_per_pattern; }; @@ -152,7 +152,7 @@ vector_builder ::encoded_nelt inline bool vector_builder ::encoded_full_vector_p () const { - return m_npatterns * m_nelts_per_pattern == m_full_nelts; + return known_eq (m_npatterns * m_nelts_per_pattern, m_full_nelts); } /* Start building a vector that has FULL_NELTS elements. Initially @@ -160,7 +160,7 @@ vector_builder ::encoded_full template void -vector_builder ::new_vector (unsigned int full_nelts, +vector_builder ::new_vector (poly_uint64 full_nelts, unsigned int npatterns, unsigned int nelts_per_pattern) { @@ -178,7 +178,7 @@ vector_builder ::new_vector ( bool vector_builder ::operator == (const Derived ) const { - if (m_full_nelts != other.m_full_nelts + if (maybe_ne (m_full_nelts, other.m_full_nelts) || m_npatterns != other.m_npatterns || m_nelts_per_pattern != other.m_nelts_per_pattern) return false; @@ -356,14 +356,16 @@ vector_builder ::finalize () { /* The encoding requires
[committed] [100.1/nnn] poly_int: vec_perm_indices element type
This patch changes the vec_perm_indices element type from HOST_WIDE_INT to poly_int64, so that it can represent indices into a variable-length vector. Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. Also tested by comparing the before and after assembly output for at least one target per CPU directory. Committed as pre-approved by Jeff here: https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01409.html . Thanks, Richard 2018-01-03 Richard Sandifordgcc/ * vec-perm-indices.h (vec_perm_builder): Change element type from HOST_WIDE_INT to poly_int64. (vec_perm_indices::element_type): Update accordingly. (vec_perm_indices::clamp): Handle polynomial element_types. * vec-perm-indices.c (vec_perm_indices::series_p): Likewise. (vec_perm_indices::all_in_range_p): Likewise. (tree_to_vec_perm_builder): Check for poly_int64 trees rather than shwi trees. * vector-builder.h (vector_builder::stepped_sequence_p): Handle polynomial vec_perm_indices element types. * int-vector-builder.h (int_vector_builder::equal_p): Likewise. * fold-const.c (fold_vec_perm): Likewise. * optabs.c (shift_amt_for_vec_perm_mask): Likewise. * tree-vect-generic.c (lower_vec_perm): Likewise. * tree-vect-slp.c (vect_transform_slp_perm_load): Likewise. * config/aarch64/aarch64.c (aarch64_evpc_tbl): Cast d->perm element type to HOST_WIDE_INT. Index: gcc/vec-perm-indices.h === --- gcc/vec-perm-indices.h 2018-01-02 18:27:07.346639775 + +++ gcc/vec-perm-indices.h 2018-01-03 08:46:30.347663443 + @@ -25,7 +25,7 @@ #define GCC_VEC_PERN_INDICES_H 1 /* A vector_builder for building constant permutation vectors. The elements do not need to be clamped to a particular range of input elements. */ -typedef int_vector_builder vec_perm_builder; +typedef int_vector_builder vec_perm_builder; /* This class represents a constant permutation vector, such as that used as the final operand to a VEC_PERM_EXPR. @@ -49,7 +49,7 @@ typedef int_vector_builder= size) +if (!known_in_range_p (m_encoding[i], start, size)) return false; /* For stepped encodings, check the full range of the series. */ @@ -174,8 +174,11 @@ vec_perm_indices::all_in_range_p (elemen wide enough for overflow not to be a problem. */ element_type headroom_down = base1 - start; element_type headroom_up = size - headroom_down - 1; - if (headroom_up < step * step_nelts - && headroom_down < (limit - step) * step_nelts) + HOST_WIDE_INT diff; + if ((!step.is_constant () + || maybe_lt (headroom_up, diff * step_nelts)) + && (!(limit - step).is_constant () + || maybe_lt (headroom_down, diff * step_nelts))) return false; } } @@ -191,14 +194,14 @@ tree_to_vec_perm_builder (vec_perm_build { unsigned int encoded_nelts = vector_cst_encoded_nelts (cst); for (unsigned int i = 0; i < encoded_nelts; ++i) -if (!tree_fits_shwi_p (VECTOR_CST_ENCODED_ELT (cst, i))) +if (!tree_fits_poly_int64_p (VECTOR_CST_ENCODED_ELT (cst, i))) return false; builder->new_vector (TYPE_VECTOR_SUBPARTS (TREE_TYPE (cst)), VECTOR_CST_NPATTERNS (cst), VECTOR_CST_NELTS_PER_PATTERN (cst)); for (unsigned int i = 0; i < encoded_nelts; ++i) -builder->quick_push (tree_to_shwi (VECTOR_CST_ENCODED_ELT (cst, i))); +builder->quick_push (tree_to_poly_int64 (VECTOR_CST_ENCODED_ELT (cst, i))); return true; } Index: gcc/vector-builder.h === --- gcc/vector-builder.h2018-01-02 18:27:07.346639775 + +++ gcc/vector-builder.h2018-01-03 08:46:30.347663443 + @@ -284,7 +284,8 @@ vector_builder ::stepped_sequ || !derived ()->integral_p (elt3)) return false; - if (derived ()->step (elt1, elt2) != derived ()->step (elt2, elt3)) + if (maybe_ne (derived ()->step (elt1, elt2), + derived ()->step (elt2, elt3))) return false; if (!derived ()->can_elide_p (elt3)) Index: gcc/int-vector-builder.h === --- gcc/int-vector-builder.h2018-01-02 18:26:37.619823164 + +++ gcc/int-vector-builder.h2018-01-03 08:46:30.345663760 + @@ -66,7 +66,7 @@ int_vector_builder::int_vector_builde inline bool int_vector_builder::equal_p (T elt1, T elt2) const { - return elt1 == elt2; + return known_eq (elt1, elt2); } /* Return the value of element ELT2 minus the value of element ELT1. */ Index: gcc/fold-const.c === --- gcc/fold-const.c2018-01-03 07:17:07.401983352
Re: [PATCH] [PR target/83641] Fix incorrect CFI for stack clash protected noreturn function on x86/x86_64
On 01/02/2018 04:22 PM, Jeff Law wrote: > On 01/02/2018 03:05 PM, Florian Weimer wrote: >> On 01/02/2018 09:02 PM, Jeff Law wrote: >>> * config/i386/i386.c (ix86_adjut_stack_and_probe_stack_clash): Do not >> >> Typo: “adjut”. >> >>> explicitly probe *sp in a noreturn function if there were any callee >>> register saves. >> >> I recompiled glibc with this patch, and I can confirm it fixes the new >> glibc test: >> >> https://sourceware.org/ml/libc-alpha/2017-12/msg00987.html >> >> However, I would appreciate if it were possible to avoid emitting the >> .cfi_offset/.cfi_register annotations and only record the change of >> frame address. The other CFI notes aren't needed, and it would avoid >> reintroducing this bug if the way the prologue is constructed changes >> and the condition for emitting the probe is not completely correct anymore. > I'm not aware of a way to do that. I'm not even sure having the ability > to tell the CFI machinery to avoid that stuff is a good idea from a > design/implementation standpoint. So Jakub pointed out in IRC we can use a REG_FRAME_RELATED_EXPR note to control more precisely what cfis are emitted. But that introduces its own set of issues. I suspect we're more likely to muck up generating the note explicitly than we are to muck up the condition controlling whether or not we need the explicit *sp probe. The way we run into problems is if the CFI notes do not reflect reality -- for example, indicating a register is saved at a location where it isn't actually saved, indicating a register was restored to its entry value when that's not true throughout the function, or mucking up the CFA offset. The way to get into those first two states is if we have a redundant push/pop pair and its associated notes for the *sp explicit probe, particularly for %esi on x86 or %rax on x86_64. An additional comment seems warranted though. And testing that we don't have a cfi restore in the regression test also seems warranted. Consider those added :-) I'll come up with the exact comment text in the AM. While avoiding the cfi notes seems useful, I doubt it's really going to improve things in practice. In theory if someone completely rewrote the prologue code they could potentially introduce all kinds of problems, but I don't see how your suggested change would really help there either. Comments in the code as well as tests verifying the output seem to be the way to go here as well. Jeff