Re: [PATCH] enhance buffer overflow warnings (and c/53562)
Attached is an updated patch that works around the problem with the definition of the NOTE_DATA macro discussed below. I've raised bug 78174 for it and temporarily worked around it in the patch. I'll see if I can come up with a patch to fix the macro the "right way" but would prefer to do that separately. The NOTE_DATA macro is implemented in terms of the RTL_CHECK1 macro that will need to change and that macro is used in many others, so I would rather not mess around with those as part of this patch. I tested the updated patch with today's top of trunk on x86_64. The reason why I missed the problem in the previous patch is because I accidentally tested an earlier version of it. In following up on bug 78174 it occurred to me that the warning option added by the patch for the non-checking functions could perhaps be enhanced to specify the object size checking type (right now it hardcodes type 1 for all non-checking functions). That would make it possible to request looser checking for some functions (e.g., memset, for compatibility with Glibc's setting of zero even at _FORTIFY_SOURCE=2), and stricter checking for others (e.g., strcpy where Glibc uses type-1 with _FORTIFY_SOURCE=2). If this is something people would think would be worthwhile please let me know. Martin On 10/31/2016 10:44 AM, Martin Sebor wrote: On 10/31/2016 06:39 AM, Tobias Burnus wrote: Martin Sebor wrote: Attached is an updated patch that adds checks for excessive sizes and bounds (those in excess of SIZE_MAX / 2), and also enables the same checking for strcat and strncat). This version also fixes an issue with the interpretation of anti-ranges in the first patch. The improvements exposed two bugs in the regression tests. If I apply this patch to my local trunk - and try to bootstrap GCC, bootstrapping fails (on x86-64_gnu-linux) as following. I have not tried to figure out whether the warning (-Werror) makes sense or not. ../../gcc/emit-rtl.c: In function ‘rtx_note* make_note_raw(insn_note)’: ../../gcc/emit-rtl.c:3933:59: error: void* memset(void*, int, size_t) writing 8 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow] memset (_DATA (note), 0, sizeof (NOTE_DATA (note))); where NOTE_DATA is defined in rtl.h as /* Opaque data. */ #define NOTE_DATA(INSN) RTL_CHECKC1 (INSN, 3, NOTE) Thanks for trying it out! The patch bootstrapped and passed regression tests for me yesterday, also on x86_64, and today on powepc64le, but just now I reproduced the error with the top of today's trunk on x86_64. I think the error is justified because the call to memset in the make_note_raw function references a fourth element (rtx_note::rtx_insn::rtx_def::u.fld[3]) of what is just a single element array. After macro expansion, the memset call itself: memset (&((note)->u.fld[3]), 0, sizeof (((note)->u.fld[3]))); is within the bounds of the object pointed to by note but the index 3 is out of bounds for note->u.fld and so undefined. An unpatched GCC issues a similar error when the call to memset is replaced with __builtin___memset_chk like so: __builtin___memset_chk (&((note)->u.fld[3]), 0, sizeof (((note)->u.fld[3])), __builtin_object_size (&((note)->u.fld[3]), 1)); /src/gcc/53562/gcc/emit-rtl.c: In function ‘rtx_note* make_note_raw(insn_note)’: /src/gcc/53562/gcc/emit-rtl.c:3941:53: warning: call to void* __builtin___memset_chk(void*, int, long unsigned int, long unsigned int) will always overflow destination buffer The code can be made valid and the warning avoided by deriving the same address not from an invalid pointer/subscript but rather from a pointer to the beginning of the union itself and adding the offset of the fourth element, like so: char *p = (char*) &(note)->u.fld[0]; p += sizeof (((note)->u.fld[0])) * 3; memset (p, 0, sizeof *p); Unfortunately, because the invalid subscript is the result of the expansion of the RTL_CHECK1() macro fixing this isn't as straightforward as that. What really should be fixed is the macro itself. Until then, it can be worked (hacked would be a better term) around it by storing the result of _DATA (note) expression in a volatile pointer, like so: diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 8afcfbe..6dd9439 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -3930,7 +3930,8 @@ make_note_raw (enum insn_note subtype) INSN_UID (note) = cur_insn_uid++; NOTE_KIND (note) = subtype; BLOCK_FOR_INSN (note) = NULL; - memset (_DATA (note), 0, sizeof (NOTE_DATA (note))); + void* volatile p = _DATA (note); + memset (p, 0, sizeof (NOTE_DATA (note))); return note; } Martin PR c/53562 - Add -Werror= support for -D_FORTIFY_SOURCE / __builtin___memcpy_chk PR middle-end/78149 - missing warning on strncpy buffer overflow due to an excessive bound PR middle-end/78138 - missing warnings on buffer overflow with non-constant source length gcc/c-family/ChangeLog: 2016-10-30 Martin Sebor
Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add
On Oct 31, 2016, at 7:19 PM, Michael Meissnerwrote: > > One question is how many of the billions and billions (ok, 1,345) of the > rs6000 > built-ins would be improved by expanding them in gimple time rather than rtl? > Hundreds and hundreds of them. All of the basic operators, many of the memory operations, all of the dozens of flavors of things that are just permutes at heart. The loads and stores alone are a huge deal that we've seen cause problems in customer code. Bill
Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add
> On Oct 31, 2016, at 7:09 PM, Jakub Jelinekwrote: > > On Mon, Oct 31, 2016 at 05:28:42PM -0500, Bill Schmidt wrote: >> The PowerPC back end loses performance on vector intrinsics, because >> currently >> all of them are treated as calls throughout the middle-end phases and only >> expanded when they reach RTL. Our version of altivec.h currently defines the >> public names of overloaded functions (like vec_add) to be #defines for hidden >> functions (like __builtin_vec_add), which are recognized in the parser as >> requiring special back-end support. Tables in rs6000-c.c handle dispatch of >> the overloaded functions to specific function calls appropriate to the >> argument >> types. > > This doesn't look very nice. If all you care is that the builtins like > __builtin_altivec_vaddubm etc. that __builtin_vec_add overloads into fold > into generic vector operations under certain conditions, just fold those > into whatever you want in targetm.gimple_fold_builtin (gsi). > > Jakub > Ah, thanks, Jakub. I wasn't aware of that hook, and that sounds like the best approach. I had found previously how difficult it can be to expand some of these things during the parser hook, but if we can expand them early in GIMPLE that is probably much easier. I will look into it. "This doesn't look very nice" wins the understatement of the year award... I was getting increasingly unhappy the further I got into it. Thanks, Bill
Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add
On Mon, Oct 31, 2016 at 06:49:20PM -0500, Bill Schmidt wrote: > > > > On Oct 31, 2016, at 5:28 PM, Bill Schmidt> > wrote: > > > > The other way would be to require a specific option on the command line > > to use the new dispatch mechanism. When the option is present, we would > > predefine a macro such as __PPC_FAST_VECTOR__, which would then gate the > > usage in altivec.h and overload.h. Use of #pragma GCC target to change > > the availability of Altivec, VMX, P8-vector, etc. would also be disallowed > > when the option is present. This has the advantage of always generating > > correct code, at the cost of requiring a special option before anyone > > can leverage the benefits of early vector expansion. That's unfortunate, > > but I suspect it's the best we can do. > > Though I suppose we could require the option to turn off the new dispatch > mechanism, and document the change in gcc7/changes.html. A little irritating > for people already using the pragma support, but I really expect this > wouldn't affect > many people at all. I suspect we may find out how many people are using #pragma GCC target and altivec vector instructions if we break their code :-( Even with attribute target instead of pragma, you need to give appropriate error messages if the user calls a built-in that the current machine doesn't support. IIRC, C++ does not support #pragma GCC target nor the target attribute. One question is how many of the billions and billions (ok, 1,345) of the rs6000 built-ins would be improved by expanding them in gimple time rather than rtl? -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add
On Mon, Oct 31, 2016 at 05:28:42PM -0500, Bill Schmidt wrote: > The PowerPC back end loses performance on vector intrinsics, because currently > all of them are treated as calls throughout the middle-end phases and only > expanded when they reach RTL. Our version of altivec.h currently defines the > public names of overloaded functions (like vec_add) to be #defines for hidden > functions (like __builtin_vec_add), which are recognized in the parser as > requiring special back-end support. Tables in rs6000-c.c handle dispatch of > the overloaded functions to specific function calls appropriate to the > argument > types. This doesn't look very nice. If all you care is that the builtins like __builtin_altivec_vaddubm etc. that __builtin_vec_add overloads into fold into generic vector operations under certain conditions, just fold those into whatever you want in targetm.gimple_fold_builtin (gsi). Jakub
Re: [PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add
> > On Oct 31, 2016, at 5:28 PM, Bill Schmidtwrote: > > The other way would be to require a specific option on the command line > to use the new dispatch mechanism. When the option is present, we would > predefine a macro such as __PPC_FAST_VECTOR__, which would then gate the > usage in altivec.h and overload.h. Use of #pragma GCC target to change > the availability of Altivec, VMX, P8-vector, etc. would also be disallowed > when the option is present. This has the advantage of always generating > correct code, at the cost of requiring a special option before anyone > can leverage the benefits of early vector expansion. That's unfortunate, > but I suspect it's the best we can do. Though I suppose we could require the option to turn off the new dispatch mechanism, and document the change in gcc7/changes.html. A little irritating for people already using the pragma support, but I really expect this wouldn't affect many people at all. -- Bill Bill Schmidt, Ph.D. GCC for Linux on Power Linux on Power Toolchain IBM Linux Technology Center wschm...@linux.vnet.ibm.com
Re: [PATCH], Optimize (double)vec_extract(n) on PowerPC
On Mon, Oct 31, 2016 at 06:04:47PM -0500, Segher Boessenkool wrote: > On Mon, Oct 31, 2016 at 05:25:56PM -0400, Michael Meissner wrote: > > +(define_insn "vsx_xvcvsxwdp_df" > > + [(set (match_operand:DF 0 "vsx_register_operand" "=ws") > > + (unspec:DF [(match_operand:V4SI 1 "vsx_register_operand" "wa")] > > + UNSPEC_VSX_CVSXWDP))] > > + "VECTOR_UNIT_VSX_P (V2DFmode)" > > + "xvcvsxwdp %x0,%x1" > > + [(set_attr "type" "vecdouble")]) > > I think the condition here is a pasto? It probably works, but it isn't > exactly clear what it means here. I will change it to TARGET_VSX, since it needs the XVCVSXWDP instruction to be defined. > > +;; Opimize f = () vec_extract (vi, ) > > Typo (optimize). Thanks. > > +;; where is a hardware supported binary floating point type > > +;; that is not double > > Full stop. Ok. > > Looks fine otherwise. Please apply with those things taking care of. > > Thanks, > > > Segher > -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH], Optimize (double)vec_extract(n) on PowerPC
On Mon, Oct 31, 2016 at 05:25:56PM -0400, Michael Meissner wrote: > +(define_insn "vsx_xvcvsxwdp_df" > + [(set (match_operand:DF 0 "vsx_register_operand" "=ws") > + (unspec:DF [(match_operand:V4SI 1 "vsx_register_operand" "wa")] > +UNSPEC_VSX_CVSXWDP))] > + "VECTOR_UNIT_VSX_P (V2DFmode)" > + "xvcvsxwdp %x0,%x1" > + [(set_attr "type" "vecdouble")]) I think the condition here is a pasto? It probably works, but it isn't exactly clear what it means here. > +;; Opimize f = () vec_extract (vi, ) Typo (optimize). > +;; where is a hardware supported binary floating point type > +;; that is not double Full stop. Looks fine otherwise. Please apply with those things taking care of. Thanks, Segher
[PATCH, RFC, rs6000] Add overloaded built-in function support to altivec.h, and re-implement vec_add
Hi, The PowerPC back end loses performance on vector intrinsics, because currently all of them are treated as calls throughout the middle-end phases and only expanded when they reach RTL. Our version of altivec.h currently defines the public names of overloaded functions (like vec_add) to be #defines for hidden functions (like __builtin_vec_add), which are recognized in the parser as requiring special back-end support. Tables in rs6000-c.c handle dispatch of the overloaded functions to specific function calls appropriate to the argument types. The Clang version of altivec.h, by contrast, creates static inlines for each overloaded function variant, relying on a special __attribute__((overloadable)) construct to do the dispatch in the parser itself. This allows vec_add to be immediately translated into type-specific addition during parsing, allowing the expressions to be subject to all subsequent optimization. We have opened a PR suggesting that this attribute be supported in GCC as well (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71199), but so far there hasn't been any success in that regard. While waiting/hoping for the attribute to be implemented, though, we can use existing mechanisms to create a poor man's version of overloading dispatch. This patch is a proof of concept for how this can be done, and provides support for early expansion of the overloaded vec_add intrinsic. If we get this working, then we can gradually add more intrinsics over time. The dispatch mechanism is provided in a new header file, overload.h, which is included in altivec.h. This is done because the guts of the dispatch mechanism are pretty ugly to look at. Overloading is done with a chain of calls to __builtin_choose_expr and __builtin_types_compatible_p. Currently I envision providing a separate dispatch macro for each combination of the number of arguments and the number of variants to be distinguished. I also provide a separate "decl" macro for each number of arguments, used to create the function decls for each static inline function. The add_vec intrinsic takes two input arguments and has 28 variants, so it requires the definition of OVERLOAD_2ARG_28VAR and OVERLOAD_2ARG_DECL in overload.h. These macros are then instantiated in altivec.h. The dispatch macro for an overloaded intrinsic is instantiated once, and the decl macro is instantiated once for each variant, along with the associated inline function body. The dispatch macro may need to vary depending on the supported processor features. In the vec_add example, we have some variants that support the "vector double" and "vector long long" data types. These only exist when VSX code generation is supported, so a dispatch table conditioned on __VSX__ includes these, while a separate one without VSX support does not. Similarly, __POWER8_VECTOR__ must be defined if we are to support "vector signed/unsigned __int128". Because we use a numbering scheme that needs to be kept consistent, this requires three versions of the dispatch table, where the more restrictive versions replace the unimplemented entries with redundant entries. Note that if and when we get an overloadable attribute in GCC, the stuff in overload.h will become obsolete, we will remove the dispatch instantiations, and we will replace the decl instantiations with plain decls with the overloadable attribute applied. There are several complications on top of the basic design: * When compiling for C++, the dispatch mechanism is not available, and indeed is not necessary. Thus for C++ we skip the dispatch mechanism, and change the definition of OVERLOAD_2ARG_DECL to use standard function overloading. * Compiling with -ansi or -std=c11 or the like means the dispatch mechanism is unavailable even for C, since GNU extensions are disallowed. Regret- tably, this means that we can't get rid of the existing late-expansion methods altogether. I don't see any way to avoid this. Note that this would be the case even if we had __attribute__ ((overloadable)), since that would also be a GNU extension. Despite the mess, I think that the performance improvements for non-strict-ANSI code make the dual maintenance worthwhile. * "#pragma GCC target" is going to cause a lot of trouble. With the patch in its present state, we fail gcc.target/powerpc/ppc-target-4.c, which tests the use of "vsx", "altivec,no-vsx", and "no-altivec" target options, and happens to use vec_add (float, float) as a testbed. The test fails because altivec.h is #included after the #pragma GCC target("vsx"), which allows the interfaces involving vector long long and vector double to be produced. However, when the options are changed to "altivec,no-vsx", the subsequent invocation of vec_add expands to a dispatch sequence including vector long long, leading to a compile-time error. I can only think of two ways to deal with this, neither of which is attractive. The
[PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834)
Hi! Some automatic VAR_DECLs don't get DECL_RTL set - e.g. if its SSA_NAMEs expand to multiple rtls, then there is not a single one that can be used. Using DECL_RTL on such VAR_DECLs ICEs. I've tried to just return 0 in nonoverlapping_memrefs_p if either DECL_RTL_SET_P (expr{x,y}) wasn't true, but that triggered way too often during bootstrap/regtest (3800+ times). So the following patch narrows it down more and triggers only on the testcase below. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-10-31 Jakub JelinekPR target/77834 * alias.c (nonoverlapping_memrefs_p): Return 0 if exprx or expry doesn't have rtl set and it can't be safely created. * gcc.dg/pr77834.c: New test. --- gcc/alias.c.jj 2016-10-21 17:06:27.0 +0200 +++ gcc/alias.c 2016-10-31 11:38:29.448031590 +0100 @@ -2755,6 +2755,27 @@ nonoverlapping_memrefs_p (const_rtx x, c || TREE_CODE (expry) == CONST_DECL) return 1; + /* Don't try to create RTL for decls that intentionally don't have + DECL_RTL set (e.g. marked as living in multiple places). */ + if (!DECL_RTL_SET_P (exprx)) +{ + if (TREE_CODE (exprx) == PARM_DECL + || TREE_CODE (exprx) == RESULT_DECL + || (VAR_P (exprx) + && !TREE_STATIC (exprx) + && !DECL_EXTERNAL (exprx))) + return 0; +} + if (!DECL_RTL_SET_P (expry)) +{ + if (TREE_CODE (expry) == PARM_DECL + || TREE_CODE (expry) == RESULT_DECL + || (VAR_P (expry) + && !TREE_STATIC (expry) + && !DECL_EXTERNAL (expry))) + return 0; +} + rtlx = DECL_RTL (exprx); rtly = DECL_RTL (expry); --- gcc/testsuite/gcc.dg/pr77834.c.jj 2016-10-31 11:41:46.290521464 +0100 +++ gcc/testsuite/gcc.dg/pr77834.c 2016-10-31 11:41:24.0 +0100 @@ -0,0 +1,18 @@ +/* PR target/77834 */ +/* { dg-do compile } */ +/* { dg-options "-O -ftree-pre -Wno-psabi" } */ +/* { dg-additional-options "-mstringop-strategy=libcall" { target i?86-*-* x86_64-*-* } } */ + +typedef int V __attribute__ ((vector_size (64))); + +V +foo (V u, V v, int w) +{ + do +{ + if (u[0]) v ^= u[w]; +} + while ((V) { 0, u[w] }[1]); + u = (V) { v[v[0]], u[u[0]] }; + return v + u; +} Jakub
[PATCH], Optimize (double)vec_extract(n) on PowerPC
This patch adds an optimization that I noticed towards the end of working on my previous patch to allow SImode into vector registers. This patch adds optimization when you are extracting an element using a 32-bit integer with constant element number, and then converting the value to double. While I was doing the patch, I noticed the SImode extract pattern could be optimized on power8 for better extract when SImode is allowed in vector registers. For example, consider the code: #include double vi_to_df (vector int v) { return (double)vec_extract (v, 0); } The current trunk compiler generates (on a Power8 little endian system): vspltw 0,2,3 mfvsrwz 9,32 mtvsrwa 1,9 fcfid 1,1 With these patches it generates: vspltw 2,2,3 xvcvsxwdp 1,34 I also added support for conversions to SFmode, TFmode, IFmode, and on ISA 3.0, KFmode using a DFmode temporary. I did a bootstrap build and there were no regressions with these patches. In addition, I built spec 2006 with/without these patches. In general, there was no change. The 416.gamess benchmark generated a few less xxlor's used as a move instruction due to the improvements in the SImode extraction and when 464.h264ref is auto vectorized, one of the extracts converted to floating point and generated the new code. However, neither change affected the performance of those two benchmarks. Can I check this into the trunk? [gcc] 2016-10-31 Michael Meissner* config/rs6000/vsx.md (VSX_EXTRACT_FL): New iterator for all binary floating point types supported by the hardware except for double. (vsx_xvcvsxwdp_df): Provide scalar result alternative to the vector instruction for optimizing extracting a SImode from a V4SImode vector and converting it to floating point. (vsx_xvcvuxwdp_df): Likewise. (vsx_extract_si): On ISA 3.0, allow extract target and temporary registers to be any VSX register. Move stores to the end of the constraints. (vsx_extract_si_float_df): New combiner pattern and splitter to optimize extracting a SImode from a V4SImode vector and converting it to a binary floating point type supported by the hardware. Use the vector converts instead of extracting the element, sign extending it, and then converting it to double. Other floating point types than double first convert to double, then the double is converted to that type. (vsx_extract_si_float_): Likewise. [gcc/testsuite] 2016-10-31 Michael Meissner * gcc.target/powerpc/vsx-extract-4.c: New test. * gcc.target/powerpc/vsx-extract-5.c: Likewise. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/vsx.md === --- gcc/config/rs6000/vsx.md (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 241631) +++ gcc/config/rs6000/vsx.md(.../gcc/config/rs6000) (working copy) @@ -288,6 +288,16 @@ (define_mode_attr VSX_EX [(V16QI "v") (V8HI "v") (V4SI "wa")]) +;; Mode iterator for binary floating types other than double to +;; optimize convert to that floating point type from an extract +;; of an integer type +(define_mode_iterator VSX_EXTRACT_FL [SF + (IF "FLOAT128_2REG_P (IFmode)") + (KF "TARGET_FLOAT128_HW") + (TF "FLOAT128_2REG_P (TFmode) + || (FLOAT128_IEEE_P (TFmode) + && TARGET_FLOAT128_HW)")]) + ;; Iterator for the 2 short vector types to do a splat from an integer (define_mode_iterator VSX_SPLAT_I [V16QI V8HI]) @@ -1907,6 +1917,7 @@ (define_insn "vsx_xvcvuxdsp" [(set_attr "type" "vecdouble")]) ;; Convert from 32-bit to 64-bit types +;; Provide both vector and scalar targets (define_insn "vsx_xvcvsxwdp" [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") (unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")] @@ -1915,6 +1926,14 @@ (define_insn "vsx_xvcvsxwdp" "xvcvsxwdp %x0,%x1" [(set_attr "type" "vecdouble")]) +(define_insn "vsx_xvcvsxwdp_df" + [(set (match_operand:DF 0 "vsx_register_operand" "=ws") + (unspec:DF [(match_operand:V4SI 1 "vsx_register_operand" "wa")] + UNSPEC_VSX_CVSXWDP))] + "VECTOR_UNIT_VSX_P (V2DFmode)" + "xvcvsxwdp %x0,%x1" + [(set_attr "type" "vecdouble")]) + (define_insn "vsx_xvcvuxwdp" [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa") (unspec:V2DF [(match_operand:V4SI 1 "vsx_register_operand" "wf,wa")] @@ -1923,6 +1942,14 @@
Re: [PATCH] DWARF5 .debug_rnglists support
On Mon, Oct 31, 2016 at 3:33 PM, Jakub Jelinekwrote: > On Mon, Oct 31, 2016 at 02:42:15PM -0400, Jason Merrill wrote: >> On 10/20/2016 02:52 PM, Jakub Jelinek wrote: >> >@@ -8476,7 +8498,16 @@ size_of_die (dw_die_ref die) >> > size += DWARF_OFFSET_SIZE; >> > break; >> > case dw_val_class_range_list: >> >- size += DWARF_OFFSET_SIZE; >> >+ if (dwarf_split_debug_info >> >+ && dwarf_version >= 5 >> >+ && a->dw_attr_val.val_entry != RELOCATED_OFFSET) >> >> This test, here and in value_format, should be factored out into a separate >> function with a comment explaining why you check dwarf_split_debug_info. > > Ok. Another option would be to call value_format in size_of_die > in this case and put the comment into value_format. That works, too. >> In general there's a lot of code duplication between the existing ranges >> support and the new rnglist support; even the new vector is a superset of >> the old one. Why duplicate the code rather than modify it? > > The point was to conserve memory for the -gdwarf-{2,3,4} case. > The old table needs just 4 bytes per entry, the new one 16 bytes per entry. > The code duplication because of that is mainly in > add_high_low_attributes - 24 lines duplicated. I was also thinking of the output_rnglists function. > At least while -gdwarf-4 is the default that looked to me like acceptable > cost, but if you disagree, I can surely try to just grow the original > table and use it for all versions. Please. I think if we're going in this direction we should just go ahead with it; if the memory consumption is going to be a problem it would be good to find that out so we can address it. Jason
[PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else.
The attached patch does a little change in combine.c:combine_simplify_rtx() to prevent a "simplification" where the rtl code gets more complex in reality. The complete description of the change can be found in the commit comment in the attached patch. The patch reduces the number of patterns in the s390 backend and slightly reduces the size of the compiled SPEC2006 code. (Code size or runtime only tested on s390x with -m64.) It is theoretically possible that this patch leads to somewhat worse code on some target if that only has a pattern for the formerly replaced rtl expression but not for the original one. The patch has passed the testsuite on s390, s390x biarch, x86_64 and Power biarch. -- (I'm not sure whether the const_int expression can appear in both operands or only as the second. If the latter is the case, the conditions can be simplified a bit.) What do you think about this patch? Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany gcc/ChangeLog * combine.c (if_then_else_cond): Suppress replacement of "(and (reg) (const_int bit))" with "if_then_else". >From bfa15721c760fa4cd9003050c3137cba3165139f Mon Sep 17 00:00:00 2001 From: Dominik VogtDate: Mon, 31 Oct 2016 09:00:31 +0100 Subject: [PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else. combine_simplify_rtx() tries to replace rtx expressions with just two possible values with an experession that uses if_then_else: (if_then_else (condition) (value1) (value2)) If the original expression is e.g. (and (reg) (const_int 2)) where the constant is the mask for a single bit, the replacement results in a more complex expression than before: (if_then_else (ne (zero_extract (reg) (1) (31))) (2) (0)) Similar replacements are done for (signextend (and ...)) (zeroextend (and ...)) Suppress the replacement this special case in if_then_else_cond(). --- gcc/combine.c | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/gcc/combine.c b/gcc/combine.c index b22a274..244669d 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -9103,9 +9103,25 @@ if_then_else_cond (rtx x, rtx *ptrue, rtx *pfalse) return x; } - /* Likewise for 0 or a single bit. */ + /* Likewise for 0 or a single bit. + If the operation is an AND (possibly wrapped in a SIGN_EXTEND or + ZERO_EXTEND) with either operand being just a constant single bit value, + do nothing since IF_THEN_ELSE is likely to increase the expression's + complexity. */ else if (HWI_COMPUTABLE_MODE_P (mode) - && pow2p_hwi (nz = nonzero_bits (x, mode))) + && pow2p_hwi (nz = nonzero_bits (x, mode)) + && ! (code == AND +&& ((CONST_INT_P (XEXP (x, 0)) + && UINTVAL (XEXP (x, 0)) == nz) +|| (CONST_INT_P (XEXP (x, 1)) +&& UINTVAL (XEXP (x, 1)) == nz))) + && ! ((code == SIGN_EXTEND || code == ZERO_EXTEND) +&& GET_CODE (XEXP (x, 0)) == AND +&& ((CONST_INT_P (XEXP (XEXP (x, 0), 0)) + && UINTVAL (XEXP (XEXP (x, 0), 0)) == nz) +|| (CONST_INT_P (XEXP (XEXP (x, 0), 1)) +&& UINTVAL (XEXP (XEXP (x, 0), 1)) == nz))) + ) { *ptrue = gen_int_mode (nz, mode), *pfalse = const0_rtx; return x; -- 2.3.0
Re: [gcc] Enable DW_OP_VAL_EXPRESSION support in dwarf module
On 10/21/2016 04:30 AM, Jiong Wang wrote: All DW_OP_* of the expression are grouped together inside the PARALLEL, and those operations which don't have RTL mapping are wrapped by UNSPEC. The parsing algorithm is simply something like: foreach elem inside PARALLEL if (UNSPEC) { dw_op_code = INTVAL (XVECEXP (elem, 0, 0)); oprnd1 = INTVAL (XVECEXP (elem, 0, 1)); oprnd2 = INTVAL (XVECEXP (elem, 0, 2)); } else call standard RTL parser. Any comments on the approach? If you're going to use UNSPEC, why not put the DWARF operator in the second operand? Jason
Re: [PATCH] DWARF5 .debug_info headers, .debug_types -> .debug_info DW_UT_type
On Mon, Oct 31, 2016 at 3:38 PM, Jakub Jelinekwrote: > On Mon, Oct 31, 2016 at 03:33:22PM -0400, Jason Merrill wrote: >> OK. > > This patch needs the incremental > http://gcc.gnu.org/ml/gcc-patches/2016-10/msg02440.html > patch (now successfully bootstrapped/regtested on x86_64-linux and > i686-linux). Is that one ok too? I meant to imply that by replying to that message. :) Yes, both are OK. Jason
Re: [PATCH] DWARF5 .debug_info headers, .debug_types -> .debug_info DW_UT_type
On Mon, Oct 31, 2016 at 03:33:22PM -0400, Jason Merrill wrote: > OK. This patch needs the incremental http://gcc.gnu.org/ml/gcc-patches/2016-10/msg02440.html patch (now successfully bootstrapped/regtested on x86_64-linux and i686-linux). Is that one ok too? I think it would be better to commit those two in a single commit. Jakub
Re: [PATCH] DWARF5 .debug_rnglists support
On Mon, Oct 31, 2016 at 02:42:15PM -0400, Jason Merrill wrote: > On 10/20/2016 02:52 PM, Jakub Jelinek wrote: > >@@ -8476,7 +8498,16 @@ size_of_die (dw_die_ref die) > > size += DWARF_OFFSET_SIZE; > > break; > > case dw_val_class_range_list: > >- size += DWARF_OFFSET_SIZE; > >+ if (dwarf_split_debug_info > >+ && dwarf_version >= 5 > >+ && a->dw_attr_val.val_entry != RELOCATED_OFFSET) > > This test, here and in value_format, should be factored out into a separate > function with a comment explaining why you check dwarf_split_debug_info. Ok. Another option would be to call value_format in size_of_die in this case and put the comment into value_format. > In general there's a lot of code duplication between the existing ranges > support and the new rnglist support; even the new vector is a superset of > the old one. Why duplicate the code rather than modify it? The point was to conserve memory for the -gdwarf-{2,3,4} case. The old table needs just 4 bytes per entry, the new one 16 bytes per entry. The code duplication because of that is mainly in add_high_low_attributes - 24 lines duplicated. At least while -gdwarf-4 is the default that looked to me like acceptable cost, but if you disagree, I can surely try to just grow the original table and use it for all versions. Jakub
Re: [PATCH] DWARF5 .debug_info headers, .debug_types -> .debug_info DW_UT_type
OK.
Re: [PATCH] DWARF5 - Emit DW_AT_rank and DW_TAG_generic_subrange for assumed-rank arrays
OK.
Re: [PATCH] DWARF5 .debug_line{,_str} support
On 10/31/2016 02:38 PM, Jakub Jelinek wrote: On Mon, Oct 31, 2016 at 11:25:26AM -0400, Jason Merrill wrote: On 10/19/2016 07:30 PM, Jakub Jelinek wrote: This patch adds support for DWARF5 .debug_line{,_str} section format, though only if !DWARF2_ASM_LINE_DEBUG_INFO, because otherwise .debug_line is emitted by the assembler. For that we'll need some coordination with gas, dunno if we want a new as directive for that, or as command line switch, or perhaps key DWARF5 .debug_line on the presence of .debug_line_str section (though the last one probably isn't a good idea, because -gsplit-dwarf doesn't use .debug_line_str). Could gas pay attention to the DWARF version in a unit header? Maybe. Guess it needs to be decided in binutils and then we adjust gcc if any changes are needed. + if (ndirs + idx_offset <= 256) + idx_form = DW_FORM_data1; + else if (ndirs + idx_offset <= 65536) + { + unsigned HOST_WIDE_INT sum = 1; + for (i = 0; i < numfiles; i++) + { + int file_idx = backmap[i]; + int dir_idx = dirs[files[file_idx].dir_idx].dir_idx; + sum += size_of_uleb128 (dir_idx); + } + if (sum >= HOST_WIDE_INT_UC (2) * (numfiles + 1)) + idx_form = DW_FORM_data2; + } Why can we choose DW_FORM_data1 based only on ndirs+idx_offset, but we need to look at the files table to choose DW_FORM_data2? This at least needs a comment. I'll add a comment. The reason is that DW_FORM_data1, if it can be used, is always more compact than (or as compact as) DW_FORM_udata. While the choice in between DW_FORM_data2 and DW_FORM_udata if we have 256 to 65535 directories needs to be decided by computing both the size it would take for all the uleb128s and compare that to the size it takes to encode all as data2. + dw2_asm_output_data (idx_form == DW_FORM_data1 ? 1 : 2, +0, NULL); It seems that we could use an output_data function that takes a DW_FORM as one of its arguments and does the appropriate thing. Had a brief look at this, but it seems hard without big changes in dwarf2asm.[ch] or lots of duplication. The problem is that dw2_asm_output_data* are varargs functions, so if we add some wrapper that picks what to call based on a enum dwarf_form argument, we'd need to convert all of them to take va_list instead, and add various further wrappers. Is this ok to leave this part out? + if (dwarf_version >= 5 && str_form == DW_FORM_line_strp) + { + node = find_AT_string_in_table (filename0, debug_line_str_hash); + set_indirect_string (node); + node->form = DW_FORM_line_strp; + dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label, +debug_line_str_section, +"File Entry: %#x: \"%s\"", 0, node->str); + } + else + dw2_asm_output_nstring (filename0, -1, "File Entry: %#x", 0); Please factor this out into a separate function. So like static void output_line_string (enum dwarf_form form, const char *str, const char *entry_kind, unsigned int idx) in the patch below? + if (!DWARF2_ASM_LINE_DEBUG_INFO + && dwarf_version >= 5 + && !(DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET + || (DEBUG_STR_SECTION_FLAGS & SECTION_MERGE) == 0 + /* FIXME: For now. */ + || dwarf_split_debug_info)) This should also be a separate function or macro, since it's used more than once. And please elaborate the FIXME comment. Like DWARF5_USE_DEBUG_LINE_STR macro in the patch below? For -gsplit-dwarf, the problem is I'm really not familiar with what tools are used post assembly for this (and likely the tools need adjustment for DWARF5 anyway), so what is in the patches is just partial support what appeared to be easily done by myself. For the rest I've been hoping Cary or somebody else could finish it up. OK. Jason
Re: [PATCH] DWARF5 .debug_rnglists support
On 10/20/2016 02:52 PM, Jakub Jelinek wrote: @@ -8476,7 +8498,16 @@ size_of_die (dw_die_ref die) size += DWARF_OFFSET_SIZE; break; case dw_val_class_range_list: - size += DWARF_OFFSET_SIZE; + if (dwarf_split_debug_info + && dwarf_version >= 5 + && a->dw_attr_val.val_entry != RELOCATED_OFFSET) This test, here and in value_format, should be factored out into a separate function with a comment explaining why you check dwarf_split_debug_info. In general there's a lot of code duplication between the existing ranges support and the new rnglist support; even the new vector is a superset of the old one. Why duplicate the code rather than modify it? Jason
Re: [PATCH] DWARF5 .debug_line{,_str} support
On Mon, Oct 31, 2016 at 11:25:26AM -0400, Jason Merrill wrote: > On 10/19/2016 07:30 PM, Jakub Jelinek wrote: > >This patch adds support for DWARF5 .debug_line{,_str} section format, > >though only if !DWARF2_ASM_LINE_DEBUG_INFO, because otherwise > >.debug_line is emitted by the assembler. For that we'll need some > >coordination with gas, dunno if we want a new as directive for that, > >or as command line switch, or perhaps key DWARF5 .debug_line on the > >presence of .debug_line_str section (though the last one probably isn't a > >good idea, because -gsplit-dwarf doesn't use .debug_line_str). > > Could gas pay attention to the DWARF version in a unit header? Maybe. Guess it needs to be decided in binutils and then we adjust gcc if any changes are needed. > >+ if (ndirs + idx_offset <= 256) > >+idx_form = DW_FORM_data1; > >+ else if (ndirs + idx_offset <= 65536) > >+{ > >+ unsigned HOST_WIDE_INT sum = 1; > >+ for (i = 0; i < numfiles; i++) > >+{ > >+ int file_idx = backmap[i]; > >+ int dir_idx = dirs[files[file_idx].dir_idx].dir_idx; > >+ sum += size_of_uleb128 (dir_idx); > >+} > >+ if (sum >= HOST_WIDE_INT_UC (2) * (numfiles + 1)) > >+idx_form = DW_FORM_data2; > >+} > > Why can we choose DW_FORM_data1 based only on ndirs+idx_offset, but we need > to look at the files table to choose DW_FORM_data2? This at least needs a > comment. I'll add a comment. The reason is that DW_FORM_data1, if it can be used, is always more compact than (or as compact as) DW_FORM_udata. While the choice in between DW_FORM_data2 and DW_FORM_udata if we have 256 to 65535 directories needs to be decided by computing both the size it would take for all the uleb128s and compare that to the size it takes to encode all as data2. > >+dw2_asm_output_data (idx_form == DW_FORM_data1 ? 1 : 2, > >+ 0, NULL); > > It seems that we could use an output_data function that takes a DW_FORM as > one of its arguments and does the appropriate thing. Had a brief look at this, but it seems hard without big changes in dwarf2asm.[ch] or lots of duplication. The problem is that dw2_asm_output_data* are varargs functions, so if we add some wrapper that picks what to call based on a enum dwarf_form argument, we'd need to convert all of them to take va_list instead, and add various further wrappers. Is this ok to leave this part out? > >+ if (dwarf_version >= 5 && str_form == DW_FORM_line_strp) > >+{ > >+ node = find_AT_string_in_table (filename0, debug_line_str_hash); > >+ set_indirect_string (node); > >+ node->form = DW_FORM_line_strp; > >+ dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label, > >+ debug_line_str_section, > >+ "File Entry: %#x: \"%s\"", 0, node->str); > >+} > >+ else > >+dw2_asm_output_nstring (filename0, -1, "File Entry: %#x", 0); > > Please factor this out into a separate function. So like static void output_line_string (enum dwarf_form form, const char *str, const char *entry_kind, unsigned int idx) in the patch below? > >+ if (!DWARF2_ASM_LINE_DEBUG_INFO > >+ && dwarf_version >= 5 > >+ && !(DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET > >+ || (DEBUG_STR_SECTION_FLAGS & SECTION_MERGE) == 0 > >+ /* FIXME: For now. */ > >+ || dwarf_split_debug_info)) > > This should also be a separate function or macro, since it's used more than > once. And please elaborate the FIXME comment. Like DWARF5_USE_DEBUG_LINE_STR macro in the patch below? For -gsplit-dwarf, the problem is I'm really not familiar with what tools are used post assembly for this (and likely the tools need adjustment for DWARF5 anyway), so what is in the patches is just partial support what appeared to be easily done by myself. For the rest I've been hoping Cary or somebody else could finish it up. Jakub 2016-10-31 Jakub Jelinek* dwarf2out.c (debug_line_str_section): New variable. (debug_line_str_hash): Likewise. (DEBUG_LINE_STR_SECTION): Define. (set_indirect_string): Handle DW_FORM_line_strp like DW_FORM_strp. (find_string_form): Fix up formatting. (size_of_die): Handle DW_FORM_line_strp like DW_FORM_strp. Fix up indentation. (output_die): Handle DW_FORM_line_strp. (DWARF5_USE_DEBUG_LINE_STR): Define. (output_line_string): New function. (output_file_names): Add -gdwarf-5 support. (output_line_info): Likewise. (init_sections_and_labels): Initialize debug_line_str_section. (output_indirect_string): Change 2nd argument from void * to enum dwarf_form form, compare with form rather than DW_FORM_strp. (output_indirect_strings): Pass DW_FORM_strp to output_indirect_string traversion. (dwarf2out_finish):
[RFC][PATCH][AArch64] Cleanup frame pointer usage
This patch cleans up all code related to the frame pointer. On AArch64 we emit a frame chain even in cases where the frame pointer is not required. So make this explicit by introducing a boolean emit_frame_chain in aarch64_frame record. When the frame pointer is enabled but not strictly required (eg. no use of alloca), we emit a frame chain in non-leaf functions, but continue to use the stack pointer to access locals. This results in smaller code and unwind info. Also simplify the complex logic in aarch64_override_options_after_change_1 and compute whether the frame chain is required in aarch64_layout_frame instead. As a result aarch64_frame_pointer_required is now redundant and aarch64_can_eliminate can be greatly simplified. Finally convert all callee save/restore functions to use gen_frame_mem. Bootstrap OK. Any comments? ChangeLog: 2016-10-31 Wilco Dijkstragcc/ * config/aarch64/aarch64.h (aarch64_frame): Add emit_frame_chain boolean. * config/aarch64/aarch64.c (aarch64_frame_pointer_required) Remove. (aarch64_layout_frame): Initialise emit_frame_chain. (aarch64_pushwb_single_reg): Use gen_frame_mem. (aarch64_pop_regs): Likewise. (aarch64_gen_load_pair): Likewise. (aarch64_save_callee_saves): Likewise. (aarch64_restore_callee_saves): Likewise. (aarch64_expand_prologue): Use emit_frame_chain. (aarch64_can_eliminate): Simplify. When FP needed or outgoing arguments are large, eliminate to FP, otherwise SP. (aarch64_override_options_after_change_1): Simplify. (TARGET_FRAME_POINTER_REQUIRED): Remove define. -- diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index fa81e4b853daf08842955288861ec7e7acca..6e32dc9f6f171dde0c182fdd7857230251f71712 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -583,6 +583,9 @@ struct GTY (()) aarch64_frame /* The size of the stack adjustment after saving callee-saves. */ HOST_WIDE_INT final_adjust; + /* Store FP,LR and setup a frame pointer. */ + bool emit_frame_chain; + unsigned wb_candidate1; unsigned wb_candidate2; diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index f07d771ea343803e054e03f59c8c1efb698bf474..6c06ac18d16f8afa7ee1cc5e8530e285a60e2b0f 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -2728,24 +2728,6 @@ aarch64_output_probe_stack_range (rtx reg1, rtx reg2) return ""; } -static bool -aarch64_frame_pointer_required (void) -{ - /* In aarch64_override_options_after_change - flag_omit_leaf_frame_pointer turns off the frame pointer by - default. Turn it back on now if we've not got a leaf - function. */ - if (flag_omit_leaf_frame_pointer - && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM))) -return true; - - /* Force a frame pointer for EH returns so the return address is at FP+8. */ - if (crtl->calls_eh_return) -return true; - - return false; -} - /* Mark the registers that need to be saved by the callee and calculate the size of the callee-saved registers area and frame record (both FP and LR may be omitted). */ @@ -2758,6 +2740,18 @@ aarch64_layout_frame (void) if (reload_completed && cfun->machine->frame.laid_out) return; + /* Force a frame chain for EH returns so the return address is at FP+8. */ + cfun->machine->frame.emit_frame_chain += frame_pointer_needed || crtl->calls_eh_return; + + /* Emit a frame chain if the frame pointer is enabled. + If -momit-leaf-frame-pointer is used, do not use a frame chain + in leaf functions which do not use LR. */ + if (flag_omit_frame_pointer == 2 + && !(flag_omit_leaf_frame_pointer && crtl->is_leaf + && !df_regs_ever_live_p (LR_REGNUM))) +cfun->machine->frame.emit_frame_chain = true; + #define SLOT_NOT_REQUIRED (-2) #define SLOT_REQUIRED (-1) @@ -2789,7 +2783,7 @@ aarch64_layout_frame (void) && !call_used_regs[regno]) cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED; - if (frame_pointer_needed) + if (cfun->machine->frame.emit_frame_chain) { /* FP and LR are placed in the linkage record. */ cfun->machine->frame.reg_offset[R29_REGNUM] = 0; @@ -2937,7 +2931,7 @@ aarch64_pushwb_single_reg (machine_mode mode, unsigned regno, reg = gen_rtx_REG (mode, regno); mem = gen_rtx_PRE_MODIFY (Pmode, base_rtx, plus_constant (Pmode, base_rtx, -adjustment)); - mem = gen_rtx_MEM (mode, mem); + mem = gen_frame_mem (mode, mem); insn = emit_move_insn (mem, reg); RTX_FRAME_RELATED_P (insn) = 1; @@ -3011,7 +3005,7 @@ aarch64_pop_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment, { rtx mem = plus_constant (Pmode, stack_pointer_rtx, adjustment); mem = gen_rtx_POST_MODIFY (Pmode, stack_pointer_rtx, mem); - emit_move_insn (reg1,
A new branch 'ira-select' created
I've created a new branch ira-select for work on experimental algorithm of calculations of pseudo register classes. The current algorithm in IRA (file ira-costs.c) was adopted from old regclass.c. It has a big disadvantage which is in ignoring a fact that operands should be in the same alternative when we calculate reg class costs for an insn pseudos. The new algorithm is based on choosing insn alternative and then calculation of pseudo reg class costs knowing the alternative. As the current algorithm the new one is a two pass algorithm. On the second pass we use pseudo reg classes chosen on the first pass to choose again an insn alternative better and calculate reg class costs more accurately. A special treatment of pseudo move insns is done as they have too many possible alternatives. In some way the new algorithm can be considered as a more explicit early code selection than the current algorithm is (the final code selection in GCC is actually done in LRA). On the branch the new algorithm is switched on by default. To switch it off, use option -fno-ira-select. Currently, the new algorithm improves x86-64 SPECINT2000 by 0.5% and speed up the compiler on it by 1%. Still there are some problems with code. They are a bigger code for SPECFP2000 and some failures on GCC testsuite (unexpected code generation). This code is experimental, the work is not finished yet (I publish this code as I promised to show it to some people on last Cauldron). The code might become a part of GCC8 if it works well for variety of targets.
Re: [PATCH] dwarf2cfi: Dump row differences before asserting
On 31/10/16 17:15, Segher Boessenkool wrote: If maybe_record_trace_start fails because the CFI is inconsistent on two paths into a block it currently just ICEs. This changes it to also dump the CFI on those two paths in the dump file; debugging it without that information is hopeless. Tested on powerpc64-linux {-m32,-m64}. Is this okay for trunk? +1 from me as a variant of this patch has helped me debug issues with the CFI. Kyrill Segher 2016-10-31 Segher Boessenkool* dwarf2cfi.c (dump_cfi_row): Add forward declaration. (maybe_record_trace_start): If the CFI is different on the new and old paths, print out both to the dump file before ICEing. --- gcc/dwarf2cfi.c | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c index da9da52..19edc28 100644 --- a/gcc/dwarf2cfi.c +++ b/gcc/dwarf2cfi.c @@ -2239,6 +2239,8 @@ add_cfis_to_fde (void) } } +static void dump_cfi_row (FILE *f, dw_cfi_row *row); + /* If LABEL is the start of a trace, then initialize the state of that trace from CUR_TRACE and CUR_ROW. */ @@ -2282,7 +2284,21 @@ maybe_record_trace_start (rtx_insn *start, rtx_insn *origin) /* We ought to have the same state incoming to a given trace no matter how we arrive at the trace. Anything else means we've got some kind of optimization error. */ - gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row)); +#if CHECKING_P + if (!cfi_row_equal_p (cur_row, ti->beg_row)) + { + if (dump_file) + { + fprintf (dump_file, "Inconsistent CFI state!\n"); + fprintf (dump_file, "SHOULD have:\n"); + dump_cfi_row (dump_file, ti->beg_row); + fprintf (dump_file, "DO have:\n"); + dump_cfi_row (dump_file, cur_row); + } + + gcc_unreachable (); + } +#endif /* The args_size is allowed to conflict if it isn't actually used. */ if (ti->beg_true_args_size != args_size)
[PATCH] Fall-through warnings in combine.c
I think this is obvious, the following code in combine.c which intends to fallthrough some switch cases is not marked with 'gcc_fallthrough' as it should be, thus producing warnings during compilation. --- Fritz Reese 2016-10-31 Fritz Reesegcc/combine.c (simplify_compare_const): Mark fallthrough cases. diff --git a/gcc/combine.c b/gcc/combine.c index 60a127f..6cbb8cf 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -11414,6 +11414,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode, const_op -= 1; code = LE; /* ... fall through to LE case below. */ + gcc_fallthrough (); } else break; @@ -11443,6 +11444,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode, const_op -= 1; code = GT; /* ... fall through to GT below. */ + gcc_fallthrough (); } else break;
[PATCH] dwarf2cfi: Dump row differences before asserting
If maybe_record_trace_start fails because the CFI is inconsistent on two paths into a block it currently just ICEs. This changes it to also dump the CFI on those two paths in the dump file; debugging it without that information is hopeless. Tested on powerpc64-linux {-m32,-m64}. Is this okay for trunk? Segher 2016-10-31 Segher Boessenkool* dwarf2cfi.c (dump_cfi_row): Add forward declaration. (maybe_record_trace_start): If the CFI is different on the new and old paths, print out both to the dump file before ICEing. --- gcc/dwarf2cfi.c | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c index da9da52..19edc28 100644 --- a/gcc/dwarf2cfi.c +++ b/gcc/dwarf2cfi.c @@ -2239,6 +2239,8 @@ add_cfis_to_fde (void) } } +static void dump_cfi_row (FILE *f, dw_cfi_row *row); + /* If LABEL is the start of a trace, then initialize the state of that trace from CUR_TRACE and CUR_ROW. */ @@ -2282,7 +2284,21 @@ maybe_record_trace_start (rtx_insn *start, rtx_insn *origin) /* We ought to have the same state incoming to a given trace no matter how we arrive at the trace. Anything else means we've got some kind of optimization error. */ - gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row)); +#if CHECKING_P + if (!cfi_row_equal_p (cur_row, ti->beg_row)) + { + if (dump_file) + { + fprintf (dump_file, "Inconsistent CFI state!\n"); + fprintf (dump_file, "SHOULD have:\n"); + dump_cfi_row (dump_file, ti->beg_row); + fprintf (dump_file, "DO have:\n"); + dump_cfi_row (dump_file, cur_row); + } + + gcc_unreachable (); + } +#endif /* The args_size is allowed to conflict if it isn't actually used. */ if (ti->beg_true_args_size != args_size) -- 1.9.3
Re: [PATCH] For -gdwarf-5 emit DW_OP_{implicit_pointer,entry_value,*_type,convert,reinterpret}
On 10/31/2016 01:04 PM, Jakub Jelinek wrote: On Mon, Oct 31, 2016 at 12:58:23PM -0400, Jason Merrill wrote: On 10/14/2016 03:41 PM, Jakub Jelinek wrote: Another set of GNU extensions that were accepted into DWARF5, so we should emit them even for -gstrict-dwarf -gdwarf-5, and for -gdwarf-5 should use the accepted standard opcodes instead of the corresponding GNU ones. Hmm, I wonder if we also want to use the standard opcodes instead of the GNU ones in -gdwarf-4 and below. Maybe in 2 years, but not now IMHO. The GNU opcodes are supported by lots of consumers for often as long as 5 years, while the DWARF5 ones might be supported next year. So, emitting them now by default would mean e.g. most of the guality tests would fail until one updates to the not yet existing GDB. Similar reason why we IMHO shouldn't enable -gdwarf-5 by default in GCC 7, but should wait for GCC 8+ with that. Makes sense. The patch is OK. Jason
Re: [PATCH] For -gdwarf-5 emit DW_OP_{implicit_pointer,entry_value,*_type,convert,reinterpret}
On Mon, Oct 31, 2016 at 12:58:23PM -0400, Jason Merrill wrote: > On 10/14/2016 03:41 PM, Jakub Jelinek wrote: > >Another set of GNU extensions that were accepted into DWARF5, so we should > >emit them even for -gstrict-dwarf -gdwarf-5, and for -gdwarf-5 should use > >the accepted standard opcodes instead of the corresponding GNU ones. > > Hmm, I wonder if we also want to use the standard opcodes instead of the GNU > ones in -gdwarf-4 and below. Maybe in 2 years, but not now IMHO. The GNU opcodes are supported by lots of consumers for often as long as 5 years, while the DWARF5 ones might be supported next year. So, emitting them now by default would mean e.g. most of the guality tests would fail until one updates to the not yet existing GDB. Similar reason why we IMHO shouldn't enable -gdwarf-5 by default in GCC 7, but should wait for GCC 8+ with that. Jakub
Re: [C++ PATCH] Fix postfix-expression parsing (PR c++/78089, take 2)
OK.
Re: [PATCH] For -gdwarf-5 emit DW_OP_{implicit_pointer,entry_value,*_type,convert,reinterpret}
On 10/14/2016 03:41 PM, Jakub Jelinek wrote: Another set of GNU extensions that were accepted into DWARF5, so we should emit them even for -gstrict-dwarf -gdwarf-5, and for -gdwarf-5 should use the accepted standard opcodes instead of the corresponding GNU ones. Hmm, I wonder if we also want to use the standard opcodes instead of the GNU ones in -gdwarf-4 and below. Jason
Re: [PATCH] DWARF5 .debug_loclists support
All the different cases in output_loc_list could use brief comments. Let's keep loc_list_idx, output_loclists_offsets, and assign_location_list_indexes together in the file. OK with those changes. Jason
Re: [PATCH] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly
On 10/31/2016 09:45 AM, Andreas Schwab wrote: > On Okt 31 2016, Vineet Guptawrote: > >> This silences tons of -Wformat= warnings when building ARC Linux kernel >> with gcc 6.x (and restores the ARC gcc 4.8.x behaviour) which had >> similar fix. >> >> gcc/ >> 2016-10-28 Vineet Gupta >> >> * config/arc/arc.h (SIZE_TYPE): define as unsigned int. >> * (PTRDIFF_TYPE): define as int. >> >> Signed-off-by: Vineet Gupta >> --- >> gcc/config/arc/arc.h | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h >> index cf13a7fa9fc3..0b579c44ee12 100644 >> --- a/gcc/config/arc/arc.h >> +++ b/gcc/config/arc/arc.h >> @@ -282,10 +282,10 @@ if (GET_MODE_CLASS (MODE) == MODE_INT \ >> #define DEFAULT_SIGNED_CHAR 0 >> >> #undef SIZE_TYPE >> -#define SIZE_TYPE "long unsigned int" >> +#define SIZE_TYPE "unsigned int" >> >> #undef PTRDIFF_TYPE >> -#define PTRDIFF_TYPE "long int" >> +#define PTRDIFF_TYPE "int" >> >> #undef WCHAR_TYPE >> #define WCHAR_TYPE "int" > This changes the C++ ABI. You need to fix the kernel instead, see > arch/avr32/include/uapi/asm/posix_types.h for how to do that. There's ABI change one way or the other. I'd rather fix the upstream to have the ABI we want to continue with in the long run and match with the ABI we have in our off-upstream 4.8 and 6.x versions ! Thx, -Vineet
[PATCH-Resend] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly
This silences tons of -Wformat= warnings when building ARC Linux kernel with gcc 6.x (and restores the ARC gcc 4.8.x behaviour) which had similar fix. gcc/ 2016-10-28 Vineet Gupta* config/arc/arc.h (SIZE_TYPE): define as unsigned int. * (PTRDIFF_TYPE): define as int. Signed-off-by: Vineet Gupta --- Fat fingered Andrew's address in prev posting --- gcc/config/arc/arc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h index cf13a7fa9fc3..0b579c44ee12 100644 --- a/gcc/config/arc/arc.h +++ b/gcc/config/arc/arc.h @@ -282,10 +282,10 @@ if (GET_MODE_CLASS (MODE) == MODE_INT \ #define DEFAULT_SIGNED_CHAR 0 #undef SIZE_TYPE -#define SIZE_TYPE "long unsigned int" +#define SIZE_TYPE "unsigned int" #undef PTRDIFF_TYPE -#define PTRDIFF_TYPE "long int" +#define PTRDIFF_TYPE "int" #undef WCHAR_TYPE #define WCHAR_TYPE "int" -- 2.7.4
Re: [C++ PATCH] Add tests for a const member and a reference member for launder.
OK, thanks. On Mon, Oct 31, 2016 at 12:13 PM, Ville Voutilainenwrote: > On 31 October 2016 at 15:31, Jason Merrill wrote: >> Good idea. You might put the reuse in a separate function in order to >> hide it from the optimizer. > > > Ok, new patch, tested on Linux-x64, ok for trunk? > > 2016-10-31 Ville Voutilainen > > Add tests for a const member and a reference member for launder. > * g++.dg/cpp1z/launder3.C: New. > * g++.dg/cpp1z/launder4.C: Likewise. > * g++.dg/cpp1z/launder5.C: Likewise. > * g++.dg/cpp1z/launder5.cc: Likewise. > * g++.dg/cpp1z/launder5.h: Likewise. > * g++.dg/cpp1z/launder6.C: Likewise. > * g++.dg/cpp1z/launder6.cc: Likewise. > * g++.dg/cpp1z/launder6.h: Likewise.
Re: [PATCH] DWARF5 DW_FORM_data16 support
On Mon, Oct 31, 2016 at 12:44 PM, Jakub Jelinekwrote: > On Mon, Oct 31, 2016 at 10:15:11AM -0400, Jason Merrill wrote: >> On 10/17/2016 05:39 PM, Jakub Jelinek wrote: >> >(dwarf_version >= 5 ? 128 : 64) >> >> Please make this a macro. OK with that change. > > DWARF_LARGEST_DATA_FORM_BITS ? Or do you have better name? Sounds good. Jason
Re: [PATCH] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly
On Okt 31 2016, Vineet Guptawrote: > This silences tons of -Wformat= warnings when building ARC Linux kernel > with gcc 6.x (and restores the ARC gcc 4.8.x behaviour) which had > similar fix. > > gcc/ > 2016-10-28 Vineet Gupta > > * config/arc/arc.h (SIZE_TYPE): define as unsigned int. > * (PTRDIFF_TYPE): define as int. > > Signed-off-by: Vineet Gupta > --- > gcc/config/arc/arc.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h > index cf13a7fa9fc3..0b579c44ee12 100644 > --- a/gcc/config/arc/arc.h > +++ b/gcc/config/arc/arc.h > @@ -282,10 +282,10 @@ if (GET_MODE_CLASS (MODE) == MODE_INT \ > #define DEFAULT_SIGNED_CHAR 0 > > #undef SIZE_TYPE > -#define SIZE_TYPE "long unsigned int" > +#define SIZE_TYPE "unsigned int" > > #undef PTRDIFF_TYPE > -#define PTRDIFF_TYPE "long int" > +#define PTRDIFF_TYPE "int" > > #undef WCHAR_TYPE > #define WCHAR_TYPE "int" This changes the C++ ABI. You need to fix the kernel instead, see arch/avr32/include/uapi/asm/posix_types.h for how to do that. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
Re: [PATCH] DWARF5 DW_TAG_call_site support
On Mon, Oct 31, 2016 at 10:13:16AM -0400, Jason Merrill wrote: > OK. This patch is on top of the - DWARF5 http://gcc.gnu.org/ml/gcc-patches/2016-10/msg01202.html - DW_OP_{implicit_pointer,entry_value,{const,regval,deref}_type} and DW_OP_{convert,reinterpret} -gdwarf-5 support patch and depends on the functions added by that patch. Jakub
Re: [PATCH] enhance buffer overflow warnings (and c/53562)
On 10/31/2016 06:39 AM, Tobias Burnus wrote: Martin Sebor wrote: Attached is an updated patch that adds checks for excessive sizes and bounds (those in excess of SIZE_MAX / 2), and also enables the same checking for strcat and strncat). This version also fixes an issue with the interpretation of anti-ranges in the first patch. The improvements exposed two bugs in the regression tests. If I apply this patch to my local trunk - and try to bootstrap GCC, bootstrapping fails (on x86-64_gnu-linux) as following. I have not tried to figure out whether the warning (-Werror) makes sense or not. ../../gcc/emit-rtl.c: In function ‘rtx_note* make_note_raw(insn_note)’: ../../gcc/emit-rtl.c:3933:59: error: void* memset(void*, int, size_t) writing 8 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow] memset (_DATA (note), 0, sizeof (NOTE_DATA (note))); where NOTE_DATA is defined in rtl.h as /* Opaque data. */ #define NOTE_DATA(INSN) RTL_CHECKC1 (INSN, 3, NOTE) Thanks for trying it out! The patch bootstrapped and passed regression tests for me yesterday, also on x86_64, and today on powepc64le, but just now I reproduced the error with the top of today's trunk on x86_64. I think the error is justified because the call to memset in the make_note_raw function references a fourth element (rtx_note::rtx_insn::rtx_def::u.fld[3]) of what is just a single element array. After macro expansion, the memset call itself: memset (&((note)->u.fld[3]), 0, sizeof (((note)->u.fld[3]))); is within the bounds of the object pointed to by note but the index 3 is out of bounds for note->u.fld and so undefined. An unpatched GCC issues a similar error when the call to memset is replaced with __builtin___memset_chk like so: __builtin___memset_chk (&((note)->u.fld[3]), 0, sizeof (((note)->u.fld[3])), __builtin_object_size (&((note)->u.fld[3]), 1)); /src/gcc/53562/gcc/emit-rtl.c: In function ‘rtx_note* make_note_raw(insn_note)’: /src/gcc/53562/gcc/emit-rtl.c:3941:53: warning: call to void* __builtin___memset_chk(void*, int, long unsigned int, long unsigned int) will always overflow destination buffer The code can be made valid and the warning avoided by deriving the same address not from an invalid pointer/subscript but rather from a pointer to the beginning of the union itself and adding the offset of the fourth element, like so: char *p = (char*) &(note)->u.fld[0]; p += sizeof (((note)->u.fld[0])) * 3; memset (p, 0, sizeof *p); Unfortunately, because the invalid subscript is the result of the expansion of the RTL_CHECK1() macro fixing this isn't as straightforward as that. What really should be fixed is the macro itself. Until then, it can be worked (hacked would be a better term) around it by storing the result of _DATA (note) expression in a volatile pointer, like so: diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 8afcfbe..6dd9439 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -3930,7 +3930,8 @@ make_note_raw (enum insn_note subtype) INSN_UID (note) = cur_insn_uid++; NOTE_KIND (note) = subtype; BLOCK_FOR_INSN (note) = NULL; - memset (_DATA (note), 0, sizeof (NOTE_DATA (note))); + void* volatile p = _DATA (note); + memset (p, 0, sizeof (NOTE_DATA (note))); return note; } Martin
Re: [PATCH] DWARF5 DW_FORM_data16 support
On Mon, Oct 31, 2016 at 10:15:11AM -0400, Jason Merrill wrote: > On 10/17/2016 05:39 PM, Jakub Jelinek wrote: > >(dwarf_version >= 5 ? 128 : 64) > > Please make this a macro. OK with that change. DWARF_LARGEST_DATA_FORM_BITS ? Or do you have better name? Jakub
[PATCH] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly
This silences tons of -Wformat= warnings when building ARC Linux kernel with gcc 6.x (and restores the ARC gcc 4.8.x behaviour) which had similar fix. gcc/ 2016-10-28 Vineet Gupta* config/arc/arc.h (SIZE_TYPE): define as unsigned int. * (PTRDIFF_TYPE): define as int. Signed-off-by: Vineet Gupta --- gcc/config/arc/arc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h index cf13a7fa9fc3..0b579c44ee12 100644 --- a/gcc/config/arc/arc.h +++ b/gcc/config/arc/arc.h @@ -282,10 +282,10 @@ if (GET_MODE_CLASS (MODE) == MODE_INT \ #define DEFAULT_SIGNED_CHAR 0 #undef SIZE_TYPE -#define SIZE_TYPE "long unsigned int" +#define SIZE_TYPE "unsigned int" #undef PTRDIFF_TYPE -#define PTRDIFF_TYPE "long int" +#define PTRDIFF_TYPE "int" #undef WCHAR_TYPE #define WCHAR_TYPE "int" -- 2.7.4
Re: [PATCH] Fix reassoc NEGATE_EXPR handling (PR tree-optimization/77860)
On October 31, 2016 4:50:08 PM GMT+01:00, Jakub Jelinekwrote: >Hi! > >As the testcase shows, while reassociation of vector multiplication >has been enabled, the eliminate_using_constants optimization >isn't enabled for integral vectors (only for floating point vectors), >so e.g. if we have more than one { ~0, ~0, ... } vectors in there >coming from NEGATE_EXPRs, we don't actually fold them into a single >{ ~0, ~0, ... } or { 1, 1, ... } vector (the former then into >NEGATE_EXPR on the result, the latter removed altogether). > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Richard. >2016-10-31 Jakub Jelinek > > PR tree-optimization/77860 > * tree-ssa-reassoc.c (eliminate_using_constants): Handle > also integral complex and vector constants. > > * gcc.dg/pr77860.c: New test. > >--- gcc/tree-ssa-reassoc.c.jj 2016-10-31 13:28:12.0 +0100 >+++ gcc/tree-ssa-reassoc.c 2016-10-31 13:47:00.364216248 +0100 >@@ -924,7 +924,7 @@ eliminate_using_constants (enum tree_cod > tree type = TREE_TYPE (oelast->op); > > if (oelast->rank == 0 >- && (INTEGRAL_TYPE_P (type) || FLOAT_TYPE_P (type))) >+ && (ANY_INTEGRAL_TYPE_P (type) || FLOAT_TYPE_P (type))) > { > switch (opcode) > { >--- gcc/testsuite/gcc.dg/pr77860.c.jj 2016-10-31 13:50:34.019649814 >+0100 >+++ gcc/testsuite/gcc.dg/pr77860.c 2016-10-31 13:49:35.0 +0100 >@@ -0,0 +1,13 @@ >+/* PR tree-optimization/77860 */ >+/* { dg-do compile } */ >+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-forwprop -Wno-psabi" } */ >+ >+typedef unsigned short V __attribute__((vector_size (16))); >+ >+V >+foo (V x, V y) >+{ >+ V a = -x; >+ V b = -y; >+ return a * b; >+} > > Jakub
Re: [C++ PATCH] Add tests for a const member and a reference member for launder.
On 31 October 2016 at 15:31, Jason Merrillwrote: > Good idea. You might put the reuse in a separate function in order to > hide it from the optimizer. Ok, new patch, tested on Linux-x64, ok for trunk? 2016-10-31 Ville Voutilainen Add tests for a const member and a reference member for launder. * g++.dg/cpp1z/launder3.C: New. * g++.dg/cpp1z/launder4.C: Likewise. * g++.dg/cpp1z/launder5.C: Likewise. * g++.dg/cpp1z/launder5.cc: Likewise. * g++.dg/cpp1z/launder5.h: Likewise. * g++.dg/cpp1z/launder6.C: Likewise. * g++.dg/cpp1z/launder6.cc: Likewise. * g++.dg/cpp1z/launder6.h: Likewise. diff --git a/gcc/testsuite/g++.dg/cpp1z/launder3.C b/gcc/testsuite/g++.dg/cpp1z/launder3.C new file mode 100644 index 000..2a2afc5 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/launder3.C @@ -0,0 +1,38 @@ +// { dg-do run { target c++11 } } +// { dg-additional-options "-O2" } + +#include + +void * +operator new (decltype (sizeof (0)), void *p) +{ + return p; +} + +namespace std +{ + template + T * + launder (T *p) + { +return __builtin_launder (p); + } +} + +struct A +{ + const int x; +}; + +struct B +{ + A a; +}; + +int +main () +{ + B b{{42}}; + new () A{666}; + assert(std::launder()->x == 666); +} diff --git a/gcc/testsuite/g++.dg/cpp1z/launder4.C b/gcc/testsuite/g++.dg/cpp1z/launder4.C new file mode 100644 index 000..3a65eb2 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/launder4.C @@ -0,0 +1,40 @@ +// { dg-do run { target c++11 } } +// { dg-additional-options "-O2" } + +#include + +void * +operator new (decltype (sizeof (0)), void *p) +{ + return p; +} + +namespace std +{ + template + T * + launder (T *p) + { +return __builtin_launder (p); + } +} + +struct A +{ + int& x; +}; + +struct B +{ + A a; +}; + +int +main () +{ + int x = 42; + B b{{x}}; + int y = 666; + new () A{y}; + assert(std::launder()->x == 666); +} diff --git a/gcc/testsuite/g++.dg/cpp1z/launder5.C b/gcc/testsuite/g++.dg/cpp1z/launder5.C new file mode 100644 index 000..483d6f2 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/launder5.C @@ -0,0 +1,25 @@ +// { dg-do run { target c++11 } } +// { dg-additional-options "-O2" } +// { dg-additional-sources "launder5.cc" } + +#include +#include "launder5.h" + +namespace std +{ + template + T * + launder (T *p) + { +return __builtin_launder (p); + } +} + + +int +main () +{ + B b{{42}}; + f(b); + assert(std::launder()->x == 666); +} diff --git a/gcc/testsuite/g++.dg/cpp1z/launder5.cc b/gcc/testsuite/g++.dg/cpp1z/launder5.cc new file mode 100644 index 000..f9d867d --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/launder5.cc @@ -0,0 +1,12 @@ +#include "launder5.h" + +void * +operator new (decltype (sizeof (0)), void *p) +{ + return p; +} + +void f(B& b) +{ + new () A{666}; +} diff --git a/gcc/testsuite/g++.dg/cpp1z/launder5.h b/gcc/testsuite/g++.dg/cpp1z/launder5.h new file mode 100644 index 000..b66103a --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/launder5.h @@ -0,0 +1,16 @@ +#ifndef GCC_TEST_LAUNDER5_H +#define GCC_TEST_LAUNDER5_H + +struct A +{ + const int x; +}; + +struct B +{ + A a; +}; + +void f(B& b); + +#endif diff --git a/gcc/testsuite/g++.dg/cpp1z/launder6.C b/gcc/testsuite/g++.dg/cpp1z/launder6.C new file mode 100644 index 000..babc4b4 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/launder6.C @@ -0,0 +1,24 @@ +// { dg-do run { target c++11 } } +// { dg-additional-options "-O2" } +// { dg-additional-sources "launder6.cc" } +#include +#include "launder6.h" + +namespace std +{ + template + T * + launder (T *p) + { +return __builtin_launder (p); + } +} + +int +main () +{ + int x = 42; + B b{{x}}; + f(b); + assert(std::launder()->x == 666); +} diff --git a/gcc/testsuite/g++.dg/cpp1z/launder6.cc b/gcc/testsuite/g++.dg/cpp1z/launder6.cc new file mode 100644 index 000..1822891 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/launder6.cc @@ -0,0 +1,14 @@ +#include "launder6.h" + +void * +operator new (decltype (sizeof (0)), void *p) +{ + return p; +} + +int y = 666; + +void f(B& b) +{ + new () A{y}; +} diff --git a/gcc/testsuite/g++.dg/cpp1z/launder6.h b/gcc/testsuite/g++.dg/cpp1z/launder6.h new file mode 100644 index 000..eca2ad4 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/launder6.h @@ -0,0 +1,16 @@ +#ifndef GCC_TEST_LAUNDER6_H +#define GCC_TEST_LAUNDER6_H + +struct A +{ + int& x; +}; + +struct B +{ + A a; +}; + +void f(B& b); + +#endif
[PATCH] Fix reassoc NEGATE_EXPR handling (PR tree-optimization/77860)
Hi! As the testcase shows, while reassociation of vector multiplication has been enabled, the eliminate_using_constants optimization isn't enabled for integral vectors (only for floating point vectors), so e.g. if we have more than one { ~0, ~0, ... } vectors in there coming from NEGATE_EXPRs, we don't actually fold them into a single { ~0, ~0, ... } or { 1, 1, ... } vector (the former then into NEGATE_EXPR on the result, the latter removed altogether). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-10-31 Jakub JelinekPR tree-optimization/77860 * tree-ssa-reassoc.c (eliminate_using_constants): Handle also integral complex and vector constants. * gcc.dg/pr77860.c: New test. --- gcc/tree-ssa-reassoc.c.jj 2016-10-31 13:28:12.0 +0100 +++ gcc/tree-ssa-reassoc.c 2016-10-31 13:47:00.364216248 +0100 @@ -924,7 +924,7 @@ eliminate_using_constants (enum tree_cod tree type = TREE_TYPE (oelast->op); if (oelast->rank == 0 - && (INTEGRAL_TYPE_P (type) || FLOAT_TYPE_P (type))) + && (ANY_INTEGRAL_TYPE_P (type) || FLOAT_TYPE_P (type))) { switch (opcode) { --- gcc/testsuite/gcc.dg/pr77860.c.jj 2016-10-31 13:50:34.019649814 +0100 +++ gcc/testsuite/gcc.dg/pr77860.c 2016-10-31 13:49:35.0 +0100 @@ -0,0 +1,13 @@ +/* PR tree-optimization/77860 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-vrp -fno-tree-forwprop -Wno-psabi" } */ + +typedef unsigned short V __attribute__((vector_size (16))); + +V +foo (V x, V y) +{ + V a = -x; + V b = -y; + return a * b; +} Jakub
[PATCH] [ARC] New option handling, refurbish multilib support.
Hi Andrew, Please find the updated patch. What is new: - The .def files are having a comment block on how to add new lines. - The arc_seen_option is not used. - The arc_cpu* variables are not used. Please let me know if I miss something, Claudiu gcc/ 2016-05-09 Claudiu Zissulescu* config/arc/arc-arch.h: New file. * config/arc/arc-arches.def: Likewise. * config/arc/arc-cpus.def: Likewise. * config/arc/arc-options.def: Likewise. * config/arc/t-multilib: Likewise. * config/arc/genmultilib.awk: Likewise. * config/arc/genoptions.awk: Likewise. * config/arc/arc-tables.opt: Likewise. * config/arc/driver-arc.c: Likewise. * common/config/arc/arc-common.c (arc_handle_option): Trace toggled options. * config.gcc (arc*-*-*): Add arc-tables.opt to arc's extra options; check for supported cpu against arc-cpus.def file. (arc*-*-elf*, arc*-*-linux-uclibc*): Use new make fragment; define TARGET_CPU_BUILD macro; add driver-arc.o as an extra object. * config/arc/arc-c.def: Add emacs local variables. * config/arc/arc-opts.h (processor_type): Use arc-cpus.def file. (FPU_FPUS, FPU_FPUD, FPU_FPUDA, FPU_FPUDA_DIV, FPU_FPUDA_FMA) (FPU_FPUDA_ALL, FPU_FPUS_DIV, FPU_FPUS_FMA, FPU_FPUS_ALL) (FPU_FPUD_DIV, FPU_FPUD_FMA, FPU_FPUD_ALL): New defines. (DEFAULT_arc_fpu_build): Define. (DEFAULT_arc_mpy_option): Define. * config/arc/arc-protos.h (arc_init): Delete. * config/arc/arc.c (arc_cpu_name): New variable. (arc_selected_cpu, arc_selected_arch, arc_arcem, arc_archs) (arc_arc700, arc_arc600, arc_arc601): New variable. (arc_init): Add static; remove selection of default tune value, cleanup obsolete error messages. (arc_override_options): Make use of .def files for selecting the right cpu and option configurations. * config/arc/arc.h (stdbool.h): Include. (TARGET_CPU_DEFAULT): Define. (CPP_SPEC): Remove mcpu=NPS400 handling. (arc_cpu_to_as): Declare. (EXTRA_SPEC_FUNCTIONS): Define. (OPTION_DEFAULT_SPECS): Likewise. (ASM_DEFAULT): Remove. (ASM_SPEC): Use arc_cpu_to_as. (DRIVER_SELF_SPECS): Remove deprecated options. (arc_base_cpu): Declare. (TARGET_ARC600, TARGET_ARC601, TARGET_ARC700, TARGET_EM) (TARGET_HS, TARGET_V2, TARGET_ARC600): Make them use arc_base_cpu variable. (MULTILIB_DEFAULTS): Use ARC_MULTILIB_CPU_DEFAULT. * config/arc/arc.md (attr_cpu): Remove. * config/arc/arc.opt (arc_mpy_option): Make it target variable. (mno-mpy): Deprecate. (mcpu=ARC600, mcpu=ARC601, mcpu=ARC700, mcpu=NPS400, mcpu=ARCEM) (mcpu=ARCHS): Remove. (mcrc, mdsp-packa, mdvbf, mmac-d16, mmac-24, mtelephony, mrtsc): Deprecate. (mbarrel_shifte, mspfp_, mdpfp_, mdsp_pack, mmac_): Remove. (arc_fpu): Use new defines. (arc_seen_options): New target variable. * config/arc/t-arc (driver-arc.o): New target. (arc-cpus, t-multilib, arc-tables.opt): Likewise. * config/arc/t-arc-newlib: Delete. * config/arc/t-arc-uClibc: Renamed to t-uClibc. * doc/invoke.texi (ARC): Update arc options. --- gcc/common/config/arc/arc-common.c | 102 ++- gcc/config.gcc | 47 + gcc/config/arc/arc-arch.h | 120 ++ gcc/config/arc/arc-arches.def | 56 ++ gcc/config/arc/arc-c.def | 4 + gcc/config/arc/arc-cpus.def| 75 ++ gcc/config/arc/arc-options.def | 109 gcc/config/arc/arc-opts.h | 47 +++-- gcc/config/arc/arc-protos.h| 1 - gcc/config/arc/arc-tables.opt | 90 gcc/config/arc/arc.c | 176 +--- gcc/config/arc/arc.h | 89 gcc/config/arc/arc.md | 5 - gcc/config/arc/arc.opt | 106 +-- gcc/config/arc/driver-arc.c| 78 ++ gcc/config/arc/genmultilib.awk | 203 + gcc/config/arc/genoptions.awk | 86 gcc/config/arc/t-arc | 19 gcc/config/arc/t-arc-newlib| 46 - gcc/config/arc/t-arc-uClibc| 20 gcc/config/arc/t-multilib | 34 +++ gcc/config/arc/t-uClibc| 20 gcc/doc/invoke.texi| 90 +--- 23 files changed, 1247 insertions(+), 376 deletions(-) create mode 100644 gcc/config/arc/arc-arch.h create mode 100644 gcc/config/arc/arc-arches.def create mode 100644 gcc/config/arc/arc-cpus.def create mode 100644 gcc/config/arc/arc-options.def create mode 100644 gcc/config/arc/arc-tables.opt create mode 100644
[C++ PATCH] Fix postfix-expression parsing (PR c++/78089, take 2)
Hi! On Mon, Oct 24, 2016 at 04:20:25PM +0200, Jakub Jelinek wrote: > While writing a testcase for __builtin_launder, I've noticed we don't parse > __builtin_shuffle or __builtin_addressof properly (as well as do error > recovery e.g. for typeid in constant expression etc.). > > The problem is that several spots would return something; from the first > switch in cp_parser_postfix_expression, but that is not something we really > want, that primary expression part can be followed by [, ., -> etc. to form > postfix-expression, but this is done in the same function. So instead of > returning, we need to set postfix_expression to whatever we were returning > and just get out of the first switch, then continue parsing in the loop > after that. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > For release branches, __builtin_addressof doesn't apply and the > __builtin_shuffle code is differently formatted, but the bug is there > for __builtin_shuffle, ok to backport to the release branches too? The patch doesn't apply any longer, as the __builtin_launder patch has been committed earlier. So here is the same patch updated for current trunk, bootstrapped/regtested on x86_64-linux and i686-linux again, ok for trunk? 2016-10-31 Jakub JelinekPR c++/78089 * parser.c (cp_parser_postfix_expression): Replace return statement in the first switch with setting postfix_expression to the return expression and break;. * c-c++-common/builtin-shuffle-1.c: New test. * g++.dg/cpp0x/addressof3.C: New test. --- gcc/cp/parser.c.jj 2016-10-22 20:58:11.0 +0200 +++ gcc/cp/parser.c 2016-10-31 13:03:57.504037757 +0200 @@ -6441,7 +6441,10 @@ cp_parser_postfix_expression (cp_parser can be used in constant-expressions. */ if (!cast_valid_in_integral_constant_expression_p (type) && cp_parser_non_integral_constant_expression (parser, NIC_CAST)) - return error_mark_node; + { + postfix_expression = error_mark_node; + break; + } switch (keyword) { @@ -6521,7 +6524,7 @@ cp_parser_postfix_expression (cp_parser parser->type_definition_forbidden_message = saved_message; /* `typeid' may not appear in an integral constant expression. */ if (cp_parser_non_integral_constant_expression (parser, NIC_TYPEID)) - return error_mark_node; + postfix_expression = error_mark_node; } break; @@ -6615,7 +6618,10 @@ cp_parser_postfix_expression (cp_parser /*cast_p=*/false, /*allow_expansion_p=*/true, /*non_constant_p=*/NULL); if (vec == NULL) - return error_mark_node; + { + postfix_expression = error_mark_node; + break; + } FOR_EACH_VEC_ELT (*vec, i, p) mark_exp_read (p); @@ -6624,10 +6630,15 @@ cp_parser_postfix_expression (cp_parser { case RID_ADDRESSOF: if (vec->length () == 1) - return cp_build_addressof (loc, (*vec)[0], tf_warning_or_error); - error_at (loc, "wrong number of arguments to " - "%<__builtin_addressof%>"); - return error_mark_node; + postfix_expression + = cp_build_addressof (loc, (*vec)[0], tf_warning_or_error); + else + { + error_at (loc, "wrong number of arguments to " + "%<__builtin_addressof%>"); + postfix_expression = error_mark_node; + } + break; case RID_BUILTIN_LAUNDER: if (vec->length () == 1) @@ -6643,14 +6654,20 @@ cp_parser_postfix_expression (cp_parser case RID_BUILTIN_SHUFFLE: if (vec->length () == 2) - return build_x_vec_perm_expr (loc, (*vec)[0], NULL_TREE, - (*vec)[1], tf_warning_or_error); + postfix_expression + = build_x_vec_perm_expr (loc, (*vec)[0], NULL_TREE, +(*vec)[1], tf_warning_or_error); else if (vec->length () == 3) - return build_x_vec_perm_expr (loc, (*vec)[0], (*vec)[1], - (*vec)[2], tf_warning_or_error); - error_at (loc, "wrong number of arguments to " - "%<__builtin_shuffle%>"); - return error_mark_node; + postfix_expression + = build_x_vec_perm_expr (loc, (*vec)[0], (*vec)[1], +(*vec)[2], tf_warning_or_error); + else + { + error_at (loc, "wrong number of arguments to " + "%<__builtin_shuffle%>"); + postfix_expression = error_mark_node; + } + break; default:
RFA (tree-inline): PATCH for C++ inheriting constructors overhaul
wg21.link/p0136 significantly changes the specification of C++11 inheriting constructors so that they become an implementation detail rather than a language-level construct; instead, overload resolution and such are done on the constructor from the base, and the artificial constructor in the derived class comes in only later. As a result, several things that worked poorly with the old specification work much better. But this changes the representation as well; as a result, the mangled name of the artificial inherited constructor is changing. This proposal is part of C++17, but it is specified as a defect report against C++11 and C++14 as well, so the new behavior affects those modes as well. The change can be controlled with -fno-new-inheriting-ctors, and it is also disabled by default with -fabi-version=10 or below. One subtlety with the new scheme is that base constructor clones of an inheriting constructor that inherits from a virtual base don't get the inherited parameters, since they won't actually call the constructor for the virtual base; only the complete constructor clone does that. For the cloning code to handle omitted inherited parms, I needed to change copy_tree_body_r to immediately optimize away a COND_EXPR with a constant condition, as the non-taken branch would try to pass the omitted parameters to the virtual base constructor. Is the tree-inline.c patch OK for trunk? commit 1375d880435fc21023a812ad17da209e44526b07 Author: Jason MerrillDate: Tue Oct 25 23:53:51 2016 -0400 Implement P0136R1, Rewording inheriting constructors. gcc/c-family/ * c.opt (-fnew-inheriting-ctors): New. * c-opts.c: Default to on for ABI 11+. gcc/cp/ * call.c (enum rejection_reason_code): Add rr_inherited_ctor. (inherited_ctor_rejection): New. (add_function_candidate): Reject inherited ctors for copying. (enforce_access): Use strip_inheriting_ctors. (print_z_candidate): Likewise. Handle rr_inherited_ctor. (convert_like_real): Avoid copying inheriting ctor parameters. (build_over_call): Likewise. A base ctor inheriting from vbase has no parms. Sorry about varargs. (joust): A local constructor beats inherited with the same convs. * class.c (add_method): Handle hiding inheriting ctors. (one_inherited_ctor): Handle new semantics. (add_implicitly_declared_members): Pass using_decl down. (build_clone): A base ctor inheriting from vbase has no parms. * cp-tree.h (DECL_INHERITED_CTOR): Store this instead of the base. (SET_DECL_INHERITED_CTOR): Likewise. (DECL_INHERITED_CTOR_BASE): Adjust. * constexpr.c: Adjust. * error.c (dump_function_decl): Decorate inheriting ctors. * init.c (emit_mem_initializers): Suppress access control in inheriting ctor. * mangle.c (write_special_name_constructor): Handle new inheriting ctor mangling. * method.c (strip_inheriting_ctors, inherited_ctor_binfo) (ctor_omit_inherited_parms, binfo_inherited_from): New. (synthesized_method_walk): Use binfo_inherited_from. Suppress access control in inheriting ctor. (deduce_inheriting_ctor): Deleted if ambiguous ctor inheritance. (maybe_explain_implicit_delete): Explain ambigous ctor inheritance. (add_one_base_init, do_build_copy_constructor): Adjust. (locate_fn_flags, explain_implicit_non_constexpr): Adjust. (implicitly_declare_fn): Adjust. (get_inherited_ctor): Remove. * name-lookup.c (do_class_using_decl): Check for indirect ctor inheritance. * optimize.c (cdtor_comdat_group): Adjust for new mangling. (maybe_clone_body): Handle omitted parms in base clone. (maybe_thunk_body): Don't thunk if base clone omits parms. * pt.c (tsubst_decl): Adjust. (instantiate_template_1): Suppress access control in inheriting ctor. (fn_type_unification): Do deduction with inherited ctor. * tree.c (special_function_p): Adjust. gcc/ * tree-inline.c (copy_tree_body_r): Only copy the taken branch of a COND_EXPR with constant condition. libiberty/ * cp-demangle.c (d_ctor_dtor_name): Handle inheriting constructor. diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index c399307..969b05c 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -914,6 +914,12 @@ c_common_post_options (const char **pfilename) if (flag_abi_version == 0) flag_abi_version = 11; + /* By default, enable the new inheriting constructor semantics along with ABI + 11. New and old should coexist fine, but it is a change in what + artificial symbols are
Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)
On Mon, Oct 31, 2016 at 04:09:48PM +0100, Steven Bosscher wrote: > On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote: > > This patch solves this problem by simply running the > > duplicate_computed_gotos > > pass again, as long as it does any work. The patch looks much bigger than > > it is, because I factored out two routines to simplify the control flow. > > It's made the patch a bit difficult to read. Condensing it a bit... Well, it would have a goto crossing a loop, or two gotos crossing each other, otherwise. I should have done it as two patches I guess (first factor, then change). > > + for (;;) > > { > > + if (n_basic_blocks_for_fn (fun) <= NUM_FIXED_BLOCKS + 1) > > + return 0; > > This test should not be needed in the loop. This pass can never > collapse the function to a single basic block. Yeah maybe, but that relies on quite a few assumptions. This is strictly an optimisation anyway, will move it outside the loop. > > + basic_block bb; > > + FOR_EACH_BB_FN (bb, fun) > > + { > > + /* Build the reorder chain for the original order of blocks. */ > > + if (bb->next_bb != EXIT_BLOCK_PTR_FOR_FN (fun)) > > + bb->aux = bb->next_bb; > > + } > > > > + duplicate_computed_gotos_find_candidates (fun, candidates, max_size); > > > > + bool changed = false; > > + if (!bitmap_empty_p (candidates)) > > + changed = duplicate_computed_gotos_do_duplicate (fun, candidates); > > > > + if (changed) > > + fixup_partitions (); > > + > > + cfg_layout_finalize (); > > I don't think you have to go into/out-of cfglayout mode for each iteration. Yeah probably. I was too lazy :-) It needs the cleanup_cfg every iteration though, I was afraid that interacts. > >/* Merge the duplicated blocks into predecessors, when possible. */ > > + if (changed) > > + cleanup_cfg (0); > > + else > > + break; > > } > > Maybe a gcc_assert that the loop doesn't iterate more often than num_edges? Good plan (num blocks even). Thanks, Segher
Re: [PATCH] DWARF5 .debug_line{,_str} support
On 10/19/2016 07:30 PM, Jakub Jelinek wrote: This patch adds support for DWARF5 .debug_line{,_str} section format, though only if !DWARF2_ASM_LINE_DEBUG_INFO, because otherwise .debug_line is emitted by the assembler. For that we'll need some coordination with gas, dunno if we want a new as directive for that, or as command line switch, or perhaps key DWARF5 .debug_line on the presence of .debug_line_str section (though the last one probably isn't a good idea, because -gsplit-dwarf doesn't use .debug_line_str). Could gas pay attention to the DWARF version in a unit header? + if (ndirs + idx_offset <= 256) + idx_form = DW_FORM_data1; + else if (ndirs + idx_offset <= 65536) + { + unsigned HOST_WIDE_INT sum = 1; + for (i = 0; i < numfiles; i++) + { + int file_idx = backmap[i]; + int dir_idx = dirs[files[file_idx].dir_idx].dir_idx; + sum += size_of_uleb128 (dir_idx); + } + if (sum >= HOST_WIDE_INT_UC (2) * (numfiles + 1)) + idx_form = DW_FORM_data2; + } Why can we choose DW_FORM_data1 based only on ndirs+idx_offset, but we need to look at the files table to choose DW_FORM_data2? This at least needs a comment. + dw2_asm_output_data (idx_form == DW_FORM_data1 ? 1 : 2, +0, NULL); It seems that we could use an output_data function that takes a DW_FORM as one of its arguments and does the appropriate thing. + if (dwarf_version >= 5 && str_form == DW_FORM_line_strp) + { + node = find_AT_string_in_table (filename0, debug_line_str_hash); + set_indirect_string (node); + node->form = DW_FORM_line_strp; + dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label, +debug_line_str_section, +"File Entry: %#x: \"%s\"", 0, node->str); + } + else + dw2_asm_output_nstring (filename0, -1, "File Entry: %#x", 0); Please factor this out into a separate function. + if (!DWARF2_ASM_LINE_DEBUG_INFO + && dwarf_version >= 5 + && !(DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET + || (DEBUG_STR_SECTION_FLAGS & SECTION_MERGE) == 0 + /* FIXME: For now. */ + || dwarf_split_debug_info)) This should also be a separate function or macro, since it's used more than once. And please elaborate the FIXME comment. Jason
Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)
On Sun, Oct 30, 2016 at 8:10 PM, Segher Boessenkool wrote: > This patch solves this problem by simply running the duplicate_computed_gotos > pass again, as long as it does any work. The patch looks much bigger than > it is, because I factored out two routines to simplify the control flow. It's made the patch a bit difficult to read. Condensing it a bit... > + for (;;) > { > + if (n_basic_blocks_for_fn (fun) <= NUM_FIXED_BLOCKS + 1) > + return 0; This test should not be needed in the loop. This pass can never collapse the function to a single basic block. > + clear_bb_flags (); > + cfg_layout_initialize (0); See comment below... > + basic_block bb; > + FOR_EACH_BB_FN (bb, fun) > + { > + /* Build the reorder chain for the original order of blocks. */ > + if (bb->next_bb != EXIT_BLOCK_PTR_FOR_FN (fun)) > + bb->aux = bb->next_bb; > + } > > + duplicate_computed_gotos_find_candidates (fun, candidates, max_size); > > + bool changed = false; > + if (!bitmap_empty_p (candidates)) > + changed = duplicate_computed_gotos_do_duplicate (fun, candidates); > > + if (changed) > + fixup_partitions (); > + > + cfg_layout_finalize (); I don't think you have to go into/out-of cfglayout mode for each iteration. >/* Merge the duplicated blocks into predecessors, when possible. */ > + if (changed) > + cleanup_cfg (0); > + else > + break; > } Maybe a gcc_assert that the loop doesn't iterate more often than num_edges? Ciao! Steven
Re: [PATCH] fix linker name for uClibc
On 10/31/2016 08:04 AM, Jeff Law wrote: On 10/28/2016 03:30 PM, Michael Eager wrote: On 10/28/2016 11:14 AM, Waldemar Brodkorb wrote: Hi, uClibc-ng can be used for Microblaze architecture. It is regulary tested with qemu-system-microblaze in little and big endian mode. 2016-10-28 Waldemar Brodkorbgcc/ * config/microblaze/linux.h: add UCLIBC_DYNAMIC_LINKER diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h index ae8523c..b3bf43a 100644 --- a/gcc/config/microblaze/linux.h +++ b/gcc/config/microblaze/linux.h @@ -29,6 +29,7 @@ #define TLS_NEEDS_GOT 1 #define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1" +#define UCLIBC_DYNAMIC_LINKER "/lib/ld-uClibc.so.0" #if TARGET_BIG_ENDIAN_DEFAULT == 0 /* LE */ #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:;:el}" best regards Waldemar OK to apply. Thanks. I don't believe Waldemar has write access, so I committed the patch for him. Thanks. -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: MAINTAINERS update
On Sat, 2016-10-29 at 12:10 -0500, Segher Boessenkool wrote: Hi Carl, > > On Fri, Oct 28, 2016 at 08:11:11AM -0700, Carl E. Love wrote: > > Manuel López-Ibáñez> > Martin v. Löwis > > H.J. Lu > > +Carl Love > > Christophe Lyon > > Luis Machado > > Ziga Mahkovec > > That is not alphabetical... Move it two up? > > Segher Seger: Yup, it is in the wrong order. I have updated the MAINTAINERS file and the ChangeLog. Carl Love - Index: ChangeLog === --- ChangeLog (revision 241702) +++ ChangeLog (working copy) @@ -1,5 +1,10 @@ 2016-10-27 Carl Love + * MAINTAINERS (Write After Approval): Fix my entry in the Write After + Approval list to make it alphabetical. + +2016-10-27 Carl Love + * MAINTAINERS (Write After Approval): Add myself. 2016-10-27 Andrew Burgess Index: MAINTAINERS === --- MAINTAINERS (revision 241702) +++ MAINTAINERS (working copy) @@ -477,9 +477,9 @@ Gabor Loki Sandra Loosemore Manuel López-Ibáñez +Carl Love Martin v. Löwis H.J. Lu -Carl Love Christophe Lyon Luis Machado Ziga Mahkovec
Re: [PATCH] fix linker name for uClibc
On 10/28/2016 03:30 PM, Michael Eager wrote: On 10/28/2016 11:14 AM, Waldemar Brodkorb wrote: Hi, uClibc-ng can be used for Microblaze architecture. It is regulary tested with qemu-system-microblaze in little and big endian mode. 2016-10-28 Waldemar Brodkorbgcc/ * config/microblaze/linux.h: add UCLIBC_DYNAMIC_LINKER diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h index ae8523c..b3bf43a 100644 --- a/gcc/config/microblaze/linux.h +++ b/gcc/config/microblaze/linux.h @@ -29,6 +29,7 @@ #define TLS_NEEDS_GOT 1 #define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1" +#define UCLIBC_DYNAMIC_LINKER "/lib/ld-uClibc.so.0" #if TARGET_BIG_ENDIAN_DEFAULT == 0 /* LE */ #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:;:el}" best regards Waldemar OK to apply. Thanks. I don't believe Waldemar has write access, so I committed the patch for him. jeff
Re: [PATCH] Fix host_size_t_cst_p predicate
On 10/31/2016 12:11 PM, Richard Biener wrote: > On Mon, Oct 31, 2016 at 11:18 AM, Richard Sandiford >wrote: >> Richard Biener writes: >>> On Mon, Oct 31, 2016 at 10:58 AM, Richard Sandiford >>> wrote: Richard Biener writes: > On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška wrote: >> On 10/31/2016 01:12 AM, Richard Sandiford wrote: >>> Richard Biener writes: On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška wrote: > On 10/27/2016 03:35 PM, Richard Biener wrote: >> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška wrote: >>> Running simple test-case w/o the proper header file causes ICE: >>> strncmp ("a", "b", -1); >>> >>> 0xe74462 tree_to_uhwi(tree_node const*) >>> ../../gcc/tree.c:7324 >>> 0x90a23f host_size_t_cst_p >>> ../../gcc/fold-const-call.c:63 >>> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*, >>> tree_node*, tree_node*) >>> ../../gcc/fold-const-call.c:1512 >>> 0x787b01 fold_builtin_3 >>> ../../gcc/builtins.c:8385 >>> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, >>> int, bool) >>> ../../gcc/builtins.c:8465 >>> 0x9052b1 fold(tree_node*) >>> ../../gcc/fold-const.c:11919 >>> 0x6de2bb c_fully_fold_internal >>> ../../gcc/c/c-fold.c:185 >>> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*) >>> ../../gcc/c/c-fold.c:90 >>> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*) >>> ../../gcc/c/c-typeck.c:10369 >>> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*) >>> ../../gcc/c/c-typeck.c:10414 >>> 0x6cb578 c_parser_statement_after_labels >>> ../../gcc/c/c-parser.c:5430 >>> 0x6cd333 c_parser_compound_statement_nostart >>> ../../gcc/c/c-parser.c:4944 >>> 0x6cdbde c_parser_compound_statement >>> ../../gcc/c/c-parser.c:4777 >>> 0x6c93ac c_parser_declaration_or_fndef >>> ../../gcc/c/c-parser.c:2176 >>> 0x6d51ab c_parser_external_declaration >>> ../../gcc/c/c-parser.c:1574 >>> 0x6d5c09 c_parser_translation_unit >>> ../../gcc/c/c-parser.c:1454 >>> 0x6d5c09 c_parse_file() >>> ../../gcc/c/c-parser.c:18173 >>> 0x72ffd2 c_common_parse_file() >>> ../../gcc/c-family/c-opts.c:1087 >>> >>> Following patch improves the host_size_t_cst_p predicate. >>> >>> Patch can bootstrap on ppc64le-redhat-linux and survives >>> regression tests. >>> >>> Ready to be installed? >> >> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * >> CHAR_BIT test is now redundant. >> >> OTOH it was probably desired to allow -1 here? A little looking back >> in time should tell. > > Ok, it started with r229922, where it was changed from: > > if (tree_fits_uhwi_p (len) && p1 && p2) > { > const int i = strncmp (p1, p2, tree_to_uhwi (len)); > ... > > to current version: > > case CFN_BUILT_IN_STRNCMP: > { > bool const_size_p = host_size_t_cst_p (arg2, ); > > Thus I'm suggesting to change to back to it. > > Ready to be installed? Let's ask Richard. >>> >>> The idea with the: >>> >>> wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT >>> >>> test was to stop us attempting 64-bit size_t operations on ILP32 hosts. >>> I think we still want that. >> >> OK, so is the consensus to add tree_fits_uhwi_p predicate to the current >> wi::min_precision check, right? > > Not sure. If we have host_size_t_cst_p then we should have a > corresponding > size_t host_size_t (const_tree) and should use those in pairs. Not sure > why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p. It's the other way around: something can satisfy tree_fits_uhwi_p (i.e. fit within a uint64_t) but not fit within the host's size_t. The kind of case I'm thinking of is: strncmp ("fi", "fo", (1L << 32) + 1) for an ILP32 host and LP64 target. There's a danger that by passing the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd truncate it to 1, giving the wrong result. >>> >>> Yes, but if it passes host_size_t_cst_p why does tree_to_uhwi ICE then? >>> (unless we have a > 64bit host size_t). >> >> Because in Martin's test
[PATCH 2/2, i386]: Implement TARGET_EXPAND_DIVMOD_LIBFUNC
Attached patch builds on previous libgcc patch and implements TARGET_EXPAND_DIVMOD_LIBFUNC target hook. The same approach can also be used on other targets without hardware divmod insn. Patch also fixes several gcc.dg/divmod-?.c testsuite failures for x86_64 multilibs. 2016-10-31 Uros Bizjak* config/i386/i386.c (ix86_init_libfuncs): New. Call darwin_rename_builtins here. (ix86_expand_divmod_libfunc): New. (TARGET_INIT_LIBFUNCS): Unconditionally define to ix86_init_libfuncs. (TARGET_EXPAND_DIVMOD_LIBFUNC): Define. testsuite/Changelog: 2016-10-31 Uros Bizjak * lib/target-supports.exp (check_effective_target_divmod): Add i?86-*-*. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 241697) +++ config/i386/i386.c (working copy) @@ -50571,9 +50571,44 @@ ix86_addr_space_zero_address_valid (addr_space_t a { return as != ADDR_SPACE_GENERIC; } -#undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID -#define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID ix86_addr_space_zero_address_valid +static void +ix86_init_libfuncs (void) +{ + if (TARGET_64BIT) +{ + set_optab_libfunc (sdivmod_optab, TImode, "__divmodti4"); + set_optab_libfunc (udivmod_optab, TImode, "__udivmodti4"); +} + else +{ + set_optab_libfunc (sdivmod_optab, DImode, "__divmoddi4"); + set_optab_libfunc (udivmod_optab, DImode, "__udivmoddi4"); +} + +#if TARGET_MACHO + darwin_rename_builtins (); +#endif +} + +/* Generate call to __divmoddi4. */ + +static void +ix86_expand_divmod_libfunc (rtx libfunc, machine_mode mode, + rtx op0, rtx op1, + rtx *quot_p, rtx *rem_p) +{ + rtx rem = assign_386_stack_local (mode, SLOT_TEMP); + + rtx quot = emit_library_call_value (libfunc, NULL_RTX, LCT_NORMAL, + mode, 3, + op0, GET_MODE (op0), + op1, GET_MODE (op1), + XEXP (rem, 0), Pmode); + *quot_p = quot; + *rem_p = rem; +} + /* Initialize the GCC target structure. */ #undef TARGET_RETURN_IN_MEMORY #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory @@ -50957,11 +50992,6 @@ ix86_addr_space_zero_address_valid (addr_space_t a #undef TARGET_CONDITIONAL_REGISTER_USAGE #define TARGET_CONDITIONAL_REGISTER_USAGE ix86_conditional_register_usage -#if TARGET_MACHO -#undef TARGET_INIT_LIBFUNCS -#define TARGET_INIT_LIBFUNCS darwin_rename_builtins -#endif - #undef TARGET_LOOP_UNROLL_ADJUST #define TARGET_LOOP_UNROLL_ADJUST ix86_loop_unroll_adjust @@ -51052,6 +51082,15 @@ ix86_addr_space_zero_address_valid (addr_space_t a #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1 +#undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID +#define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID ix86_addr_space_zero_address_valid + +#undef TARGET_INIT_LIBFUNCS +#define TARGET_INIT_LIBFUNCS ix86_init_libfuncs + +#undef TARGET_EXPAND_DIVMOD_LIBFUNC +#define TARGET_EXPAND_DIVMOD_LIBFUNC ix86_expand_divmod_libfunc + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-i386.h" Index: testsuite/lib/target-supports.exp === --- testsuite/lib/target-supports.exp (revision 241697) +++ testsuite/lib/target-supports.exp (working copy) @@ -8110,7 +8110,7 @@ proc check_effective_target_divmod { } { #TODO: Add checks for all targets that have either hardware divmod insn # or define libfunc for divmod. if { [istarget arm*-*-*] -|| [istarget x86_64-*-*] } { +|| [istarget x86_64-*-*] || [istarget i?86-*-*] } { return 1 } return 0
[PATCH 1/2, libgcc]: Implement _divmoddi4
This function will be used in a follow-up patch to implement TARGET_EXPAND_DIVMOD_LIBFUNC for x86 targets. Other targets can call this function, so IMO it should be part of a generic library. 2016-10-31 Uros Bizjak* Makefile.in (LIB2_DIVMOD_FUNCS): Add _divmoddi4. * libgcc2.c (__divmoddi4): New function. * libgcc2.h (__divmoddi4): Declare. * libgcc-std.ver.in (GCC_7.0.0): New. Add __PFX_divmoddi4 and __PFX_divmodti4. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} with a follow-up target-dependent patch. OK for mainline? Uros. Index: Makefile.in === --- Makefile.in (revision 241697) +++ Makefile.in (working copy) @@ -255,10 +255,6 @@ LIB2FUNCS_ST = _eprintf __gcc_bcmp # List of functions not to build from libgcc2.c. LIB2FUNCS_EXCLUDE = -# These might cause a divide overflow trap and so are compiled with -# unwinder info. -LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _udivdi3 _umoddi3 _udiv_w_sdiv _udivmoddi4 - # List of extra C and assembler files to add to static and shared libgcc2. # Assembler files should have names ending in `.S'. LIB2ADD = @@ -455,7 +451,8 @@ endif # These might cause a divide overflow trap and so are compiled with # unwinder info. -LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _udivdi3 _umoddi3 _udiv_w_sdiv _udivmoddi4 +LIB2_DIVMOD_FUNCS = _divdi3 _moddi3 _divmoddi4 \ + _udivdi3 _umoddi3 _udivmoddi4 _udiv_w_sdiv # Remove any objects from lib2funcs and LIB2_DIVMOD_FUNCS that are # defined as optimized assembly code in LIB1ASMFUNCS or as C code Index: libgcc-std.ver.in === --- libgcc-std.ver.in (revision 241697) +++ libgcc-std.ver.in (working copy) @@ -1938,3 +1938,9 @@ GCC_4.7.0 { %inherit GCC_4.8.0 GCC_4.7.0 GCC_4.8.0 { } + +%inherit GCC_7.0.0 GCC_4.8.0 +GCC_7.0.0 { + __PFX__divmoddi4 + __PFX__divmodti4 +} Index: libgcc2.c === --- libgcc2.c (revision 241697) +++ libgcc2.c (working copy) @@ -680,7 +680,8 @@ __udiv_w_sdiv (UWtype *rp __attribute__ ((__unused #endif #if (defined (L_udivdi3) || defined (L_divdi3) || \ - defined (L_umoddi3) || defined (L_moddi3)) + defined (L_umoddi3) || defined (L_moddi3) || \ + defined (L_divmoddi4)) #define L_udivmoddi4 #endif @@ -937,7 +938,8 @@ __parityDI2 (UDWtype x) #ifdef TARGET_HAS_NO_HW_DIVIDE #if (defined (L_udivdi3) || defined (L_divdi3) || \ - defined (L_umoddi3) || defined (L_moddi3)) + defined (L_umoddi3) || defined (L_moddi3) || \ + defined (L_divmoddi4)) static inline __attribute__ ((__always_inline__)) #endif UDWtype @@ -1004,7 +1006,8 @@ __udivmoddi4 (UDWtype n, UDWtype d, UDWtype *rp) #else #if (defined (L_udivdi3) || defined (L_divdi3) || \ - defined (L_umoddi3) || defined (L_moddi3)) + defined (L_umoddi3) || defined (L_moddi3) || \ + defined (L_divmoddi4)) static inline __attribute__ ((__always_inline__)) #endif UDWtype @@ -1269,6 +1272,34 @@ __moddi3 (DWtype u, DWtype v) } #endif +#ifdef L_divmoddi4 +DWtype +__divmoddi4 (DWtype u, DWtype v, DWtype *rp) +{ + Wtype c1 = 0, c2 = 0; + DWunion uu = {.ll = u}; + DWunion vv = {.ll = v}; + DWtype w; + DWtype r; + + if (uu.s.high < 0) +c1 = ~c1, c2 = ~c2, +uu.ll = -uu.ll; + if (vv.s.high < 0) +c1 = ~c1, +vv.ll = -vv.ll; + + w = __udivmoddi4 (uu.ll, vv.ll, (UDWtype*)); + if (c1) +w = -w; + if (c2) +r = -r; + + *rp = r; + return w; +} +#endif + #ifdef L_umoddi3 UDWtype __umoddi3 (UDWtype u, UDWtype v) Index: libgcc2.h === --- libgcc2.h (revision 241697) +++ libgcc2.h (working copy) @@ -281,6 +281,7 @@ typedef int shift_count_type __attribute__((mode ( #define __ashrdi3 __NDW(ashr,3) #define __cmpdi2 __NDW(cmp,2) #define __ucmpdi2 __NDW(ucmp,2) +#define __divmoddi4__NDW(divmod,4) #define __udivmoddi4 __NDW(udivmod,4) #define __fixunstfDI __NDW(fixunstf,) #define __fixtfdi __NDW(fixtf,) @@ -376,10 +377,12 @@ extern DWtype __divdi3 (DWtype, DWtype); extern UDWtype __udivdi3 (UDWtype, UDWtype); extern UDWtype __umoddi3 (UDWtype, UDWtype); extern DWtype __moddi3 (DWtype, DWtype); +extern DWtype __divmoddi4 (DWtype, DWtype, DWtype *); /* __udivmoddi4 is static inline when building other libgcc2 portions. */ #if (!defined (L_udivdi3) && !defined (L_divdi3) && \ - !defined (L_umoddi3) && !defined (L_moddi3)) + !defined (L_umoddi3) && !defined (L_moddi3) && \ + !defined (L_divmoddi4)) extern UDWtype __udivmoddi4 (UDWtype, UDWtype, UDWtype *); #endif
Re: [PATCH] Emit DW_AT_inline for C++17 inline variables
On Mon, Oct 31, 2016 at 10:25 AM, Jakub Jelinekwrote: > On Mon, Oct 31, 2016 at 09:52:28AM -0400, Jason Merrill wrote: >> On 10/14/2016 01:33 PM, Jakub Jelinek wrote: >> >This also uses the infrastructure of the langhook patch I've sent earlier. >> >It emits (if not strict dwarf) DW_AT_inline on explicit or implicit inline >> >variables, and also tweaks dwarf2out so that for inline static data members >> >we consider in-class declarations as definitions (emit DW_AT_linkage_name >> >and no DW_AT_declaration) for them. >> >> Hmm, so there's no DW_TAG_variable for the inline static data member? That >> seems problematic to me. The DWARF 3 convention that static data members >> use DW_TAG_member seems rather misguided, since in fact they are variables. >> Why did that change? > > I had to bisect it, apparently it has been changed by myself in > r145770, rationale in thread starting at > https://gcc.gnu.org/ml/gcc-patches/2009-04/msg00039.html > > The current DWARF 5 wording is: > "If the variable entry represents the defining declaration for a C++ static > data > member of a structure, class or union, the entry has a DW_AT_specification > attribute, whose value is a reference to the debugging information entry > representing the declaration of this data member. The referenced entry has > the tag DW_TAG_member and will be a child of some class, structure or > union type entry." on page 98 in DWARF5_Public_Review.pdf. Yes, this changed in DWARF 3; DWARF 2 didn't specify the tag. I think this was a mistake. > Incidentally, I've filed today a DWARF issue that Appendix A doesn't list > for DW_TAG_member lots of attributes that are allowed for DW_TAG_variable > and are useful for static data members. Using DW_TAG_variable would address that, too. Jason
Re: [PATCH] DWARF5 DW_FORM_implicit_const support
On 10/17/2016 05:45 PM, Jakub Jelinek wrote: +/* First abbrev_id that can be optimized based on usage. */ +static unsigned int abbrev_opt_start; + +/* Vector of usage counts during build_abbrev_table. Indexed by + abbrev_id - abbrev_opt_start. */ +static vec abbrev_usage_count; + +/* Vector of all DIEs added with die_abbrev >= abbrev_opt_start. */ +static vec abbrev_dies; I think these could use a comment that they are temporary data only live between build_abbrev_table and optimize_abbrev_table. And perhaps the abbrev_dies variable should be named something like sorted_abbrev_dies to avoid confusion with abbrev_die_table. OK with that change. Jason
[Patch, testsuite] Skip gcc.dg/lto/pr60449_0.c for avr
Hi, gcc.dg/lto/pr60449_0.c fails to link for the avr target because it doesn't have an implementation for gettimeofday. This patch therefore skips the test for avr. Committed to trunk as obvious. Regards Senthil gcc/testsuite/ChangeLog 2016-10-31 Senthil Kumar Selvaraj* gcc.dg/lto/pr60449_0.c: Skip for avr. Index: gcc/testsuite/gcc.dg/lto/pr60449_0.c === --- gcc/testsuite/gcc.dg/lto/pr60449_0.c(revision 241697) +++ gcc/testsuite/gcc.dg/lto/pr60449_0.c(working copy) @@ -1,4 +1,5 @@ /* { dg-lto-do link } */ +/* { dg-skip-if "Needs gettimeofday" { "avr-*-*" } } */ extern int printf (const char *__restrict __format, ...); typedef long int __time_t;
Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable
On 31/10/16 13:42, Richard Earnshaw (lists) wrote: On 31/10/16 11:54, Kyrill Tkachov wrote: On 24/10/16 17:15, Andrew Pinski wrote: On Mon, Oct 24, 2016 at 7:27 AM, Kyrill Tkachovwrote: Hi all, When storing a 64-bit immediate that has equal bottom and top halves we currently synthesize the repeating 32-bit pattern twice and perform a single X-store. With this patch we synthesize the 32-bit pattern once into a W register and store that twice using an STP. This reduces codesize bloat from synthesising the same constant multiple times at the expense of converting a store to a store-pair. It will only trigger if we can save two or more instructions, so it will only transform: mov x1, 49370 movkx1, 0xc0da, lsl 32 str x1, [x0] into: mov w1, 49370 stp w1, w1, [x0] when optimising for -Os, whereas it will always transform a 4-insn synthesis sequence into a two-insn sequence + STP (see comments in the patch). This patch triggers already but will trigger more with the store merging pass that I'm working on since that will generate more of these repeating 64-bit constants. This helps improve codegen on 456.hmmer where store merging can sometimes create very complex repeating constants and target-specific expand needs to break them down. Doing STP might be worse on ThunderX 1 than the mov/movk. Or this might cause an ICE with -mcpu=thunderx; I can't remember if the check for slow unaligned store pair word is with the pattern or not. I can't get it to ICE with -mcpu=thunderx. The restriction is just on the STP forming code in the sched-fusion hooks AFAIK. Basically the rule is 1) if 4 byte aligned, then it is better to do two str. 2) If 8 byte aligned, then doing stp is good 3) Otherwise it is better to do two str. Ok, then I'll make the function just emit two stores and depend on the sched-fusion machinery to fuse them into an STP when appropriate since that has the logic that takes thunderx into account. If the mode is DImode (ie the pattern is 'movdi', then surely we must have a 64-bit aligned store. I don't think that's guaranteed. At least the Standard Names documentation doesn't mention it. I've gone through an example with gdb where store merging produces an unaligned store and the gen_movdi expander is called with a source of (const_int 8589934593 [0x20001]) and a destination of: (mem:DI (reg/v/f:DI 73 [ a ]) [1 MEM[(int *)a_2(D)]+0 S8 A32]) i.e. a 32-bit aligned DImode MEM. Thanks, Kyrill R. Thanks for the info. Kyrill Thanks, Andrew Bootstrapped and tested on aarch64-none-linux-gnu. Ok for trunk? Thanks, Kyrill 2016-10-24 Kyrylo Tkachov * config/aarch64/aarch64.md (mov): Call aarch64_split_dimode_const_store on DImode constant stores. * config/aarch64/aarch64-protos.h (aarch64_split_dimode_const_store): New prototype. * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New function. 2016-10-24 Kyrylo Tkachov * gcc.target/aarch64/store_repeating_constant_1.c: New test. * gcc.target/aarch64/store_repeating_constant_2.c: Likewise.
Re: [PATCH] Emit DW_AT_inline for C++17 inline variables
On Mon, Oct 31, 2016 at 09:52:28AM -0400, Jason Merrill wrote: > On 10/14/2016 01:33 PM, Jakub Jelinek wrote: > >This also uses the infrastructure of the langhook patch I've sent earlier. > >It emits (if not strict dwarf) DW_AT_inline on explicit or implicit inline > >variables, and also tweaks dwarf2out so that for inline static data members > >we consider in-class declarations as definitions (emit DW_AT_linkage_name > >and no DW_AT_declaration) for them. > > Hmm, so there's no DW_TAG_variable for the inline static data member? That > seems problematic to me. The DWARF 3 convention that static data members > use DW_TAG_member seems rather misguided, since in fact they are variables. > Why did that change? I had to bisect it, apparently it has been changed by myself in r145770, rationale in thread starting at https://gcc.gnu.org/ml/gcc-patches/2009-04/msg00039.html The current DWARF 5 wording is: "If the variable entry represents the defining declaration for a C++ static data member of a structure, class or union, the entry has a DW_AT_specification attribute, whose value is a reference to the debugging information entry representing the declaration of this data member. The referenced entry has the tag DW_TAG_member and will be a child of some class, structure or union type entry." on page 98 in DWARF5_Public_Review.pdf. And page 117 has in 5.7.6: "A data member (as opposed to a member function) is represented by a debugging information entry with the tag DW_TAG_member." So, using DW_TAG_variable as DW_TAG_{structure,class}_type children for the non-inline static data members would violate these two. C++17 is outside of the scope of DWARF5 I guess, so the question is to find out what do we want for those and agree on that with consumers and other producers. Incidentally, I've filed today a DWARF issue that Appendix A doesn't list for DW_TAG_member lots of attributes that are allowed for DW_TAG_variable and are useful for static data members. Some of them we want anyway, at least unless we agree on the non-inline static data members no longer be DW_TAG_member in the structure/class type and change the standard. In particular DW_AT_const_value and DW_AT_const_expr are desirable even for the static data members in C++98..14 (or non-inline ones in C++17). The rest, like DW_AT_linkage_name, DW_AT_location, etc. are generally more useful mainly for the inline ones. Jakub
[PATCH] Fix PR78047
I am testing the following to fix another latent PTA bug uncovered by pointer comparsion folding. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2016-10-31 Richard BienerPR tree-optimization/78047 * tree-ssa-structalias.c (push_fields_onto_fieldstack): Initialize fake field at offset zero conservatively regarding to may_have_pointers. diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c index fb364f1..2880382 100644 --- a/gcc/tree-ssa-structalias.c +++ b/gcc/tree-ssa-structalias.c @@ -5566,7 +5568,7 @@ push_fields_onto_fieldstack (tree type, vec *fieldstack, && offset + foff != 0) { fieldoff_s e - = {0, offset + foff, false, false, false, false, NULL_TREE}; + = {0, offset + foff, false, false, true, false, NULL_TREE}; pair = fieldstack->safe_push (e); }
Re: [PATCH] DWARF5 DW_FORM_data16 support
On 10/17/2016 05:39 PM, Jakub Jelinek wrote: (dwarf_version >= 5 ? 128 : 64) Please make this a macro. OK with that change. Jason
Re: [PATCH] DWARF5 DW_TAG_call_site support
OK. Jason
Re: [PATCH] Emit DWARF5 DW_AT_reference and DW_AT_rvalue_reference
On 10/24/2016 03:09 PM, Jakub Jelinek wrote: On Mon, Oct 24, 2016 at 02:34:04PM -0400, Jason Merrill wrote: On Mon, Oct 24, 2016 at 10:29 AM, Jakub Jelinekwrote: This is another addition in DWARF5. The patch emits these attributes only for DW_TAG_subprogram for non-static ref-qualified member functions. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Thanks, committed. We really should emit it also for DW_TAG_subroutine_type for PMF types, the problem is that for FUNCTION_TYPE/METHOD_TYPE, dwarf2out.c uses type_main_variant and get_qualified_type, neither of them understand ref-qualifies. I think we'd need to change get_qualified_type to use some lang-hook which for C++ would also check the ref qualifiers, and then for these two avoid type_main_variant and instead use get_qualified_type to get the non-const/volatile/restrict etc. qualified one (and only then we can use the decl_dwarf_attribute langhook to ask the FE whether the attribute should be emitted). Does that make sense, or do you have another approach? Can we pull out the ref-qualifier before we clobber the qualifiers on the type? We could handle it like "reverse" flag, basically replace bool reverse with int that would hold bitmask of the flags, but I really don't understand how the reverse flag works right now, I don't see how it can work reliably when we use and rely on equate_type_number_to_die/lookup_type_die mechanism where we store just a single DIE for each type. We either equate it to a type with DW_AT_endianity present, or not, and then the rest of the lookups will be just random. I'm afraid to make it work reliably we'd need to somehow store both the "reverse" and "!reverse" DIEs for each type DIE, and similarly for the & and && flags. Or by changing get_qualified_die (in particular check_base_type) to use a langhook, we could at least for DW_AT_{reference,rvalue_reference} just use equate_type_number_to_die/lookup_type_die reliably (DW_AT_endianity issue remains). Yeah, I think that adding a langhook is the right way to go. The generic code clobbering C++ qualifiers is a frequent headache. Jason
Re: [PATCH] For -gdwarf-5 emit DWARF5 .debug_macro
OK.
Re: [PATCH] Emit DW_AT_inline for C++17 inline variables
On 10/14/2016 01:33 PM, Jakub Jelinek wrote: This also uses the infrastructure of the langhook patch I've sent earlier. It emits (if not strict dwarf) DW_AT_inline on explicit or implicit inline variables, and also tweaks dwarf2out so that for inline static data members we consider in-class declarations as definitions (emit DW_AT_linkage_name and no DW_AT_declaration) for them. Hmm, so there's no DW_TAG_variable for the inline static data member? That seems problematic to me. The DWARF 3 convention that static data members use DW_TAG_member seems rather misguided, since in fact they are variables. Why did that change? Jason
Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable
On 31/10/16 11:54, Kyrill Tkachov wrote: > > On 24/10/16 17:15, Andrew Pinski wrote: >> On Mon, Oct 24, 2016 at 7:27 AM, Kyrill Tkachov >>wrote: >>> Hi all, >>> >>> When storing a 64-bit immediate that has equal bottom and top halves we >>> currently >>> synthesize the repeating 32-bit pattern twice and perform a single >>> X-store. >>> With this patch we synthesize the 32-bit pattern once into a W >>> register and >>> store >>> that twice using an STP. This reduces codesize bloat from >>> synthesising the >>> same >>> constant multiple times at the expense of converting a store to a >>> store-pair. >>> It will only trigger if we can save two or more instructions, so it will >>> only transform: >>> mov x1, 49370 >>> movkx1, 0xc0da, lsl 32 >>> str x1, [x0] >>> >>> into: >>> >>> mov w1, 49370 >>> stp w1, w1, [x0] >>> >>> when optimising for -Os, whereas it will always transform a 4-insn >>> synthesis >>> sequence into a two-insn sequence + STP (see comments in the patch). >>> >>> This patch triggers already but will trigger more with the store merging >>> pass >>> that I'm working on since that will generate more of these repeating >>> 64-bit >>> constants. >>> This helps improve codegen on 456.hmmer where store merging can >>> sometimes >>> create very >>> complex repeating constants and target-specific expand needs to break >>> them >>> down. >> >> Doing STP might be worse on ThunderX 1 than the mov/movk. Or this >> might cause an ICE with -mcpu=thunderx; I can't remember if the check >> for slow unaligned store pair word is with the pattern or not. > > I can't get it to ICE with -mcpu=thunderx. > The restriction is just on the STP forming code in the sched-fusion > hooks AFAIK. > >> Basically the rule is >> 1) if 4 byte aligned, then it is better to do two str. >> 2) If 8 byte aligned, then doing stp is good >> 3) Otherwise it is better to do two str. > > Ok, then I'll make the function just emit two stores and depend on the > sched-fusion > machinery to fuse them into an STP when appropriate since that has the > logic that > takes thunderx into account. If the mode is DImode (ie the pattern is 'movdi', then surely we must have a 64-bit aligned store. R. > > Thanks for the info. > Kyrill > >> >> Thanks, >> Andrew >> >> >>> Bootstrapped and tested on aarch64-none-linux-gnu. >>> >>> Ok for trunk? >>> >>> Thanks, >>> Kyrill >>> >>> 2016-10-24 Kyrylo Tkachov >>> >>> * config/aarch64/aarch64.md (mov): Call >>> aarch64_split_dimode_const_store on DImode constant stores. >>> * config/aarch64/aarch64-protos.h >>> (aarch64_split_dimode_const_store): >>> New prototype. >>> * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New >>> function. >>> >>> 2016-10-24 Kyrylo Tkachov >>> >>> * gcc.target/aarch64/store_repeating_constant_1.c: New test. >>> * gcc.target/aarch64/store_repeating_constant_2.c: Likewise. >
Re: [PATCH 0/3] use rtx_insn * more
On 10/28/2016 01:13 PM, tbsaunde+...@tbsaunde.org wrote: From: Trevor SaundersHI, This series changes various variables type from rtx to rtx_insn * so that the remaining patches in this series http://gcc.gnu.org/ml/gcc-patches/2016-10/msg01353.html can be applied. patches bootstrapped and regtested on x86_64-linux-gnu, and run through config-list.mk, ok? Thanks! Trev Trevor Saunders (3): use rtx_insn * in various places where it is obvious split up the trial variable in reorg.c:relax_delay_slots to use rtx_insn * more split up some variables to use rtx_insn * more All 3 patches in this series are fine. Thanks, Jeff
Re: [C++ PATCH] Fix -Wimplicit-fallthrough in templates (PR c++/77886)
OK.
Re: [C++ PATCH] Add tests for a const member and a reference member for launder.
On Sun, Oct 30, 2016 at 2:53 PM, Ville Voutilainenwrote: > So, how about adding these? They should give us regression tests > that make sure launder does the right in in case some optimizations > attempt to reuse const/reference members. Good idea. You might put the reuse in a separate function in order to hide it from the optimizer. Jason
Re: relax rule for flexible array members in 6.x (78039 - fails to compile glibc tests)
OK. Jason
Re: [PATCH] Fix PR78035
On 2016.10.31 at 13:46 +0100, Richard Biener wrote: > > This is an updated patch for PR78035 which is about optimizing > address comparisons using points-to info vs. directly looking > at two addr-exprs. I've given up trying to dissect a predicate > more useful than decl_binds_to_current_def_p from > symtab_node::equal_address_to and thus I'm going with the former. It also fixes the kernel issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77964 Unfortunately Chromium still gets miscompiled: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78047 -- Markus
[PATCH] Fix PR78129
This fixes leftovers in /tmp from -Werror=suggest-final-types. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2016-10-31 Richard BienerPR lto/78129 * lto.c (do_whole_program_analysis): Bail out after errors from WPA analysis. Index: gcc/lto/lto.c === --- gcc/lto/lto.c (revision 241695) +++ gcc/lto/lto.c (working copy) @@ -3092,6 +3092,10 @@ do_whole_program_analysis (void) execute_ipa_pass_list (g->get_passes ()->all_regular_ipa_passes); + /* When WPA analysis raises errors, do not bother to output anything. */ + if (seen_error ()) +return; + if (symtab->dump_file) { fprintf (symtab->dump_file, "Optimized ");
[PATCH] Fix PR78035
This is an updated patch for PR78035 which is about optimizing address comparisons using points-to info vs. directly looking at two addr-exprs. I've given up trying to dissect a predicate more useful than decl_binds_to_current_def_p from symtab_node::equal_address_to and thus I'm going with the former. Bootstrap & regtest pending on x86_64-unknown-linux-gnu. Richard. 2016-10-31 Richard BienerPR tree-optimization/78035 * gimple-pretty-print.c (pp_points_to_solution): Print vars_contains_interposable. * tree-ssa-alias.c: Include varasm.h. (ptrs_compare_unequal): Check vars_contains_interposable and decl_binds_to_current_def_p. (dump_points_to_solution): Dump vars_contains_interposable. * tree-ssa-alias.h (struct pt_solution): Add vars_contains_interposable flag. * tree-ssa-structalias.c: Include varasm.h. (set_uids_in_ptset): Record whether vars contains a not decl_binds_to_current_def_p variable in vars_contains_interposable. (ipa_escaped_pt): Update initializer. * gcc.target/i386/pr78035.c: New testcase. Index: trunk/gcc/testsuite/gcc.target/i386/pr78035.c === --- trunk/gcc/testsuite/gcc.target/i386/pr78035.c (revision 0) +++ trunk/gcc/testsuite/gcc.target/i386/pr78035.c (working copy) @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +extern int a; +extern int b; +extern int c; + +int foo(int choose_a) +{ + int *p; + if (choose_a) +p = + else +p = + return p != +} + +int bar () +{ + return != +} + +/* We should not optimize away either comparison. */ +/* { dg-final { scan-assembler-times "cmp" 2 } } */ diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c index 9d47f7f..8286326 100644 --- a/gcc/gimple-pretty-print.c +++ b/gcc/gimple-pretty-print.c @@ -728,6 +728,12 @@ pp_points_to_solution (pretty_printer *buffer, struct pt_solution *pt) { pp_string (buffer, comma); pp_string (buffer, "restrict"); + comma = ", "; + } + if (pt->vars_contains_interposable) + { + pp_string (buffer, comma); + pp_string (buffer, "interposable"); } pp_string (buffer, ")"); } diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c index 26c9f9e..ebae6cf 100644 --- a/gcc/tree-ssa-alias.c +++ b/gcc/tree-ssa-alias.c @@ -32,12 +32,12 @@ along with GCC; see the file COPYING3. If not see #include "tree-pretty-print.h" #include "alias.h" #include "fold-const.h" - #include "langhooks.h" #include "dumpfile.h" #include "tree-eh.h" #include "tree-dfa.h" #include "ipa-reference.h" +#include "varasm.h" /* Broad overview of how alias analysis on gimple works: @@ -373,14 +373,18 @@ ptrs_compare_unequal (tree ptr1, tree ptr2) /* We may not use restrict to optimize pointer comparisons. See PR71062. So we have to assume that restrict-pointed-to may be in fact obj1. */ - if (!pi || pi->pt.vars_contains_restrict) + if (!pi + || pi->pt.vars_contains_restrict + || pi->pt.vars_contains_interposable) return false; if (VAR_P (obj1) && (TREE_STATIC (obj1) || DECL_EXTERNAL (obj1))) { varpool_node *node = varpool_node::get (obj1); /* If obj1 may bind to NULL give up (see below). */ - if (! node || ! node->nonzero_address ()) + if (! node + || ! node->nonzero_address () + || ! decl_binds_to_current_def_p (obj1)) return false; } return !pt_solution_includes (>pt, obj1); @@ -553,7 +557,12 @@ dump_points_to_solution (FILE *file, struct pt_solution *pt) comma = ", "; } if (pt->vars_contains_restrict) - fprintf (file, "%srestrict", comma); + { + fprintf (file, "%srestrict", comma); + comma = ", "; + } + if (pt->vars_contains_interposable) + fprintf (file, "%sinterposable", comma); fprintf (file, ")"); } } diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h index 27a06fc..810d83c 100644 --- a/gcc/tree-ssa-alias.h +++ b/gcc/tree-ssa-alias.h @@ -57,6 +57,8 @@ struct GTY(()) pt_solution /* Nonzero if the vars bitmap includes a anonymous variable used to represent storage pointed to by a restrict qualified pointer. */ unsigned int vars_contains_restrict : 1; + /* Nonzero if the vars bitmap includes an interposable variable. */ + unsigned int vars_contains_interposable : 1; /* Set of variables that this pointer may point to. */ bitmap vars; diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c index fb364f1..10434f6 100644 --- a/gcc/tree-ssa-structalias.c +++ b/gcc/tree-ssa-structalias.c @@ -39,6 +39,8 @@
Re: [PATCH] Add -Wshadow-local and -Wshadow-compatible-local.
OK. On Sun, Oct 30, 2016 at 3:53 PM, Mark Wielaardwrote: > On Tue, Oct 25, 2016 at 01:57:09AM +0200, Mark Wielaard wrote: >> I think the only thing "blocking" the patch from going in is that >> nobody made a decission on how the actual warning option should be >> named. I think the suggestion for -Wshadow=[global|local|compatible-local] >> is the right one. With -Wshadow being an alias for -Wshadow=global. >> But since there are already gcc versions out there that accept >> -Wshadow-local and -Wshadow-compatible-local (and you can find some >> configure scripts that already check for those options) it would be >> good to have those as (hidden) aliases. >> >> If people, some maintainer, agrees with that then we can do the .opt >> file hacking to make it so. > > Nobody objected, nor did anybody say this is a great idea. But I think > it is. So I just implemented the options this way. > > I made one small diagnostic change to fix a regression pointed out by > gcc/testsuite/gcc.dg/pr48062.c. It should still be possible to ignore > all shadow warnings with #pragma GCC diagnostic ignored "-Wshadow" when > -Wshadow was given, but not the new -Wshadow=(local|compatible-local). > So we now set warning_code = OPT_Wshadow when -Wshadow was given. > > The documentation and code comments were updated to refer to the > new -Wshadow=... variants. > > OK to commit the attached patch? > > Thanks, > > Mark
Re: [PATCH] enhance buffer overflow warnings (and c/53562)
Martin Sebor wrote: > Attached is an updated patch that adds checks for excessive sizes > and bounds (those in excess of SIZE_MAX / 2), and also enables > the same checking for strcat and strncat). This version also > fixes an issue with the interpretation of anti-ranges in the > first patch. The improvements exposed two bugs in the regression > tests. If I apply this patch to my local trunk - and try to bootstrap GCC, bootstrapping fails (on x86-64_gnu-linux) as following. I have not tried to figure out whether the warning (-Werror) makes sense or not. ../../gcc/emit-rtl.c: In function ‘rtx_note* make_note_raw(insn_note)’: ../../gcc/emit-rtl.c:3933:59: error: void* memset(void*, int, size_t) writing 8 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow] memset (_DATA (note), 0, sizeof (NOTE_DATA (note))); where NOTE_DATA is defined in rtl.h as /* Opaque data. */ #define NOTE_DATA(INSN) RTL_CHECKC1 (INSN, 3, NOTE) Cheers, Tobias
Re: [C++ PATCH] Fix -std=c++11 -std=gnu++11 option handling (PR c++/77948)
On Sat, Oct 29, 2016 at 1:09 PM, Jakub Jelinekwrote: > On Sat, Oct 29, 2016 at 12:41:43PM -0400, Jason Merrill wrote: >> On Sat, Oct 29, 2016 at 12:09 PM, Jakub Jelinek wrote: >> > Apparently the driver reorders the options, -std=* options come before >> > all -f* options, so both of the following patches work the same, except >> > when cc1plus is invoked by hand. The first (inlined) patch solves it >> > more in line how e.g. -fgnu-keywords and similar options are handled, so I >> > think >> > it is preferable (and I've successfully bootstrapped/regtested it on >> > x86_64-linux and i686-linux). The second (attached) patch handles it >> > by not clearing -fext-numeric-literals during option processing >> > for -std=* options, just in post option handling clears it for the >> > strict c++11+ modes if the option is not specified explicitly. >> > If the driver wouldn't reorder the options, it would handle better >> > cases like -fext-numeric-literals -std=c++11 to mean the same thing >> > as -std=c++11 -fext-numeric-literals etc., but due to the reordering >> > it isn't needed and for consistency we'd also need to change -fgnu-keywords >> > etc. >> >> I lean toward the second patch because it keeps the logic in a single >> place, but it looks like it doesn't currently handle C++98 >> differently. I don't think we need to change the handling of other >> options to match. > > The second patch has that > + /* Unless -f{,no-}ext-numeric-literals has been used explicitly, > +for -std=c++{11,14,1z} default to -fno-ext-numeric-literals. */ > + if (flag_iso && !global_options_set.x_flag_ext_numeric_literals) > + cpp_opts->ext_numeric_literals = 0; > in a block guarded with cxx_dialect >= cxx11. So for -std=c++98 it will > just not disable them Ah, OK then. Jason
Re: [PATCH] Don't use priority {cd}tors if not supported by a target (PR, gcov-profile/78086)
On 10/31/2016 11:07 AM, Rainer Orth wrote: > Hi Martin, > >> Using priority {cd}tors on a target that does not support that can cause >> failures (see the PR). >> Apart from that, I decided to use priority 100 for both gcov_init and >> gcov_exit functions as >> the reserved range includes priority 100. Moreover, I enhanced test-cases >> we have. > > just two nits: > > diff --git a/gcc/testsuite/g++.dg/gcov/pr16855-priority.C > b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C > new file mode 100644 > index 000..7e39565 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C > [...] > +static void __attribute__ ((constructor ((101 ctor_100 () > > Should be called ctor_101 now. Same for dtor_100 below. > > Rainer > Thanks for the note. Fixed in attached patch. Martin >From 0bd345d168bd18754bcb04a72e9717028d601a5f Mon Sep 17 00:00:00 2001 From: marxinDate: Wed, 26 Oct 2016 12:50:35 +0200 Subject: [PATCH] Don't use priority {cd}tors if not supported by a target (PR gcov-profile/78086) gcc/testsuite/ChangeLog: 2016-10-26 Martin Liska * g++.dg/gcov/pr16855.C: Clean up the test case. * g++.dg/gcov/pr16855-priority.C: New test. gcc/ChangeLog: 2016-10-26 Martin Liska * coverage.c (build_init_ctor): Don't use priority {cd}tors if not supported by a target. Set priority to 100 if possible. (build_gcov_exit_decl): Likewise. --- gcc/coverage.c | 13 +++-- gcc/testsuite/g++.dg/gcov/pr16855-priority.C | 79 gcc/testsuite/g++.dg/gcov/pr16855.C | 55 +-- 3 files changed, 116 insertions(+), 31 deletions(-) create mode 100644 gcc/testsuite/g++.dg/gcov/pr16855-priority.C diff --git a/gcc/coverage.c b/gcc/coverage.c index 8810710..4167e26 100644 --- a/gcc/coverage.c +++ b/gcc/coverage.c @@ -1056,8 +1056,10 @@ build_init_ctor (tree gcov_info_type) stmt = build_call_expr (init_fn, 1, stmt); append_to_statement_list (stmt, ); - /* Generate a constructor to run it (with priority 99). */ - cgraph_build_static_cdtor ('I', ctor, DEFAULT_INIT_PRIORITY - 1); + /* Generate a constructor to run it. */ + int priority = SUPPORTS_INIT_PRIORITY +? MAX_RESERVED_INIT_PRIORITY: DEFAULT_INIT_PRIORITY; + cgraph_build_static_cdtor ('I', ctor, priority); } /* Generate the destructor function to call __gcov_exit. */ @@ -1078,8 +1080,11 @@ build_gcov_exit_decl (void) tree stmt = build_call_expr (init_fn, 0); append_to_statement_list (stmt, ); - /* Generate a destructor to run it (with priority 99). */ - cgraph_build_static_cdtor ('D', dtor, MAX_RESERVED_INIT_PRIORITY - 1); + /* Generate a destructor to run it. */ + int priority = SUPPORTS_INIT_PRIORITY +? MAX_RESERVED_INIT_PRIORITY: DEFAULT_INIT_PRIORITY; + + cgraph_build_static_cdtor ('D', dtor, priority); } /* Create the gcov_info types and object. Generate the constructor diff --git a/gcc/testsuite/g++.dg/gcov/pr16855-priority.C b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C new file mode 100644 index 000..5c1fbc4 --- /dev/null +++ b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C @@ -0,0 +1,79 @@ +/* { dg-options "-fprofile-arcs -ftest-coverage" } */ +/* { dg-do run { target native } } */ +/* { dg-require-effective-target init_priority } */ + +#include +#include + +int a; + +void +foo () +{ + fprintf (stderr, "In foo\n"); + a = 123; /* count(1) */ +} + +using namespace std; +class Test +{ +public: + Test (void) { fprintf (stderr, "In Test::Test\n"); /* count(1) */ } + ~Test (void) { fprintf (stderr, "In Test::~Test\n"); /* count(1) */ } +} T1; + +void +uncalled (void) +{ + fprintf (stderr, "In uncalled\n"); /* count(#) */ +} + +int +main (void) +{ + atexit (); + fprintf (stderr, "In main\n"); /* count(1) */ + return 0; +} + +static void __attribute__ ((constructor)) ctor_default () +{ + fprintf (stderr, "in constructor(())\n"); /* count(1) */ +} + +static void __attribute__ ((constructor ((101 ctor_101 () +{ + fprintf (stderr, "in constructor((101))\n"); /* count(1) */ +} + +static void __attribute__ ((constructor ((500 ctor_500 () +{ + fprintf (stderr, "in constructor((500))\n"); /* count(1) */ +} + +static void __attribute__ ((constructor ((65535 ctor_65535 () +{ + fprintf (stderr, "in constructor((65535))\n"); /* count(1) */ +} + +static void __attribute__ ((destructor)) dtor_default () +{ + fprintf (stderr, "in destructor(())\n"); /* count(1) */ +} + +static void __attribute__ ((destructor ((101 dtor_101 () +{ + fprintf (stderr, "in destructor((101))\n"); /* count(1) */ +} + +static void __attribute__ ((destructor ((500 dtor_500 () +{ + fprintf (stderr, "in destructor((500))\n"); /* count(1) */ +} + +static void __attribute__ ((destructor ((65535 dtor_65535 () +{ + fprintf (stderr, "in destructor((65535))\n"); /* count(1) */ +} + +/* { dg-final { run-gcov branches { -b pr16855-priority.C } } } */ diff --git
Re: [PATCH][AArch64] Add function comments to some prologue/epilogue helpers
Ping. Thanks, Kyrill On 24/10/16 12:30, Kyrill Tkachov wrote: Ping. https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00839.html Thanks, Kyrill On 12/10/16 11:23, Kyrill Tkachov wrote: Hi all, I'm looking at the aarch64 prologue and epilogue generation code and I noticed many of the helper functions don't have function comments so it makes it harder than it has to to understand what's going on. This patch adds function comments to some of them. I hope I understood the functions correctly. Is this ok for trunk? Thanks, Kyrill 2016-10-12 Kyrylo Tkachov* config/aarch64/aarch64.c (aarch64_register_saved_on_entry): Add function comment. (aarch64_next_callee_save): Likewise. (aarch64_pushwb_single_reg): Likewise. (aarch64_gen_storewb_pair): Likewise. (aarch64_push_regs): Likewise. (aarch64_gen_loadwb_pair): Likewise. (aarch64_pop_regs): Likewise. (aarch64_gen_store_pair): Likewise. (aarch64_gen_load_pair): Likewise. (aarch64_save_callee_saves): Likewise. (aarch64_restore_callee_saves): Likewise.
Re: [PATCH][AArch64] Fix PR target/77822: Use tighter predicates for zero_extract patterns
Ping. Thanks, Kyrill On 24/10/16 14:12, Kyrill Tkachov wrote: On 24/10/16 12:29, Kyrill Tkachov wrote: Ping. https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01321.html I just noticed my original ChangeLog entry was truncated. It is 2016-10-04 Kyrylo TkachovPR target/77822 * config/aarch64/aarch64.md (*tb1): Use aarch64_simd_shift_imm_ predicate for operand 1. (, ANY_EXTRACT): Use tighter predicates on operands 2 and 3 to restrict them to an appropriate range and add FAIL check if the region they specify is out of range. Delete useless constraint strings. (*, ANY_EXTRACT): Add appropriate predicates on operands 2 and 3 to restrict their range and add pattern predicate. 2016-10-04 Kyrylo Tkachov PR target/77822 * g++.dg/torture/pr77822.C: New test. Kyrill On 17/10/16 17:15, Kyrill Tkachov wrote: Hi all, For the attached testcase the code ends up trying to extract bits outside the range of the normal register widths. The aarch64 patterns for ubfz and tbnz end up accepting such operands and emitting invalid assembly such as 'ubfx x18,x2,192,32' The solution is to add proper predicates and guards to the operands of the zero_extract operations that are going on. I had a look at all the other patterns in aarch64 that generate/use zero_extract and they all have guards on their operands in one form or another to avoid them accessing an area that is out of range. With this patch the testcase compiles and assembles fine. Bootstrapped and tested on aarch64-none-linux-gnu. Ok for trunk? Thanks, Kyrill 2016-10-17 Kyrylo Tkachov PR target/77822 * config/aarch64/aarch64.md (*tb1): Use aarch64_simd_shift_imm_ predicate for operand 1. (, ANY_EXTRACT): Use tighter predicates on operands 2 and 3 to restrict them to an appropriate range and add FAIL check if the region they specify is out of range. Delete useless constraint strings. (*, ANY_EXTRACT): Add appropriate predicates on operands 2 and 3 to restrict their range and add pattern predicate. 2016-10-17 Kyrylo Tkachov PR target/77822
Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable
On 24/10/16 17:15, Andrew Pinski wrote: On Mon, Oct 24, 2016 at 7:27 AM, Kyrill Tkachovwrote: Hi all, When storing a 64-bit immediate that has equal bottom and top halves we currently synthesize the repeating 32-bit pattern twice and perform a single X-store. With this patch we synthesize the 32-bit pattern once into a W register and store that twice using an STP. This reduces codesize bloat from synthesising the same constant multiple times at the expense of converting a store to a store-pair. It will only trigger if we can save two or more instructions, so it will only transform: mov x1, 49370 movkx1, 0xc0da, lsl 32 str x1, [x0] into: mov w1, 49370 stp w1, w1, [x0] when optimising for -Os, whereas it will always transform a 4-insn synthesis sequence into a two-insn sequence + STP (see comments in the patch). This patch triggers already but will trigger more with the store merging pass that I'm working on since that will generate more of these repeating 64-bit constants. This helps improve codegen on 456.hmmer where store merging can sometimes create very complex repeating constants and target-specific expand needs to break them down. Doing STP might be worse on ThunderX 1 than the mov/movk. Or this might cause an ICE with -mcpu=thunderx; I can't remember if the check for slow unaligned store pair word is with the pattern or not. I can't get it to ICE with -mcpu=thunderx. The restriction is just on the STP forming code in the sched-fusion hooks AFAIK. Basically the rule is 1) if 4 byte aligned, then it is better to do two str. 2) If 8 byte aligned, then doing stp is good 3) Otherwise it is better to do two str. Ok, then I'll make the function just emit two stores and depend on the sched-fusion machinery to fuse them into an STP when appropriate since that has the logic that takes thunderx into account. Thanks for the info. Kyrill Thanks, Andrew Bootstrapped and tested on aarch64-none-linux-gnu. Ok for trunk? Thanks, Kyrill 2016-10-24 Kyrylo Tkachov * config/aarch64/aarch64.md (mov): Call aarch64_split_dimode_const_store on DImode constant stores. * config/aarch64/aarch64-protos.h (aarch64_split_dimode_const_store): New prototype. * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New function. 2016-10-24 Kyrylo Tkachov * gcc.target/aarch64/store_repeating_constant_1.c: New test. * gcc.target/aarch64/store_repeating_constant_2.c: Likewise.
Re: [PATCH 3/5] [AARCH64] Fix part num and implement dependency
On Sun, Oct 23, 2016 at 03:37:22PM -0700, Andrew Pinski wrote: > On Tue, Nov 17, 2015 at 2:10 PM, Andrew Pinskiwrote: > > > > The way the current code was written assumes all cores have an unique part > > num which is not true. What they have is an unique pair of implementer and > > part num. This changes the code to look up that pair after the parsing > > of the two is done. > > > > Someone should test this on a big.little target too just to make sure > > the parsing is done correctly as I don't have access to one off hand. > > > > OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. > > > Here is an updated version of this patch after the other patches have > now gone in. > OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions > and -mcpu=native still works. OK. The logic in this file looks a bit upside down in places. I might have a go at refactoring it at some point in stage 3 or 4, but that's no reason to block this patch for now. Thanks, James > Index: config/aarch64/driver-aarch64.c > === > --- config/aarch64/driver-aarch64.c (revision 241437) > +++ config/aarch64/driver-aarch64.c (working copy) > @@ -169,7 +169,6 @@ host_detect_local_cpu (int argc, const c >bool tune = false; >bool cpu = false; >unsigned int i = 0; > - unsigned int core_idx = 0; >unsigned char imp = INVALID_IMP; >unsigned int cores[2] = { INVALID_CORE, INVALID_CORE }; >unsigned int n_cores = 0; > @@ -219,18 +218,13 @@ host_detect_local_cpu (int argc, const c >if (strstr (buf, "part") != NULL) > { > unsigned ccore = parse_field (buf); > - for (i = 0; aarch64_cpu_data[i].name != NULL; i++) > - if (ccore == aarch64_cpu_data[i].part_no > - && !contains_core_p (cores, ccore)) > - { > - if (n_cores == 2) > - goto not_found; > - > - cores[n_cores++] = ccore; > - core_idx = i; > - arch_id = aarch64_cpu_data[i].arch; > - break; > - } > + if (!contains_core_p (cores, ccore)) > + { > + if (n_cores == 2) > + goto not_found; > + > + cores[n_cores++] = ccore; > + } > continue; > } >if (!tune && !processed_exts && strstr (buf, "Features") != NULL) > @@ -276,11 +270,19 @@ host_detect_local_cpu (int argc, const c >if (n_cores == 0 || n_cores > 2 || imp == INVALID_IMP) > goto not_found; > > - if (arch && !arch_id) > -goto not_found; > - >if (arch) > { > + /* Search for one of the cores in the list. */ > + for (i = 0; aarch64_cpu_data[i].name != NULL; i++) > + if (aarch64_cpu_data[i].implementer_id == imp > + && contains_core_p (cores, aarch64_cpu_data[i].part_no)) > + { > + arch_id = aarch64_cpu_data[i].arch; > + break; > + } > + if (!arch_id) > + goto not_found; > + >struct aarch64_arch_driver_info* arch_info = get_arch_from_id > (arch_id); > >/* We got some arch indentifier that's not in aarch64-arches.def? */ > @@ -312,7 +314,15 @@ host_detect_local_cpu (int argc, const c >/* The simple, non-big.LITTLE case. */ >else > { > - if (aarch64_cpu_data[core_idx].implementer_id != imp) > + int core_idx = -1; > + for (i = 0; aarch64_cpu_data[i].name != NULL; i++) > + if (cores[0] == aarch64_cpu_data[i].part_no > + && aarch64_cpu_data[i].implementer_id == imp) > + { > + core_idx = i; > + break; > + } > + if (core_idx == -1) > goto not_found; > >res = concat ("-m", cpu ? "cpu" : "tune", "=",
Re: [PATCH] Fix host_size_t_cst_p predicate
On Mon, Oct 31, 2016 at 11:18 AM, Richard Sandifordwrote: > Richard Biener writes: >> On Mon, Oct 31, 2016 at 10:58 AM, Richard Sandiford >> wrote: >>> Richard Biener writes: On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška wrote: > On 10/31/2016 01:12 AM, Richard Sandiford wrote: >> Richard Biener writes: >>> On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška wrote: On 10/27/2016 03:35 PM, Richard Biener wrote: > On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška wrote: >> Running simple test-case w/o the proper header file causes ICE: >> strncmp ("a", "b", -1); >> >> 0xe74462 tree_to_uhwi(tree_node const*) >> ../../gcc/tree.c:7324 >> 0x90a23f host_size_t_cst_p >> ../../gcc/fold-const-call.c:63 >> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*, >> tree_node*, tree_node*) >> ../../gcc/fold-const-call.c:1512 >> 0x787b01 fold_builtin_3 >> ../../gcc/builtins.c:8385 >> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, >> int, bool) >> ../../gcc/builtins.c:8465 >> 0x9052b1 fold(tree_node*) >> ../../gcc/fold-const.c:11919 >> 0x6de2bb c_fully_fold_internal >> ../../gcc/c/c-fold.c:185 >> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*) >> ../../gcc/c/c-fold.c:90 >> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*) >> ../../gcc/c/c-typeck.c:10369 >> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*) >> ../../gcc/c/c-typeck.c:10414 >> 0x6cb578 c_parser_statement_after_labels >> ../../gcc/c/c-parser.c:5430 >> 0x6cd333 c_parser_compound_statement_nostart >> ../../gcc/c/c-parser.c:4944 >> 0x6cdbde c_parser_compound_statement >> ../../gcc/c/c-parser.c:4777 >> 0x6c93ac c_parser_declaration_or_fndef >> ../../gcc/c/c-parser.c:2176 >> 0x6d51ab c_parser_external_declaration >> ../../gcc/c/c-parser.c:1574 >> 0x6d5c09 c_parser_translation_unit >> ../../gcc/c/c-parser.c:1454 >> 0x6d5c09 c_parse_file() >> ../../gcc/c/c-parser.c:18173 >> 0x72ffd2 c_common_parse_file() >> ../../gcc/c-family/c-opts.c:1087 >> >> Following patch improves the host_size_t_cst_p predicate. >> >> Patch can bootstrap on ppc64le-redhat-linux and survives >> regression tests. >> >> Ready to be installed? > > I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * > CHAR_BIT test is now redundant. > > OTOH it was probably desired to allow -1 here? A little looking back > in time should tell. Ok, it started with r229922, where it was changed from: if (tree_fits_uhwi_p (len) && p1 && p2) { const int i = strncmp (p1, p2, tree_to_uhwi (len)); ... to current version: case CFN_BUILT_IN_STRNCMP: { bool const_size_p = host_size_t_cst_p (arg2, ); Thus I'm suggesting to change to back to it. Ready to be installed? >>> >>> Let's ask Richard. >> >> The idea with the: >> >> wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT >> >> test was to stop us attempting 64-bit size_t operations on ILP32 hosts. >> I think we still want that. > > OK, so is the consensus to add tree_fits_uhwi_p predicate to the current > wi::min_precision check, right? Not sure. If we have host_size_t_cst_p then we should have a corresponding size_t host_size_t (const_tree) and should use those in pairs. Not sure why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p. >>> >>> It's the other way around: something can satisfy tree_fits_uhwi_p >>> (i.e. fit within a uint64_t) but not fit within the host's size_t. >>> The kind of case I'm thinking of is: >>> >>> strncmp ("fi", "fo", (1L << 32) + 1) >>> >>> for an ILP32 host and LP64 target. There's a danger that by passing >>> the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd >>> truncate it to 1, giving the wrong result. >> >> Yes, but if it passes host_size_t_cst_p why does tree_to_uhwi ICE then? >> (unless we have a > 64bit host size_t). > > Because in Martin's test case the length has a signed type. > tree_to_uhwi then treats the argument as -1 to infinite precision > rather than ~(size_t) 0 to infinite precision. Indeed. So the
Re: [PATCH] Fix host_size_t_cst_p predicate
Richard Bienerwrites: > On Mon, Oct 31, 2016 at 10:58 AM, Richard Sandiford > wrote: >> Richard Biener writes: >>> On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška wrote: On 10/31/2016 01:12 AM, Richard Sandiford wrote: > Richard Biener writes: >> On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška wrote: >>> On 10/27/2016 03:35 PM, Richard Biener wrote: On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška wrote: > Running simple test-case w/o the proper header file causes ICE: > strncmp ("a", "b", -1); > > 0xe74462 tree_to_uhwi(tree_node const*) > ../../gcc/tree.c:7324 > 0x90a23f host_size_t_cst_p > ../../gcc/fold-const-call.c:63 > 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*, > tree_node*, tree_node*) > ../../gcc/fold-const-call.c:1512 > 0x787b01 fold_builtin_3 > ../../gcc/builtins.c:8385 > 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, > int, bool) > ../../gcc/builtins.c:8465 > 0x9052b1 fold(tree_node*) > ../../gcc/fold-const.c:11919 > 0x6de2bb c_fully_fold_internal > ../../gcc/c/c-fold.c:185 > 0x6e1f6b c_fully_fold(tree_node*, bool, bool*) > ../../gcc/c/c-fold.c:90 > 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*) > ../../gcc/c/c-typeck.c:10369 > 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*) > ../../gcc/c/c-typeck.c:10414 > 0x6cb578 c_parser_statement_after_labels > ../../gcc/c/c-parser.c:5430 > 0x6cd333 c_parser_compound_statement_nostart > ../../gcc/c/c-parser.c:4944 > 0x6cdbde c_parser_compound_statement > ../../gcc/c/c-parser.c:4777 > 0x6c93ac c_parser_declaration_or_fndef > ../../gcc/c/c-parser.c:2176 > 0x6d51ab c_parser_external_declaration > ../../gcc/c/c-parser.c:1574 > 0x6d5c09 c_parser_translation_unit > ../../gcc/c/c-parser.c:1454 > 0x6d5c09 c_parse_file() > ../../gcc/c/c-parser.c:18173 > 0x72ffd2 c_common_parse_file() > ../../gcc/c-family/c-opts.c:1087 > > Following patch improves the host_size_t_cst_p predicate. > > Patch can bootstrap on ppc64le-redhat-linux and survives > regression tests. > > Ready to be installed? I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT test is now redundant. OTOH it was probably desired to allow -1 here? A little looking back in time should tell. >>> >>> Ok, it started with r229922, where it was changed from: >>> >>> if (tree_fits_uhwi_p (len) && p1 && p2) >>> { >>> const int i = strncmp (p1, p2, tree_to_uhwi (len)); >>> ... >>> >>> to current version: >>> >>> case CFN_BUILT_IN_STRNCMP: >>> { >>> bool const_size_p = host_size_t_cst_p (arg2, ); >>> >>> Thus I'm suggesting to change to back to it. >>> >>> Ready to be installed? >> >> Let's ask Richard. > > The idea with the: > > wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT > > test was to stop us attempting 64-bit size_t operations on ILP32 hosts. > I think we still want that. OK, so is the consensus to add tree_fits_uhwi_p predicate to the current wi::min_precision check, right? >>> >>> Not sure. If we have host_size_t_cst_p then we should have a corresponding >>> size_t host_size_t (const_tree) and should use those in pairs. Not sure >>> why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p. >> >> It's the other way around: something can satisfy tree_fits_uhwi_p >> (i.e. fit within a uint64_t) but not fit within the host's size_t. >> The kind of case I'm thinking of is: >> >> strncmp ("fi", "fo", (1L << 32) + 1) >> >> for an ILP32 host and LP64 target. There's a danger that by passing >> the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd >> truncate it to 1, giving the wrong result. > > Yes, but if it passes host_size_t_cst_p why does tree_to_uhwi ICE then? > (unless we have a > 64bit host size_t). Because in Martin's test case the length has a signed type. tree_to_uhwi then treats the argument as -1 to infinite precision rather than ~(size_t) 0 to infinite precision. Thanks, Richard
Re: [PATCH] Fix host_size_t_cst_p predicate
On Mon, Oct 31, 2016 at 10:58 AM, Richard Sandifordwrote: > Richard Biener writes: >> On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška wrote: >>> On 10/31/2016 01:12 AM, Richard Sandiford wrote: Richard Biener writes: > On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška wrote: >> On 10/27/2016 03:35 PM, Richard Biener wrote: >>> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška wrote: Running simple test-case w/o the proper header file causes ICE: strncmp ("a", "b", -1); 0xe74462 tree_to_uhwi(tree_node const*) ../../gcc/tree.c:7324 0x90a23f host_size_t_cst_p ../../gcc/fold-const-call.c:63 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*, tree_node*, tree_node*) ../../gcc/fold-const-call.c:1512 0x787b01 fold_builtin_3 ../../gcc/builtins.c:8385 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, int, bool) ../../gcc/builtins.c:8465 0x9052b1 fold(tree_node*) ../../gcc/fold-const.c:11919 0x6de2bb c_fully_fold_internal ../../gcc/c/c-fold.c:185 0x6e1f6b c_fully_fold(tree_node*, bool, bool*) ../../gcc/c/c-fold.c:90 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*) ../../gcc/c/c-typeck.c:10369 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*) ../../gcc/c/c-typeck.c:10414 0x6cb578 c_parser_statement_after_labels ../../gcc/c/c-parser.c:5430 0x6cd333 c_parser_compound_statement_nostart ../../gcc/c/c-parser.c:4944 0x6cdbde c_parser_compound_statement ../../gcc/c/c-parser.c:4777 0x6c93ac c_parser_declaration_or_fndef ../../gcc/c/c-parser.c:2176 0x6d51ab c_parser_external_declaration ../../gcc/c/c-parser.c:1574 0x6d5c09 c_parser_translation_unit ../../gcc/c/c-parser.c:1454 0x6d5c09 c_parse_file() ../../gcc/c/c-parser.c:18173 0x72ffd2 c_common_parse_file() ../../gcc/c-family/c-opts.c:1087 Following patch improves the host_size_t_cst_p predicate. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? >>> >>> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * >>> CHAR_BIT test is now redundant. >>> >>> OTOH it was probably desired to allow -1 here? A little looking back >>> in time should tell. >> >> Ok, it started with r229922, where it was changed from: >> >> if (tree_fits_uhwi_p (len) && p1 && p2) >> { >> const int i = strncmp (p1, p2, tree_to_uhwi (len)); >> ... >> >> to current version: >> >> case CFN_BUILT_IN_STRNCMP: >> { >> bool const_size_p = host_size_t_cst_p (arg2, ); >> >> Thus I'm suggesting to change to back to it. >> >> Ready to be installed? > > Let's ask Richard. The idea with the: wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT test was to stop us attempting 64-bit size_t operations on ILP32 hosts. I think we still want that. >>> >>> OK, so is the consensus to add tree_fits_uhwi_p predicate to the current >>> wi::min_precision check, right? >> >> Not sure. If we have host_size_t_cst_p then we should have a corresponding >> size_t host_size_t (const_tree) and should use those in pairs. Not sure >> why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p. > > It's the other way around: something can satisfy tree_fits_uhwi_p > (i.e. fit within a uint64_t) but not fit within the host's size_t. > The kind of case I'm thinking of is: > > strncmp ("fi", "fo", (1L << 32) + 1) > > for an ILP32 host and LP64 target. There's a danger that by passing > the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd > truncate it to 1, giving the wrong result. Yes, but if it passes host_size_t_cst_p why does tree_to_uhwi ICE then? (unless we have a > 64bit host size_t). Richard. > > Thanks, > Richard
Re: [PATCH] Don't use priority {cd}tors if not supported by a target (PR, gcov-profile/78086)
Hi Martin, > Using priority {cd}tors on a target that does not support that can cause > failures (see the PR). > Apart from that, I decided to use priority 100 for both gcov_init and > gcov_exit functions as > the reserved range includes priority 100. Moreover, I enhanced test-cases > we have. just two nits: diff --git a/gcc/testsuite/g++.dg/gcov/pr16855-priority.C b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C new file mode 100644 index 000..7e39565 --- /dev/null +++ b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C [...] +static void __attribute__ ((constructor ((101 ctor_100 () Should be called ctor_101 now. Same for dtor_100 below. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Fix computation of register limit for -fsched-pressure
Yeah, In the test it's reasonable, just used one extra register. I'm currently running some benchmarks to make sure those are OK too. Regards, Tamar From: Bin.ChengSent: Friday, October 28, 2016 12:38:28 PM To: Tamar Christina Cc: Pat Haugen; Maxim Kuvyrkov; GCC Patches; nd Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure On Fri, Oct 28, 2016 at 12:27 PM, Tamar Christina wrote: > Looking at it again, > > it seems to be that the testcase should be adjusted. > There's no actual spilling. It just uses more registers than before due to > the scheduling. Sorry I didn't look into the test, but using more registers sounds like a regression too? At least we need to make sure it's reasonable. Thanks, bin > > I will update the testcase. > > Thanks. > > > From: gcc-patches-ow...@gcc.gnu.org on behalf > of Tamar Christina > Sent: Friday, October 28, 2016 10:53:20 AM > To: Pat Haugen; Maxim Kuvyrkov > Cc: GCC Patches; nd > Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure > > Forwarding to list as well. > > From: Tamar Christina > Sent: Friday, October 28, 2016 10:52:17 AM > To: Pat Haugen; Maxim Kuvyrkov > Cc: GCC Patches > Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure > > Hi Pat, > > The commit seems to be causing some odd stack spills on aarch64. > > I've created a new ticket https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78142 > > Thanks, > Tamar > > > From: gcc-patches-ow...@gcc.gnu.org on behalf > of Pat Haugen > Sent: Tuesday, October 18, 2016 4:07:55 PM > To: Maxim Kuvyrkov > Cc: GCC Patches > Subject: Re: [PATCH] Fix computation of register limit for -fsched-pressure > > On 10/18/2016 05:31 AM, Maxim Kuvyrkov wrote: >>> > I see your point and agree that current code isn't optimal. However, I >>> > don't think your patch is accurate either. Consider >>> > https://gcc.gnu.org/onlinedocs/gccint/Register-Basics.html and let's >>> > assume that FIXED_REGISTERS in class CL is set for a third of the >>> > registers, and CALL_USED_REGISTERS is set to "1" for another third of >>> > registers. So we have a third available for zero-cost allocation >>> > (CALL_USED_REGISTERS-FIXED_REGISTERS), a third available for spill-cost >>> > allocation (ALL_REGISTERS-CALL_USED_REGISTERS) and a third non-available >>> > (FIXED_REGISTERS). >>> > >>> > For a non-loop-single-basic-block function we should be targeting only >>> > the third of register available at zero-cost -- correct? > Yes. > > This is what is done by the current code, but, apparently, by accident. It > seems that the right register count can be obtained with: >>> > >>> > for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i) >>> > - if (call_used_regs[ira_class_hard_regs[cl][i]]) >>> > -++call_used_regs_num[cl]; >>> > + if (!call_used_regs[ira_class_hard_regs[cl][i]] >>> > + || fixed_regs[ira_class_hard_regs[cl][i]]) >>> > +++call_saved_regs_num[cl]; >>> > >>> > Does this look correct to you? >> Thinking some more, it seems like fixed_regs should not be available to the >> scheduler no matter what. Therefore, the number of fixed registers should >> be subtracted from ira_class_hard_regs_num[cl] without any scaling >> (entry_freq / bb_freq). > > Ahh, yes, I forgot about FIXED_REGISTERS being included in > CALL_USED_REGISTERS. I agree they should be totally removed from the register > limit calculation. I'll rework the patch. > > Thanks, > Pat >
Re: [PATCH] Fix host_size_t_cst_p predicate
Richard Bienerwrites: > On Mon, Oct 31, 2016 at 10:10 AM, Martin Liška wrote: >> On 10/31/2016 01:12 AM, Richard Sandiford wrote: >>> Richard Biener writes: On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška wrote: > On 10/27/2016 03:35 PM, Richard Biener wrote: >> On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška wrote: >>> Running simple test-case w/o the proper header file causes ICE: >>> strncmp ("a", "b", -1); >>> >>> 0xe74462 tree_to_uhwi(tree_node const*) >>> ../../gcc/tree.c:7324 >>> 0x90a23f host_size_t_cst_p >>> ../../gcc/fold-const-call.c:63 >>> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*, >>> tree_node*, tree_node*) >>> ../../gcc/fold-const-call.c:1512 >>> 0x787b01 fold_builtin_3 >>> ../../gcc/builtins.c:8385 >>> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, int, >>> bool) >>> ../../gcc/builtins.c:8465 >>> 0x9052b1 fold(tree_node*) >>> ../../gcc/fold-const.c:11919 >>> 0x6de2bb c_fully_fold_internal >>> ../../gcc/c/c-fold.c:185 >>> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*) >>> ../../gcc/c/c-fold.c:90 >>> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*) >>> ../../gcc/c/c-typeck.c:10369 >>> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*) >>> ../../gcc/c/c-typeck.c:10414 >>> 0x6cb578 c_parser_statement_after_labels >>> ../../gcc/c/c-parser.c:5430 >>> 0x6cd333 c_parser_compound_statement_nostart >>> ../../gcc/c/c-parser.c:4944 >>> 0x6cdbde c_parser_compound_statement >>> ../../gcc/c/c-parser.c:4777 >>> 0x6c93ac c_parser_declaration_or_fndef >>> ../../gcc/c/c-parser.c:2176 >>> 0x6d51ab c_parser_external_declaration >>> ../../gcc/c/c-parser.c:1574 >>> 0x6d5c09 c_parser_translation_unit >>> ../../gcc/c/c-parser.c:1454 >>> 0x6d5c09 c_parse_file() >>> ../../gcc/c/c-parser.c:18173 >>> 0x72ffd2 c_common_parse_file() >>> ../../gcc/c-family/c-opts.c:1087 >>> >>> Following patch improves the host_size_t_cst_p predicate. >>> >>> Patch can bootstrap on ppc64le-redhat-linux and survives >>> regression tests. >>> >>> Ready to be installed? >> >> I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * >> CHAR_BIT test is now redundant. >> >> OTOH it was probably desired to allow -1 here? A little looking back >> in time should tell. > > Ok, it started with r229922, where it was changed from: > > if (tree_fits_uhwi_p (len) && p1 && p2) > { > const int i = strncmp (p1, p2, tree_to_uhwi (len)); > ... > > to current version: > > case CFN_BUILT_IN_STRNCMP: > { > bool const_size_p = host_size_t_cst_p (arg2, ); > > Thus I'm suggesting to change to back to it. > > Ready to be installed? Let's ask Richard. >>> >>> The idea with the: >>> >>> wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT >>> >>> test was to stop us attempting 64-bit size_t operations on ILP32 hosts. >>> I think we still want that. >> >> OK, so is the consensus to add tree_fits_uhwi_p predicate to the current >> wi::min_precision check, right? > > Not sure. If we have host_size_t_cst_p then we should have a corresponding > size_t host_size_t (const_tree) and should use those in pairs. Not sure > why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p. It's the other way around: something can satisfy tree_fits_uhwi_p (i.e. fit within a uint64_t) but not fit within the host's size_t. The kind of case I'm thinking of is: strncmp ("fi", "fo", (1L << 32) + 1) for an ILP32 host and LP64 target. There's a danger that by passing the uint64_t value (1L << 32) + 1 to the host's strncmp that we'd truncate it to 1, giving the wrong result. Thanks, Richard
[PATCH] libiberty: Fix memory leak in ada_demangle when symbol cannot be demangled.
When a symbol cannot be demangled in ada_demangle a new demangled VEC will be allocated without deleting the demangled VEC already in use. Running testsuite/test-demangle under valgrind will show the leak for this entry in testsuite/demangle-expected: # Elaborated flag (not demangled) --format=gnat x_E 11 bytes in 1 blocks are definitely lost in loss record 1 of 1 at 0x4C27BE3: malloc (vg_replace_malloc.c:299) by 0x413FE7: xmalloc (xmalloc.c:148) by 0x4025EC: ada_demangle (cplus-dem.c:930) by 0x402C59: cplus_demangle (cplus-dem.c:892) by 0x400FEC: main (test-demangle.c:317) libiberty/ChangeLog: * cplus-dem.c (ada_demangle): Initialize demangled to NULL and XDELETEVEC demangled when unknown. --- libiberty/ChangeLog | 5 + libiberty/cplus-dem.c | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog index 5934bc1..73cbcc4 100644 --- a/libiberty/ChangeLog +++ b/libiberty/ChangeLog @@ -1,3 +1,8 @@ +2016-10-31 Mark Wielaard+ + * cplus-dem.c (ada_demangle): Initialize demangled to NULL and + XDELETEVEC demangled when unknown. + 2016-09-19 Andrew Stubbs * pex-win32.c (argv_to_cmdline): Quote zero-length parameters. diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c index f954050..7f63397 100644 --- a/libiberty/cplus-dem.c +++ b/libiberty/cplus-dem.c @@ -911,7 +911,7 @@ ada_demangle (const char *mangled, int option ATTRIBUTE_UNUSED) int len0; const char* p; char *d; - char *demangled; + char *demangled = NULL; /* Discard leading _ada_, which is used for library level subprograms. */ if (strncmp (mangled, "_ada_", 5) == 0) @@ -1156,6 +1156,7 @@ ada_demangle (const char *mangled, int option ATTRIBUTE_UNUSED) return demangled; unknown: + XDELETEVEC (demangled); len0 = strlen (mangled); demangled = XNEWVEC (char, len0 + 3); -- 1.8.3.1
Re: [PATCH] Fix and testcases for pr72747
On Fri, Oct 28, 2016 at 4:06 PM, Will Schmidtwrote: > On Fri, 2016-10-28 at 08:31 -0500, Will Schmidt wrote: >> On Fri, 2016-10-28 at 10:38 +0200, Richard Biener wrote: >> > On Thu, Oct 27, 2016 at 5:37 PM, Will Schmidt >> > wrote: >> > > Hi, >> > > >> > > Per PR72747, A statement such as "v = vec_splats (1);" correctly >> > > initializes a vector. However, a statement such as "v[1] = v[0] = >> > > vec_splats (1);" initializes both v[1] and v[0] to random garbage. >> > > >> > > It has been determined that this is occurring because we did not emit >> > > the actual initialization statement before our final exit from >> > > gimplify_init_constructor, at which time we lose the expression when we >> > > assign *expr_p to either NULL or object. This problem affected both >> > > constant >> > > and non-constant initializers. Corrected this by moving the logic to >> > > emit the statement up earlier within the if/else logic. >> > > >> > > Bootstrapped and make check ran without regressions on >> > > powerpc64le-unknown-linux-gnu. >> > > >> > > OK for trunk? >> > >> > Ok. >> > >> > RIchard. >> >> Committed revision 241647. >> Thanks. > > Oops, forgot to ask. After some burn-in time, is this OK to backport to > the 5 and 6 branches? Yes. Richard. > The bug was first noticed in GCC 5. > > Thanks, > -Will > >> >> >> > >> > > Thanks, >> > > -Will >> > > >> > > gcc: >> > > 2016-10-26 Will Schmidt >> > > >> > > PR middle-end/72747 >> > > * gimplify.c (gimplify_init_constructor): Move emit of >> > > constructor >> > > assignment to earlier in the if/else logic. >> > > >> > > testsuite: >> > > 2016-10-26 Will Schmidt >> > > >> > > PR middle-end/72747 >> > > * c-c++-common/pr72747-1.c: New test. >> > > * c-c++-common/pr72747-2.c: Likewise. >> > > >> > > Index: gcc/gimplify.c >> > > === >> > > --- gcc/gimplify.c (revision 241600) >> > > +++ gcc/gimplify.c (working copy) >> > > @@ -4730,24 +4730,23 @@ >> > > >> > >if (ret == GS_ERROR) >> > > return GS_ERROR; >> > > - else if (want_value) >> > > + /* If we have gimplified both sides of the initializer but have >> > > + not emitted an assignment, do so now. */ >> > > + if (*expr_p) >> > > { >> > > + tree lhs = TREE_OPERAND (*expr_p, 0); >> > > + tree rhs = TREE_OPERAND (*expr_p, 1); >> > > + gassign *init = gimple_build_assign (lhs, rhs); >> > > + gimplify_seq_add_stmt (pre_p, init); >> > > +} >> > > + if (want_value) >> > > +{ >> > >*expr_p = object; >> > >return GS_OK; >> > > } >> > >else >> > > { >> > > - /* If we have gimplified both sides of the initializer but have >> > > -not emitted an assignment, do so now. */ >> > > - if (*expr_p) >> > > - { >> > > - tree lhs = TREE_OPERAND (*expr_p, 0); >> > > - tree rhs = TREE_OPERAND (*expr_p, 1); >> > > - gassign *init = gimple_build_assign (lhs, rhs); >> > > - gimplify_seq_add_stmt (pre_p, init); >> > > - *expr_p = NULL; >> > > - } >> > > - >> > > + *expr_p = NULL; >> > >return GS_ALL_DONE; >> > > } >> > > } >> > > Index: gcc/testsuite/c-c++-common/pr72747-1.c >> > > === >> > > --- gcc/testsuite/c-c++-common/pr72747-1.c (revision 0) >> > > +++ gcc/testsuite/c-c++-common/pr72747-1.c (working copy) >> > > @@ -0,0 +1,16 @@ >> > > +/* { dg-do compile } */ >> > > +/* { dg-require-effective-target powerpc_altivec_ok } */ >> > > +/* { dg-options "-maltivec -fdump-tree-gimple" } */ >> > > + >> > > +/* PR 72747: Test that cascaded definition is happening for constant >> > > vectors. */ >> > > + >> > > +#include >> > > + >> > > +int main (int argc, char *argv[]) >> > > +{ >> > > + __vector int v1,v2; >> > > + v1 = v2 = vec_splats ((int) 42); >> > > + return 0; >> > > +} >> > > +/* { dg-final { scan-tree-dump-times " v2 = { 42, 42, 42, 42 }" 1 >> > > "gimple" } } */ >> > > + >> > > Index: gcc/testsuite/c-c++-common/pr72747-2.c >> > > === >> > > --- gcc/testsuite/c-c++-common/pr72747-2.c (revision 0) >> > > +++ gcc/testsuite/c-c++-common/pr72747-2.c (working copy) >> > > @@ -0,0 +1,18 @@ >> > > +/* { dg-do compile } */ >> > > +/* { dg-require-effective-target powerpc_altivec_ok } */ >> > > +/* { dg-options "-c -maltivec -fdump-tree-gimple" } */ >> > > + >> > > +/* PR 72747: test that cascaded definition is happening for non >> > > constants. */ >> > > + >> > > +void foo () >> > > +{ >> > > + extern int i; >> > > + __vector int v,w; >> > > +v = w = (vector int) { i }; >> > > +} >> > > + >> > > +int main (int argc, char *argv[]) >> > > +{ >> > > + return 0;
Re: [PATCH] Fix host_size_t_cst_p predicate
On Mon, Oct 31, 2016 at 10:10 AM, Martin Liškawrote: > On 10/31/2016 01:12 AM, Richard Sandiford wrote: >> Richard Biener writes: >>> On Thu, Oct 27, 2016 at 5:06 PM, Martin Liška wrote: On 10/27/2016 03:35 PM, Richard Biener wrote: > On Thu, Oct 27, 2016 at 9:41 AM, Martin Liška wrote: >> Running simple test-case w/o the proper header file causes ICE: >> strncmp ("a", "b", -1); >> >> 0xe74462 tree_to_uhwi(tree_node const*) >> ../../gcc/tree.c:7324 >> 0x90a23f host_size_t_cst_p >> ../../gcc/fold-const-call.c:63 >> 0x90a23f fold_const_call(combined_fn, tree_node*, tree_node*, >> tree_node*, tree_node*) >> ../../gcc/fold-const-call.c:1512 >> 0x787b01 fold_builtin_3 >> ../../gcc/builtins.c:8385 >> 0x787b01 fold_builtin_n(unsigned int, tree_node*, tree_node**, int, bool) >> ../../gcc/builtins.c:8465 >> 0x9052b1 fold(tree_node*) >> ../../gcc/fold-const.c:11919 >> 0x6de2bb c_fully_fold_internal >> ../../gcc/c/c-fold.c:185 >> 0x6e1f6b c_fully_fold(tree_node*, bool, bool*) >> ../../gcc/c/c-fold.c:90 >> 0x67cbbf c_process_expr_stmt(unsigned int, tree_node*) >> ../../gcc/c/c-typeck.c:10369 >> 0x67cfbd c_finish_expr_stmt(unsigned int, tree_node*) >> ../../gcc/c/c-typeck.c:10414 >> 0x6cb578 c_parser_statement_after_labels >> ../../gcc/c/c-parser.c:5430 >> 0x6cd333 c_parser_compound_statement_nostart >> ../../gcc/c/c-parser.c:4944 >> 0x6cdbde c_parser_compound_statement >> ../../gcc/c/c-parser.c:4777 >> 0x6c93ac c_parser_declaration_or_fndef >> ../../gcc/c/c-parser.c:2176 >> 0x6d51ab c_parser_external_declaration >> ../../gcc/c/c-parser.c:1574 >> 0x6d5c09 c_parser_translation_unit >> ../../gcc/c/c-parser.c:1454 >> 0x6d5c09 c_parse_file() >> ../../gcc/c/c-parser.c:18173 >> 0x72ffd2 c_common_parse_file() >> ../../gcc/c-family/c-opts.c:1087 >> >> Following patch improves the host_size_t_cst_p predicate. >> >> Patch can bootstrap on ppc64le-redhat-linux and survives regression >> tests. >> >> Ready to be installed? > > I believe the wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * > CHAR_BIT test is now redundant. > > OTOH it was probably desired to allow -1 here? A little looking back > in time should tell. Ok, it started with r229922, where it was changed from: if (tree_fits_uhwi_p (len) && p1 && p2) { const int i = strncmp (p1, p2, tree_to_uhwi (len)); ... to current version: case CFN_BUILT_IN_STRNCMP: { bool const_size_p = host_size_t_cst_p (arg2, ); Thus I'm suggesting to change to back to it. Ready to be installed? >>> >>> Let's ask Richard. >> >> The idea with the: >> >> wi::min_precision (t, UNSIGNED) <= sizeof (size_t) * CHAR_BIT >> >> test was to stop us attempting 64-bit size_t operations on ILP32 hosts. >> I think we still want that. > > OK, so is the consensus to add tree_fits_uhwi_p predicate to the current > wi::min_precision check, right? Not sure. If we have host_size_t_cst_p then we should have a corresponding size_t host_size_t (const_tree) and should use those in pairs. Not sure why we have sth satisfying host_size_t_cst_p but not tree_fits_uhwi_p. Is that wi::min_precision fault? The way it is documented suggests that it should be equal to tree_fits_uwhi_p ... Richard. > Thanks, > Martin > >> >> Thanks, >> Richard >> >
Re: [committed] Fix bootstrap with ada x86_64-linux and -fcompare-debug failure on ppc64le-linux (PR target/78148)
On 29/10/16 16:57, Jakub Jelinek wrote: On Sat, Oct 29, 2016 at 10:07:22AM +0200, Andreas Schwab wrote: That breaks Ada: a-teioed.adb: In function 'Ada.Text_Io.Editing.Format_Number': a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb:127:4: error: alignment of array elements is greater than element size a-teioed.adb: In function 'Ada.Text_Io.Editing.To_Picture': a-teioed.adb:2724:4: error: alignment of array elements is greater than element size a-teioed.adb: In function 'Ada.Text_Io.Editing.Valid': a-teioed.adb:2751:4: error: alignment of array elements is greater than element size The bug is the same why PR78148 testcase fails with -fcompare-debug, build_nonstandard_integer_type return values are shared, all such calls with the same arguments return the same types, so using SET_TYPE_ALIGN on it is wrong. As it is a bootstrap failure on primary target and the fix is obvious, I've committed it to trunk after bootstrapping/regtesting it on x86_64-linux (where it fixed ada bootstrap) and i686-linux. Thanks for fixing this Jakub! Sorry for the breakage. Kyrill 2016-10-29 Jakub JelinekPR target/78148 * gimple-ssa-store-merging.c (imm_store_chain_info::output_merged_store): Use build_aligned_type instead of SET_TYPE_ALIGN on shared integral type. * gcc.dg/pr78148.c: New test. --- gcc/gimple-ssa-store-merging.c.jj 2016-10-29 14:39:24.0 +0200 +++ gcc/gimple-ssa-store-merging.c 2016-10-29 15:09:34.650749175 +0200 @@ -1130,7 +1130,7 @@ imm_store_chain_info::output_merged_stor location_t loc = get_location_for_stmts (split_store->orig_stmts); tree int_type = build_nonstandard_integer_type (try_size, UNSIGNED); - SET_TYPE_ALIGN (int_type, align); + int_type = build_aligned_type (int_type, align); tree addr = build_fold_addr_expr (base); tree dest = fold_build2 (MEM_REF, int_type, addr, build_int_cst (offset_type, try_pos)); --- gcc/testsuite/gcc.dg/pr78148.c.jj 2016-10-29 15:10:05.432358626 +0200 +++ gcc/testsuite/gcc.dg/pr78148.c 2016-10-29 15:09:09.0 +0200 @@ -0,0 +1,31 @@ +/* PR target/78148 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fcompare-debug" } */ + +struct A { int a, b; }; +struct B { char c, d; }; +extern void bar (struct A, struct B); +struct C { char e, f; } a; +struct D +{ + int g; + struct C h[4]; +}; +struct D *e; + +struct D +foo (void) +{ + int b; + struct B c; + struct A d; + d.b = c.c = c.d = 0; + bar (d, c); +} + +void +baz () +{ + e->h[0].e = e->h[0].f = 0; + foo (); +} Jakub
Re: [PATCH VECT]Support operand swapping for cond_expr in vect_slp
On Fri, Oct 28, 2016 at 1:17 PM, Richard Bienerwrote: > On Thu, Oct 27, 2016 at 3:37 PM, Bin Cheng wrote: >> Hi, >> During analysis, vect_slp checks if statements of a group are isomorphic to >> each other, specifically, all statements have to be isomorphic to the first >> one. Apparently, operands of commutative operators (PLUS_EXPR/MINUS_EXPR >> etc.) could be swapped when checking isomorphic property. Though vect_slp >> has basic support for such commutative operators, the related code is not >> properly implemented: >> 1) vect_build_slp_tree mixes operand swapping in the current slp tree node >> and operand swapping in its child slp tree nodes. >> 2) Operand swapping in the current slp tree is incorrect when >> vect_get_and_check_slp_defs has already committed to a fixed operand order. >> In addition, operand swapping for COND_EXPR is implemented in a wrong way >> (place) because: >> 3) vect_get_and_check_slp_defs swaps operands for COND_EXPR by changing >> comparison code after vect_build_slp_tree_1 checks the code consistency for >> the statement group. >> 4) vect_build_slp_tree_1 should support operand swapping for COND_EXPR >> while it doesn't. >> >> This patch addresses above issues. It supports COND_EXPR by recording >> swapping information in vect_build_slp_tree_1 and applies the swap in >> vect_get_check_slp_defs. It supports two types swapping: swapping and >> inverting. The patch also does refactoring so that operand swapping in >> child slp tree node and the current slp tree node are differentiated. With >> this patch, failures (gcc.dg/vect/slp-cond-3.c) revealed by previous >> COND_EXPR match.pd patches are resolved. >> Bootstrap and test on x86_64 and AArch64. Is it OK? > > Ok, but please re-instantiate the early-out here: Thanks for reviewing. > > @@ -905,18 +960,10 @@ vect_build_slp_tree (vec_info *vinfo, >slp_oprnd_info oprnd_info; >FOR_EACH_VEC_ELT (stmts, i, stmt) > { > - switch (vect_get_and_check_slp_defs (vinfo, stmt, i, _info)) > - { > - case 0: > - break; > - case -1: > - matches[0] = false; > - vect_free_oprnd_info (oprnds_info); > - return NULL; > > > > you seem to needlessly continue checking other DEFs if one returns -1. Yes, updated patch committed. Thanks, bin diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c index 5a611e4..62f060c 100644 --- a/gcc/tree-vect-slp.c +++ b/gcc/tree-vect-slp.c @@ -207,14 +207,20 @@ vect_get_place_in_interleaving_chain (gimple *stmt, gimple *first_stmt) /* Get the defs for the rhs of STMT (collect them in OPRNDS_INFO), check that they are of a valid type and that they match the defs of the first stmt of - the SLP group (stored in OPRNDS_INFO). If there was a fatal error - return -1, if the error could be corrected by swapping operands of the - operation return 1, if everything is ok return 0. */ + the SLP group (stored in OPRNDS_INFO). This function tries to match stmts + by swapping operands of STMT when possible. Non-zero *SWAP indicates swap + is required for cond_expr stmts. Specifically, *SWAP is 1 if STMT is cond + and operands of comparison need to be swapped; *SWAP is 2 if STMT is cond + and code of comparison needs to be inverted. If there is any operand swap + in this function, *SWAP is set to non-zero value. + If there was a fatal error return -1; if the error could be corrected by + swapping operands of father node of this one, return 1; if everything is + ok return 0. */ -static int -vect_get_and_check_slp_defs (vec_info *vinfo, +static int +vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char *swap, gimple *stmt, unsigned stmt_num, - vec *oprnds_info) +vec *oprnds_info) { tree oprnd; unsigned int i, number_of_oprnds; @@ -237,11 +243,12 @@ vect_get_and_check_slp_defs (vec_info *vinfo, { enum tree_code code = gimple_assign_rhs_code (stmt); number_of_oprnds = gimple_num_ops (stmt) - 1; + /* Swap can only be done for cond_expr if asked to, otherwise we +could result in different comparison code to the first stmt. */ if (gimple_assign_rhs_code (stmt) == COND_EXPR && COMPARISON_CLASS_P (gimple_assign_rhs1 (stmt))) { first_op_cond = true; - commutative = true; number_of_oprnds++; } else @@ -250,20 +257,24 @@ vect_get_and_check_slp_defs (vec_info *vinfo, else return -1; - bool swapped = false; + bool swapped = (*swap != 0); + gcc_assert (!swapped || first_op_cond); for (i = 0; i < number_of_oprnds; i++) { again: if (first_op_cond) { - if (i == 0 || i == 1) - oprnd = TREE_OPERAND (gimple_op (stmt, first_op_idx), - swapped ? !i : i); + /* Map
Re: [PATCH] add a gimple test for PR21458
On Mon, Oct 31, 2016 at 8:26 AM,wrote: > From: Trevor Saunders > > A demonstration we can do the same thing with a gimple test as -fno-tree-evrp > but somewhat more precisely. > > I tested this passes on x86_64-linux-gnu, ok? Looks good to me with the indentation fixed (once the FE is on trunk). Note that as the startwith 'vrp1' doesn't necessarily correspond to the dumpfile 'vrp1' the pass specification might be somewhat fragile... it would require some more refactoring to make the dumpfile name of a pass easily accessible from struct pass * to make a match to that (but then the startwith choice is to make it work with any pass re-ordering as well). I think we support just -fdump-tree-vrp and using "vrp" in the scan-tree-dump-times line as well? Though I don't remember what the semantics is of those if we have multiple vrp dumps. Richard. > Trev > > gcc/testsuite/ChangeLog: > > 2016-10-31 Trevor Saunders > > * gcc.dg/tree-ssa/pr21458-3.c: New test. > --- > gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c | 40 > +++ > 1 file changed, 40 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c > b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c > new file mode 100644 > index 000..6433a7e > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21458-3.c > @@ -0,0 +1,40 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fgimple -fdump-tree-vrp1" } */ > + > +extern void g (void); > +extern void bar (int); > + > +int __GIMPLE(startwith("vrp1")) > +foo (int a) > +{ > +int i; > + > + bb_2: > + i_4 = 1; > +goto bb_6; > + > + bb_3: > + if (i_1 != 0) > + goto bb_4; > + else > + goto bb_5; > + > + bb_4: > + g (); > + > + bb_5: > + i_7 = i_1 + 1; > + > + bb_6: > + i_1 = __PHI (bb_2: i_4, bb_5: i_7); > + if (i_1 <= 99) > + goto bb_3; > + else > + goto bb_7; > + > + bb_7: > + return; > + > +} > + > +/* { dg-final { scan-tree-dump-times "Folding predicate.*to 1" 1 "vrp1" } } > */ > -- > 2.9.3.dirty >
Re: [Patch, rtl] PR middle-end/78016, keep REG_NOTE order during insn copy
On 21/10/16 13:30, Bernd Schmidt wrote: On 10/21/2016 02:04 PM, Jiong Wang wrote: + /* Locate the end of existing REG_NOTES in NEW_RTX. */ + rtx *ptail = _NOTES (new_rtx); + while (*ptail != NULL_RTX) +ptail = (*ptail, 1); I was thinking along the lines of something like this (untested, emit-rtl.c part omitted). Eric can choose whether he likes either of these or wants something else. Hi Eric, What's your decision on this? Thanks. Regards, Jiong Bernd Index: gcc/rtl.h === --- gcc/rtl.h(revision 241233) +++ gcc/rtl.h(working copy) @@ -3008,6 +3008,7 @@ extern rtx alloc_reg_note (enum reg_note extern void add_reg_note (rtx, enum reg_note, rtx); extern void add_int_reg_note (rtx, enum reg_note, int); extern void add_shallow_copy_of_reg_note (rtx_insn *, rtx); +extern rtx duplicate_reg_note (rtx_insn *, rtx); extern void remove_note (rtx, const_rtx); extern void remove_reg_equal_equiv_notes (rtx_insn *); extern void remove_reg_equal_equiv_notes_for_regno (unsigned int); Index: gcc/rtlanal.c === --- gcc/rtlanal.c(revision 241233) +++ gcc/rtlanal.c(working copy) @@ -2304,6 +2304,21 @@ add_shallow_copy_of_reg_note (rtx_insn * add_reg_note (insn, REG_NOTE_KIND (note), XEXP (note, 0)); } +/* Duplicate NOTE and return the copy. */ +rtx +duplicate_reg_note (rtx note) +{ + rtx n; + reg_note_kind kind = REG_NOTE_KIND (note); + + if (GET_CODE (note) == INT_LIST) +return gen_rtx_INT_LIST ((machine_mode) kind, XINT (note, 0), NULL_RTX); + else if (GET_CODE (note) == EXPR_LIST) +return alloc_reg_note (kind, copy_insn_1 (XEXP (note, 0)), NULL_RTX); + else +return alloc_reg_note (kind, XEXP (note, 0), NULL_RTX); +} + /* Remove register note NOTE from the REG_NOTES of INSN. */ void Index: gcc/sel-sched-ir.c === --- gcc/sel-sched-ir.c(revision 241233) +++ gcc/sel-sched-ir.c(working copy) @@ -5762,6 +5762,11 @@ create_copy_of_insn_rtx (rtx insn_rtx) res = create_insn_rtx_from_pattern (copy_rtx (PATTERN (insn_rtx)), NULL_RTX); + /* Locate the end of existing REG_NOTES in NEW_RTX. */ + rtx *ptail = _NOTES (new_rtx); + while (*ptail != NULL_RTX) +ptail = (*ptail, 1); + /* Copy all REG_NOTES except REG_EQUAL/REG_EQUIV and REG_LABEL_OPERAND since mark_jump_label will make them. REG_LABEL_TARGETs are created there too, but are supposed to be sticky, so we copy them. */ @@ -5770,11 +5775,8 @@ create_copy_of_insn_rtx (rtx insn_rtx) && REG_NOTE_KIND (link) != REG_EQUAL && REG_NOTE_KIND (link) != REG_EQUIV) { -if (GET_CODE (link) == EXPR_LIST) - add_reg_note (res, REG_NOTE_KIND (link), -copy_insn_1 (XEXP (link, 0))); -else - add_reg_note (res, REG_NOTE_KIND (link), XEXP (link, 0)); +*ptail = duplicate_reg_note (link); +ptail = (*ptail, 1); } return res;
[Ping][gcc] Enable DW_CFA_val_expression support in dwarf module
On 21/10/16 09:30, Jiong Wang wrote: Currently, GCC only support DW_CFA_expression in dwarf module, this patch extend the support to DW_CFA_val_expression which share the same code mostly the same code with DW_CFA_expression. Meanwhile the existed dwarf expression parser only allows expressions which can be represented using GCC RTL. If one operation doesn't have a correspondent GCC RTL operator, then there is no way to attach that information in reg-note. This patch extends the current dwarf expression support to unlimited number of operations by using PARALLEL, and unlimited type of operations by using UNSPEC. All DW_OP_* of the expression are grouped together inside the PARALLEL, and those operations which don't have RTL mapping are wrapped by UNSPEC. The parsing algorithm is simply something like: foreach elem inside PARALLEL if (UNSPEC) { dw_op_code = INTVAL (XVECEXP (elem, 0, 0)); oprnd1 = INTVAL (XVECEXP (elem, 0, 1)); oprnd2 = INTVAL (XVECEXP (elem, 0, 2)); } else call standard RTL parser. Any comments on the approach? Ping ~ Thanks. gcc/ 2016-10-20 Jiong Wang* reg-notes.def (CFA_VAL_EXPRESSION): New entry. * dwarf2cfi.c (dwarf2out_frame_debug_cfa_val_expression): New function. (dwarf2out_frame_debug): Support REG_CFA_VAL_EXPRESSION. (output_cfa_loc): Support DW_CFA_val_expression. (output_cfa_loc_raw): Likewise. (output_cfi): Likewise. (output_cfi_directive): Likewise. * dwarf2out.c (dw_cfi_oprnd1_desc): Support DW_CFA_val_expression. (dw_cfi_oprnd2_desc): Likewise. (mem_loc_descriptor): Recognize new pattern generated for value expression.
Re: [PATCH] DWARF5 .debug_info headers, .debug_types -> .debug_info DW_UT_type
On Sun, Oct 30, 2016 at 08:31:47PM +0100, Jan Kratochvil wrote: > On Fri, 21 Oct 2016 21:32:42 +0200, Jakub Jelinek wrote: > > This patch changes the .debug_info headers to follow the current > > specification (I still hope the useless padding1/padding2 fields will be > > removed), and also changes the -gsplit-dwarf stuff to move dwo_id into > > the header and use DW_UT_{skeleton,split_*}. > > During GDB consumer patch testing I needed this fix on top of your patch. Thanks for noticing it. As I had no consumer, all I did was eyeballing -gdwarf-5 -dA output, and haven't noticed this. > diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c > index 7c6a9e9..1c0ca35 100644 > --- a/gcc/dwarf2out.c > +++ b/gcc/dwarf2out.c > @@ -2977,7 +2977,8 @@ skeleton_chain_node; > > /* Fixed size portion of the DWARF compilation unit header. */ > #define DWARF_COMPILE_UNIT_HEADER_SIZE \ > - (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE + 3) > + (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE \ > + + (dwarf_version < 5 ? 3 : 4 + 8 + DWARF_OFFSET_SIZE)) I'd use DWARF_TYPE_SIGNATURE_SIZE instead of the 8 to match the define below it. > /* Fixed size portion of the DWARF comdat type unit header. */ > #define DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE \ And this define needs adjusting too. I still hope we can get rid of the useless padding for the most common DW_UT_* types, I've raised it on dwarf workgroup and got some positive reactions, but will now file it officially as public review issue. 2016-10-31 Jan KratochvilJakub Jelinek * dwarf2out.c (DWARF_COMPILE_UNIT_HEADER_SIZE): Adjust for -gdwarf-5. (DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE): Likewise. --- gcc/dwarf2out.c.jj 2016-10-27 15:57:24.0 +0200 +++ gcc/dwarf2out.c 2016-10-31 10:12:20.338761079 +0100 @@ -2805,12 +2805,15 @@ skeleton_chain_node; /* Fixed size portion of the DWARF compilation unit header. */ #define DWARF_COMPILE_UNIT_HEADER_SIZE \ - (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE + 3) + (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE \ + + (dwarf_version >= 5 \ + ? 4 + DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE : 3)) /* Fixed size portion of the DWARF comdat type unit header. */ #define DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE \ - (DWARF_COMPILE_UNIT_HEADER_SIZE + DWARF_TYPE_SIGNATURE_SIZE \ - + DWARF_OFFSET_SIZE) + (DWARF_COMPILE_UNIT_HEADER_SIZE \ + + (dwarf_version >= 5 \ + ? 0 : DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE)) /* Fixed size portion of public names info. */ #define DWARF_PUBNAMES_HEADER_SIZE (2 * DWARF_OFFSET_SIZE + 2) Jakub
[PATCH] Don't use priority {cd}tors if not supported by a target (PR, gcov-profile/78086)
Hi. Using priority {cd}tors on a target that does not support that can cause failures (see the PR). Apart from that, I decided to use priority 100 for both gcov_init and gcov_exit functions as the reserved range includes priority 100. Moreover, I enhanced test-cases we have. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? Martin >From 05a0dcb13d608facdd1c85f4101cd821634d07cd Mon Sep 17 00:00:00 2001 From: marxinDate: Wed, 26 Oct 2016 12:50:35 +0200 Subject: [PATCH] Don't use priority {cd}tors if not supported by a target (PR gcov-profile/78086) gcc/testsuite/ChangeLog: 2016-10-26 Martin Liska * g++.dg/gcov/pr16855.C: Clean up the test case. * g++.dg/gcov/pr16855-priority.C: New test. gcc/ChangeLog: 2016-10-26 Martin Liska * coverage.c (build_init_ctor): Don't use priority {cd}tors if not supported by a target. Set priority to 100 if possible. (build_gcov_exit_decl): Likewise. --- gcc/coverage.c | 13 +++-- gcc/testsuite/g++.dg/gcov/pr16855-priority.C | 79 gcc/testsuite/g++.dg/gcov/pr16855.C | 55 +-- 3 files changed, 116 insertions(+), 31 deletions(-) create mode 100644 gcc/testsuite/g++.dg/gcov/pr16855-priority.C diff --git a/gcc/coverage.c b/gcc/coverage.c index 8810710..4167e26 100644 --- a/gcc/coverage.c +++ b/gcc/coverage.c @@ -1056,8 +1056,10 @@ build_init_ctor (tree gcov_info_type) stmt = build_call_expr (init_fn, 1, stmt); append_to_statement_list (stmt, ); - /* Generate a constructor to run it (with priority 99). */ - cgraph_build_static_cdtor ('I', ctor, DEFAULT_INIT_PRIORITY - 1); + /* Generate a constructor to run it. */ + int priority = SUPPORTS_INIT_PRIORITY +? MAX_RESERVED_INIT_PRIORITY: DEFAULT_INIT_PRIORITY; + cgraph_build_static_cdtor ('I', ctor, priority); } /* Generate the destructor function to call __gcov_exit. */ @@ -1078,8 +1080,11 @@ build_gcov_exit_decl (void) tree stmt = build_call_expr (init_fn, 0); append_to_statement_list (stmt, ); - /* Generate a destructor to run it (with priority 99). */ - cgraph_build_static_cdtor ('D', dtor, MAX_RESERVED_INIT_PRIORITY - 1); + /* Generate a destructor to run it. */ + int priority = SUPPORTS_INIT_PRIORITY +? MAX_RESERVED_INIT_PRIORITY: DEFAULT_INIT_PRIORITY; + + cgraph_build_static_cdtor ('D', dtor, priority); } /* Create the gcov_info types and object. Generate the constructor diff --git a/gcc/testsuite/g++.dg/gcov/pr16855-priority.C b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C new file mode 100644 index 000..7e39565 --- /dev/null +++ b/gcc/testsuite/g++.dg/gcov/pr16855-priority.C @@ -0,0 +1,79 @@ +/* { dg-options "-fprofile-arcs -ftest-coverage" } */ +/* { dg-do run { target native } } */ +/* { dg-require-effective-target init_priority } */ + +#include +#include + +int a; + +void +foo () +{ + fprintf (stderr, "In foo\n"); + a = 123; /* count(1) */ +} + +using namespace std; +class Test +{ +public: + Test (void) { fprintf (stderr, "In Test::Test\n"); /* count(1) */ } + ~Test (void) { fprintf (stderr, "In Test::~Test\n"); /* count(1) */ } +} T1; + +void +uncalled (void) +{ + fprintf (stderr, "In uncalled\n"); /* count(#) */ +} + +int +main (void) +{ + atexit (); + fprintf (stderr, "In main\n"); /* count(1) */ + return 0; +} + +static void __attribute__ ((constructor)) ctor_default () +{ + fprintf (stderr, "in constructor(())\n"); /* count(1) */ +} + +static void __attribute__ ((constructor ((101 ctor_100 () +{ + fprintf (stderr, "in constructor((101))\n"); /* count(1) */ +} + +static void __attribute__ ((constructor ((500 ctor_500 () +{ + fprintf (stderr, "in constructor((500))\n"); /* count(1) */ +} + +static void __attribute__ ((constructor ((65535 ctor_65535 () +{ + fprintf (stderr, "in constructor((65535))\n"); /* count(1) */ +} + +static void __attribute__ ((destructor)) dtor_default () +{ + fprintf (stderr, "in destructor(())\n"); /* count(1) */ +} + +static void __attribute__ ((destructor ((101 dtor_100 () +{ + fprintf (stderr, "in destructor((101))\n"); /* count(1) */ +} + +static void __attribute__ ((destructor ((500 dtor_500 () +{ + fprintf (stderr, "in destructor((500))\n"); /* count(1) */ +} + +static void __attribute__ ((destructor ((65535 dtor_65535 () +{ + fprintf (stderr, "in destructor((65535))\n"); /* count(1) */ +} + +/* { dg-final { run-gcov branches { -b pr16855-priority.C } } } */ diff --git a/gcc/testsuite/g++.dg/gcov/pr16855.C b/gcc/testsuite/g++.dg/gcov/pr16855.C index 91801d4..d7aa8a4 100644 --- a/gcc/testsuite/g++.dg/gcov/pr16855.C +++ b/gcc/testsuite/g++.dg/gcov/pr16855.C @@ -2,46 +2,47 @@ /* { dg-do run { target native } } */ #include +#include int a; -void foo() +void +foo () { - a = 123; /* count(1) */ + fprintf (stderr, "In foo\n"); + a = 123; /* count(1) */ } -#include
Re: [PATCH] Introduce -fprofile-update=maybe-atomic
PING^1 On 10/13/2016 05:34 PM, Martin Liška wrote: > Hello. > > As it's very hard to guess from GCC driver whether a target supports atomic > updates > for GCOV counter or not, I decided to come up with a new option value > (maybe-atomic), > that would be transformed in a corresponding value (single or atomic) in > tree-profile.c. > The GCC driver selects the option when -pthread is present in the command > line. > > That should fix all tests failures seen on AIX target. > > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > Ready to be installed? > Martin >