Re: [PATCH][gcc] libgccjit: introduce gcc_jit_context_add_driver_option
On Fri, 2019-01-18 at 19:25 +, Andrea Corallo wrote:
> Hi all,
> this patch add gcc_jit_context_add_driver_option to the libgccjit ABI
> and a testcase for it.
>
> Using this interface is now possible to pass options affecting
> assembler and linker.
>
> Does not introduce any new regression running make check-jit.
>
> Bests
>
>   Andrea
>
>
> gcc/jit/ChangeLog
> 2019-01-16  Andrea Corallo  andrea.cora...@arm.com
>
> * docs/topics/compatibility.rst (LIBGCCJIT_ABI_11): New ABI tag.
> * docs/topics/contexts.rst (Additional driver options): New
> section.
> * jit-playback.c (invoke_driver): Add call to append_driver_options.
> * jit-recording.c: Within namespace gcc::jit...
> (recording::context::~context): Free the optnames within
> m_driver_options.
> (recording::context::add_driver_option): New method.
> (recording::context::append_driver_options): New method.
> (recording::context::dump_reproducer_to_file): Add driver
> options.
> * jit-recording.h: Within namespace gcc::jit...
> (recording::context::add_driver_option): New method.
> (recording::context::append_driver_options): New method.
> (recording::context::m_driver_options): New field.
> * libgccjit++.h (gccjit::context::add_driver_option): New
> method.
> * libgccjit.c (gcc_jit_context_add_driver_option): New API
> entrypoint.
> * libgccjit.h (gcc_jit_context_add_driver_option): New API
> entrypoint.
> (LIBGCCJIT_HAVE_gcc_jit_context_add_driver_option): New
> macro.
> * libgccjit.map (LIBGCCJIT_ABI_11): New ABI tag.
>
>
> gcc/testsuite/ChangeLog
> 2019-01-16  Andrea Corallo  andrea.cora...@arm.com
>
> * jit.dg/add-driver-options-testlib.c: Add support file for
> test-add-driver-options.c testcase.
> * jit.dg/all-non-failing-tests.h: Add test-add-driver-options.c
> * jit.dg/jit.exp (jit-dg-test): Update to support
> add-driver-options-testlib.c compilation.
> * jit.dg/test-add-driver-options.c: New testcase.

Thanks for this patch.

One nit:

[...snip...]
> diff --git a/gcc/testsuite/jit.dg/all-non-failing-tests.h b/gcc/testsuite/jit.dg/all-non-failing-tests.h
> index bf02e12..9f816b4 100644
> --- a/gcc/testsuite/jit.dg/all-non-failing-tests.h
> +++ b/gcc/testsuite/jit.dg/all-non-failing-tests.h
> @@ -251,6 +251,13 @@
>  #undef create_code
>  #undef verify_code
>
> +/* test-add-driver-options.c */
> +#define create_code create_code_add_driver_options
> +#define verify_code verify_code_add_driver_options
> +#include "test-add-driver-options.c"
> +#undef create_code
> +#undef verify_code
> +
>  /* Now expose the individual testcases as instances of this struct.  */
>
>  struct testcase

The purpose of the above file is to allow for copies of tests to be built into test-combination.c and test-threads.c (to shake out state-handling bugs).

If you're going to embed the test into those, then they'd also need to be added to the "testcases" array towards the end of that header.

But given that the new test adds options that affect the whole context, it's probably best not to embed it into those combined tests. Instead, add a comment to all-non-failing-tests.h, similar to the one that reads:

/* test-extra-options.c: We don't use this one, since the extra options
   affect the whole context.  */

(changing the filename, of course).

Other than that the patch looks good. Do you have your copyright assignment paperwork in place?

Also, we're currently in stage 4 of development for gcc 9, so adding a feature to libgccjit probably requires Release Manager approval. (Given the recent discussion on the jit mailing list, this might not be the only late-breaking jit patch).

Dave
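For readers following the ChangeLog above, the recording side of the patch (add_driver_option, append_driver_options, and the destructor freeing the option names) can be pictured with a small C sketch. This is not the libgccjit implementation: struct mini_context and every mini_context_* function are hypothetical names made up for illustration, and the only assumption taken from the thread is that option names are copied when recorded, appended to the driver's argument vector later, and freed with the context.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the recording context; not libgccjit code.  */
struct mini_context
{
  char **driver_options;       /* owned copies of the option strings */
  size_t num_driver_options;
};

/* Record OPTNAME by copying it, so the caller keeps ownership of its
   own buffer.  Returns 0 on success, -1 on allocation failure.  */
static int
mini_context_add_driver_option (struct mini_context *ctxt, const char *optname)
{
  assert (ctxt != NULL && optname != NULL);

  size_t len = strlen (optname) + 1;
  char *copy = malloc (len);
  if (copy == NULL)
    return -1;
  memcpy (copy, optname, len);

  char **grown = realloc (ctxt->driver_options,
                          (ctxt->num_driver_options + 1) * sizeof *grown);
  if (grown == NULL)
    {
      free (copy);
      return -1;
    }
  grown[ctxt->num_driver_options++] = copy;
  ctxt->driver_options = grown;
  return 0;
}

/* Append the recorded options to ARGV, which has room for ARGV_CAP
   entries and currently holds ARGC of them.  Returns the new count.  */
static size_t
mini_context_append_driver_options (const struct mini_context *ctxt,
                                    const char **argv, size_t argc,
                                    size_t argv_cap)
{
  for (size_t i = 0; i < ctxt->num_driver_options && argc < argv_cap; i++)
    argv[argc++] = ctxt->driver_options[i];
  return argc;
}

/* Free the copies made by mini_context_add_driver_option.  */
static void
mini_context_release (struct mini_context *ctxt)
{
  for (size_t i = 0; i < ctxt->num_driver_options; i++)
    free (ctxt->driver_options[i]);
  free (ctxt->driver_options);
  ctxt->driver_options = NULL;
  ctxt->num_driver_options = 0;
}
```

Because the options live on the context, every compilation from that context sees them, which is the reason given above for keeping the new test out of the combined tests.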
Re: [PATCH] rs6000: Add missing prototypes for vec_ld/vec_st
On Fri, Jan 18, 2019 at 11:15:12PM +0100, Jakub Jelinek wrote:
> On Wed, Jan 16, 2019 at 10:08:46PM +0800, Kewen.Lin wrote:
> > * gcc.target/powerpc/altivec_vld_vst_addr.c: New test.
>
> This test fails on powerpc64-linux, both with -m32 and -m64:

I missed that. Thanks Jakub.

Kewen, please split the "vector long long" tests to a separate testcase, and use -mvsx and powerpc_vsx_ok there (instead of -maltivec and powerpc_altivec_ok).

Segher
[committed] remove xfail from attr-nonstring-3.c
The test started passing with the fix for bug 88693 (r267852). I've just removed the xfail via r268090 (copied below). As the original shows, the failure had been misdiagnosed and mislabeled as due to the still unresolved bug 86688.

Martin

Index: gcc/testsuite/c-c++-common/attr-nonstring-3.c
===================================================================
--- gcc/testsuite/c-c++-common/attr-nonstring-3.c (revision 268086)
+++ gcc/testsuite/c-c++-common/attr-nonstring-3.c (working copy)
@@ -406,7 +406,7 @@ void test_strlen (struct MemArrays *p, char *s NON
   {
     char a[] __attribute__ ((nonstring)) = { 1, 2, 3 };
-    T (strlen (a)); /* { dg-warning "argument 1 declared attribute .nonstring." "pr86688" { xfail *-*-* } } */
+    T (strlen (a)); /* { dg-warning "argument 1 declared attribute .nonstring." } */
   }
   {
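As background for the warning the test now expects, here is a minimal sketch of the pattern involved: an array declared with the nonstring attribute holds bytes with no terminating NUL, so strlen on it is diagnosed, and a bounded scan is the usual alternative. The NONSTRING macro and both function names are illustrative only, and the compiler-version guard is an assumption on my part rather than something stated in the thread.

```c
#include <assert.h>
#include <string.h>

/* Assumption: only GCC 8 and later are given the attribute here; other
   compilers simply see a plain array.  */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 8
# define NONSTRING __attribute__ ((nonstring))
#else
# define NONSTRING
#endif

/* Length of a possibly unterminated array: scan at most SIZE bytes
   instead of running off the end looking for a NUL.  */
static size_t
bounded_length (const char *a, size_t size)
{
  assert (a != NULL);
  const char *nul = memchr (a, '\0', size);
  return nul ? (size_t) (nul - a) : size;
}

/* Same declaration shape as in the test: three bytes, no terminator.
   Calling strlen (a) here is what the dg-warning above checks for.  */
static size_t
length_of_unterminated_array (void)
{
  char a[] NONSTRING = { 1, 2, 3 };
  return bounded_length (a, sizeof a);
}
```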
Re: [PATCH 9/9] [libbacktrace] Add printdwarftest_dwz_cmp.sh test-case
On Fri, Jan 18, 2019 at 4:45 PM Tom de Vries wrote:
>
> On 18-01-19 15:23, Ian Lance Taylor wrote:
> > On Thu, Jan 17, 2019 at 5:59 AM Tom de Vries wrote:
> >>
> >> now that the rest of the patch series has been committed, here's an
> >> updated version of this patch that applies to trunk.
> >
> > I would much rather put dwarf_data into internal.h than to #include
> > "dwarf.c" from a different file.  Using #include with a .c file is
> > just a bad path to walk down.
>
> This version avoids the include of dwarf.c.
>
> Does that look better?

> +printdwarftest_SOURCES =
> +printdwarftest_LDADD = libbacktrace.la printdwarftest.lo testlib.lo

Seems like you could write

printdwarftest_SOURCES = printdwarftest.c testlib.c
printdwarftest_LDADD = libbacktrace.la

> -static int
> +int
>  dwarf_lookup_pc (struct backtrace_state *state, struct dwarf_data *ddata,

Ah, I didn't consider this. We can't do this. It will break code like libsanitizer/libbacktrace/backtrace-rename.h.

Is there a way that we could run a similar test looking at the output of readelf --debug?

Ian
Re: [PATCH 9/9] [libbacktrace] Add printdwarftest_dwz_cmp.sh test-case
On 18-01-19 15:23, Ian Lance Taylor wrote: > On Thu, Jan 17, 2019 at 5:59 AM Tom de Vries wrote: >> >> now that the rest of the patch series has been committed, here's an >> updated version of this patch that applies to trunk. > > I would much rather put dwarf_data into internal.h than to #include > "dwarf.c" from a different file. Using #include with a .c file is > just a bad path to walk down. This version avoids the include of dwarf.c. Does that look better? Thanks, - Tom [libbacktrace] Add printdwarftest_dwz_cmp.sh test-case Add test-case that verifies that libbacktrace can find the same debug information with and without dwz compression. 2018-12-10 Tom de Vries * Makefile.am (TESTS): Add printdwarftest_dwz_cmp.sh. * Makefile.in: Regenerate. * printdwarftest.c: New file. * printdwarftest_dwz_cmp.sh: New file. * dwarf.c (struct function_vector, struct unit_addrs) (struct dwarf_info): Move ... * internal.h: ... here. --- libbacktrace/Makefile.am | 11 ++ libbacktrace/Makefile.in | 70 --- libbacktrace/dwarf.c | 64 +- libbacktrace/internal.h| 68 +++ libbacktrace/printdwarftest.c | 208 + libbacktrace/printdwarftest_dwz_cmp.sh | 8 ++ 6 files changed, 348 insertions(+), 81 deletions(-) diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am index bf90ebdb2d5..7843229304d 100644 --- a/libbacktrace/Makefile.am +++ b/libbacktrace/Makefile.am @@ -190,6 +190,15 @@ if HAVE_DWZ TESTS += btest_dwz +printdwarftest_SOURCES = +printdwarftest_LDADD = libbacktrace.la printdwarftest.lo testlib.lo + +check_PROGRAMS += printdwarftest + +printdwarftest_dwz_cmp.sh: printdwarftest_dwz + +TESTS += printdwarftest_dwz_cmp.sh + endif HAVE_DWZ stest_SOURCES = stest.c @@ -319,11 +328,13 @@ nounwind.lo: config.h internal.h pecoff.lo: config.h backtrace.h internal.h posix.lo: config.h backtrace.h internal.h print.lo: config.h backtrace.h internal.h +printdwarftest.lo: config.h backtrace.h internal.h testlib.h read.lo: config.h backtrace.h internal.h simple.lo: config.h backtrace.h 
internal.h sort.lo: config.h backtrace.h internal.h stest.lo: config.h backtrace.h internal.h state.lo: config.h backtrace.h backtrace-supported.h internal.h +testlib.lo: $(INCDIR)/filenames.h backtrace.h testlib.h unknown.lo: config.h backtrace.h internal.h xcoff.lo: config.h backtrace.h internal.h diff --git a/libbacktrace/Makefile.in b/libbacktrace/Makefile.in index d55e0501171..427a0c36161 100644 --- a/libbacktrace/Makefile.in +++ b/libbacktrace/Makefile.in @@ -120,18 +120,22 @@ POST_UNINSTALL = : build_triplet = @build@ host_triplet = @host@ target_triplet = @target@ -check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3) +check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3) \ + $(am__EXEEXT_4) $(am__EXEEXT_5) @NATIVE_TRUE@am__append_1 = test_elf test_xcoff_32 test_xcoff_64 \ @NATIVE_TRUE@ test_pecoff test_unknown unittest unittest_alloc \ -@NATIVE_TRUE@ allocfail btest btest_alloc stest stest_alloc \ -@NATIVE_TRUE@ ztest ztest_alloc edtest edtest_alloc +@NATIVE_TRUE@ allocfail btest btest_alloc @NATIVE_TRUE@am__append_2 = allocfail.sh -@HAVE_DWZ_TRUE@@NATIVE_TRUE@am__append_3 = btest_dwz -@HAVE_ZLIB_TRUE@@NATIVE_TRUE@am__append_4 = -lz -@HAVE_ZLIB_TRUE@@NATIVE_TRUE@am__append_5 = -lz -@HAVE_PTHREAD_TRUE@@NATIVE_TRUE@am__append_6 = ttest ttest_alloc -@HAVE_OBJCOPY_DEBUGLINK_TRUE@@NATIVE_TRUE@am__append_7 = dtest -@HAVE_COMPRESSED_DEBUG_TRUE@@NATIVE_TRUE@am__append_8 = ctestg ctesta \ +@HAVE_DWZ_TRUE@@NATIVE_TRUE@am__append_3 = btest_dwz \ +@HAVE_DWZ_TRUE@@NATIVE_TRUE@ printdwarftest_dwz_cmp.sh +@HAVE_DWZ_TRUE@@NATIVE_TRUE@am__append_4 = printdwarftest +@NATIVE_TRUE@am__append_5 = stest stest_alloc ztest ztest_alloc edtest \ +@NATIVE_TRUE@ edtest_alloc +@HAVE_ZLIB_TRUE@@NATIVE_TRUE@am__append_6 = -lz +@HAVE_ZLIB_TRUE@@NATIVE_TRUE@am__append_7 = -lz +@HAVE_PTHREAD_TRUE@@NATIVE_TRUE@am__append_8 = ttest ttest_alloc +@HAVE_OBJCOPY_DEBUGLINK_TRUE@@NATIVE_TRUE@am__append_9 = dtest +@HAVE_COMPRESSED_DEBUG_TRUE@@NATIVE_TRUE@am__append_10 = ctestg 
ctesta \ @HAVE_COMPRESSED_DEBUG_TRUE@@NATIVE_TRUE@ ctestg_alloc \ @HAVE_COMPRESSED_DEBUG_TRUE@@NATIVE_TRUE@ ctesta_alloc subdir = . @@ -184,13 +188,14 @@ libbacktrace_noformat_la_OBJECTS = \ @NATIVE_TRUE@ test_xcoff_64$(EXEEXT) test_pecoff$(EXEEXT) \ @NATIVE_TRUE@ test_unknown$(EXEEXT) unittest$(EXEEXT) \ @NATIVE_TRUE@ unittest_alloc$(EXEEXT) allocfail$(EXEEXT) \ -@NATIVE_TRUE@ btest$(EXEEXT) btest_alloc$(EXEEXT) \ -@NATIVE_TRUE@ stest$(EXEEXT) stest_alloc$(EXEEXT) \ +@NATIVE_TRUE@ btest$(EXEEXT) btest_alloc$(EXEEXT) +@HAVE_DWZ_TRUE@@NATIVE_TRUE@am__EXEEXT_2 = printdwarftest$(EXEEXT) +@NATIVE_TRUE@am__EXEEXT_3 = stest$(EXEEXT) stest_alloc$(EXEEXT) \ @NATIVE_TRUE@ ztest$(EXEEXT) ztest_alloc$(EXEEXT) \ @NATIVE_TRUE@ edtest$(EXEEXT) edtest_alloc$(EXEEXT)
Re: [PATCH] avoid issuing -Warray-bounds during folding (PR 88800)
On 1/18/19 5:24 AM, Rainer Orth wrote: Hi Christophe, After your commit (r268037), I'm seeing excess errors on some arm targets: FAIL: c-c++-common/Wrestrict.c -Wc++-compat (test for excess errors) Excess errors: /gcc/testsuite/c-c++-common/Wrestrict.c:195:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict] /gcc/testsuite/c-c++-common/Wrestrict.c:202:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict] /gcc/testsuite/c-c++-common/Wrestrict.c:207:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict] I'm seeing the same on sparc-sun-solaris2.*, both 32 and 64-bit. Test results for x86_64-w64-mingw32 and ia64-suse-linux-gnu show the same failure. Besides (and probably caused by the same revision), I now get +XPASS: c-c++-common/Warray-bounds-3.c -std=gnu++14 bug (test for warnings, line 161) +XPASS: c-c++-common/Warray-bounds-3.c -std=gnu++17 bug (test for warnings, line 161) +XPASS: c-c++-common/Warray-bounds-3.c -std=gnu++98 bug (test for warnings, line 161) +XPASS: c-c++-common/Warray-bounds-3.c -Wc++-compat bug (test for warnings, line 161) which is also seen on ia64-suse-linux-gnu. I think this is the same problem as the one on arm. The bigger patch I posted should take care of it as well. Martin Index: gcc/testsuite/c-c++-common/Warray-bounds-3.c === --- gcc/testsuite/c-c++-common/Warray-bounds-3.c(revision 268082) +++ gcc/testsuite/c-c++-common/Warray-bounds-3.c(working copy) @@ -158,7 +158,7 @@ void test_memcpy_overflow (char *d, const char *s, but known access size is detected. This works except with small sizes that are powers of 2 due to bug . 
*/ T (char, 1, arr + SR (DIFF_MAX - 1, DIFF_MAX), s, 1); - T (char, 1, arr + SR (DIFF_MAX - 1, DIFF_MAX), s, 2); /* { dg-warning "pointer overflow between offset \\\[\[0-9\]+, \[0-9\]+] and size 2 accessing array " "bug " { xfail *-*-* } } */ + T (char, 1, arr + SR (DIFF_MAX - 1, DIFF_MAX), s, 2); /* { dg-warning "pointer overflow between offset \\\[\[0-9\]+, \[0-9\]+] and size 2 accessing array " "bug " { xfail fold_memcpy_2 } } */ T (char, 1, arr + SR (DIFF_MAX - 2, DIFF_MAX), s, 3); /* { dg-warning "pointer overflow between offset \\\[\[0-9\]+, \[0-9\]+] and size 3 accessing array " "memcpy" } */ T (char, 1, arr + SR (DIFF_MAX - 4, DIFF_MAX), s, 5); /* { dg-warning "pointer overflow between offset \\\[\[0-9\]+, \[0-9\]+] and size 5 accessing array " "memcpy" } */ }
Re: [PATCH] avoid issuing -Warray-bounds during folding (PR 88800)
On 1/18/19 2:35 AM, Christophe Lyon wrote:
Hi Martin,

On Thu, 17 Jan 2019 at 02:51, Martin Sebor wrote:
On 1/16/19 6:14 PM, Jeff Law wrote:
On 1/15/19 8:21 AM, Martin Sebor wrote:
On 1/15/19 4:07 AM, Richard Biener wrote:
On Tue, Jan 15, 2019 at 1:08 AM Martin Sebor wrote:

The gimple_fold_builtin_memory_op() function folds calls to memcpy and similar to MEM_REF when the size of the copy is a small power of 2, but it does so without considering whether the copy might write (or read) past the end of one of the objects. To detect these kinds of errors (and help distinguish them from -Wrestrict) the folder calls into the wrestrict pass and lets it diagnose them. Unfortunately, that can lead to false positives for even some fairly straightforward code that is ultimately found to be unreachable. PR 88800 is a report of one such problem.

To avoid these false positives the attached patch adjusts the function to avoid issuing -Warray-bounds for out-of-bounds calls to memcpy et al. Instead, the patch disables the folding of such invalid calls (and only those). Those that are not eliminated during DCE or other subsequent passes are eventually diagnosed by the wrestrict pass.

Since this change required removing the dependency of the detection on the warning options (originally done as a micro-optimization to avoid spending compile-time cycles on something that wasn't needed) the patch also adds tests to verify that code generation is not affected as a result of warnings being enabled or disabled. With the patch as is, the invalid memcpy calls end up emitted (currently they are folded into equally invalid MEM_REFs). At some point, I'd like us to consider whether they should be replaced with traps (possibly under the control of as has been proposed a number of times in the past. If/when that's done, these tests will need to be adjusted to look for traps instead.

Tested on x86_64-linux.

I've said in the past that I feel delaying of folding is wrong.
To understand, the PR is about emitting a warning for out-of-bound accesses in a dead code region?

Yes. I am keeping in my mind your preference of not delaying the folding of valid code.

If we think delaying/disabling the folding is the way to go the patch looks OK.

I do, at least for now. I'm taking this as your approval to commit the patch (please let me know if you didn't mean it that way).

Note we are in stage4, so we're supposed to be addressing regression bugfixes and documentation issues. So I think Richi needs to be explicit about whether or not he wants this in gcc-9 or if it should defer to gcc-10. I have no technical objections to the patch and would easily ack it in stage1 or stage3.

The warning is a regression introduced in GCC 8. I was just about to commit the fix so please let me know if I should hold off until stage 1.

After your commit (r268037), I'm seeing excess errors on some arm targets:

FAIL: c-c++-common/Wrestrict.c -Wc++-compat (test for excess errors)
Excess errors:
/gcc/testsuite/c-c++-common/Wrestrict.c:195:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict]
/gcc/testsuite/c-c++-common/Wrestrict.c:202:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict]
/gcc/testsuite/c-c++-common/Wrestrict.c:207:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict]

This is not true for all arm toolchains, so for instance if you want to reproduce it, you can build for target arm-eabi and keep default cpu/fpu/mode.

The warnings are valid, the test just hardcodes the wrong byte counts in the xfailed dg-warning directives. I've fixed the byte counts so that the test just shows XPASSes.

The other issue here is that the -Wrestrict warning only triggers for built-ins and whether GCC keeps those around or folds them to MEM_REFs depends on the target.
On common targets, a memcpy (d, d + 2, 4) call, for instance, (i.e., one with a small power-of-2 size) is folded to MEM_REF, so there is no -Wrestrict warning despite the overlap. Strictly, it's a false negative, but in reality it's not a problem because GCC gives the MEM_REF copy the same safe semantics as with memmove, so the overlap is benign. But on targets that optimize for space by default (like arm-eabi) the folding doesn't happen, memcpy gets called for the overlapping regions, and we get the helpful warning. If there was a way to tell at compile time which target the test is being compiled for, whether a folding or non-folding one, that would give us a way to conditionalize the dg-warnings and avoid these pesky regressions. I just posted a patch to do that so if it's approved, these failures should all be resolved. Ultimately, though, I'd like to make the warnings detect invalid accesses in MEM_REFs as much as in built-in calls, so this should be just
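The remark above that a folded copy has "the same safe semantics as with memmove" can be illustrated in plain C. The sketch below only models the idea, under the assumption that a folded 4-byte copy behaves as one full load followed by one full store; it does not show what GCC emits, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy 4 bytes by reading all of them into a temporary before writing
   any of them.  Neither memcpy call below overlaps, since TMP is a
   separate object.  */
static void
copy4_load_then_store (unsigned char *dst, const unsigned char *src)
{
  assert (dst != NULL && src != NULL);
  uint32_t tmp;
  memcpy (&tmp, src, sizeof tmp);   /* load */
  memcpy (dst, &tmp, sizeof tmp);   /* store */
}

/* Return 1 if the load-then-store copy of the overlapping regions
   d and d + 2 gives the same bytes as memmove (d, d + 2, 4).  */
static int
overlapping_copy_matches_memmove (void)
{
  unsigned char a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  unsigned char b[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };

  copy4_load_then_store (a, a + 2);
  memmove (b, b + 2, 4);
  return memcmp (a, b, sizeof a) == 0;
}
```

A real call to memcpy with these arguments would still be undefined, which is why the warning is useful on targets where the call is not folded.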
[PATCH] introduce effective-target fold_memcpy
Some of the -Warray-bounds and -Wrestrict tests are prone to failing on targets like arm-eabi and others that use different parameters to decide which memcpy calls should be folded to MEM_REF (those that do are copies of small power-of-two sizes where the power tends to vary from target to target and may be as little as 1). The failures then waste the time of those who maintain those secondary targets reporting failures (see * below), as well as those who wrote the tests debugging the problems and working around them. To reduce this effort (and ideally avoid these regressions going forward) the attached patch adds a new effective-target to the test harness: fold_memcpy_N. It detects the target's willingness to fold memcpy call of the given size (N). While testing this with the arm cross-compiler I also tweaked the tests that #include standard headers to only do so when __has_include says the header exists. This lets the tests pass even when using a cross-compiler without library headers installed (my default MO). If/when the warnings are improved to detect the problems regardless of the folding as I'm hoping to eventually do, this new effective-target feature can be removed. Martin [*] https://gcc.gnu.org/ml/gcc-patches/2019-01/msg01056.html gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_compile): Support -ftree-dump-xxx options. (check_effective_target_fold_memcpy): New function. (check_effective_target_fold_memcpy_2): Same. (check_effective_target_fold_memcpy_4): Same. (check_effective_target_fold_memcpy_8): Same. * c-c++-common/Warray-bounds-2.c: Include headers only if they exist. * c-c++-common/Warray-bounds-3.c: Make xfails conditional on target fold_memcpy. * c-c++-common/Wrestrict-2.c: Include headers only if they exist. * c-c++-common/Wrestrict.c: Make xfails conditional on target fold_memcpy. 
Index: gcc/testsuite/lib/target-supports.exp === --- gcc/testsuite/lib/target-supports.exp (revision 268082) +++ gcc/testsuite/lib/target-supports.exp (working copy) @@ -82,6 +82,11 @@ proc check_compile {basename type contents args} { lappend options "additional_flags=-fdump-$type" set compile_type assembly } + "tree-*" { + set output ${basename}[pid].s + lappend options "additional_flags=-fdump-$type" + set compile_type assembly + } } set f [open $src "w"] puts $f $contents @@ -95,6 +100,11 @@ proc check_compile {basename type contents args} { if [regexp "rtl-(.*)" $type dummy rtl_type] { set scan_output "[glob $src.\[0-9\]\[0-9\]\[0-9\]r.$rtl_type]" file delete $output +} else { + if [regexp "tree-(.*)" $type dummy tree_type] { + set scan_output "[glob $src.\[0-9\]\[0-9\]\[0-9\]t.$tree_type]" + file delete $output + } } # Restore additional_sources. @@ -9048,6 +9058,44 @@ proc check_effective_target_autoincdec { } { return 0 } +# Return 1 if the target folds memcpy calls with sizes of count bytes. +# The folding is done by the middle-end but targets have the ability +# to control the maximum size or to disable it altogether. +proc check_effective_target_fold_memcpy { count } { +set result [eval check_compile fold_memcpy tree-optimized { + "void test_fold_memcpy (void *d, const void *s) { __builtin_memcpy (d, s, $count); }" +} "-O2" ] +set lines [lindex $result 0] +set output [lindex $result 1] +set match 1 + +set file [open "$output" ] +set contents [read $file] +close $file + +if { [ regexp "__builtin_memcpy" $contents] } { + # Not folded. + set match 0 +} +remote_file build delete $output +return $match +} + +# Return 1 if the target folds memcpy calls with sizes of 2. +proc check_effective_target_fold_memcpy_2 { } { +return [check_effective_target_fold_memcpy { 2 }] +} + +# Return 1 if the target folds memcpy calls with sizes of 4. 
+proc check_effective_target_fold_memcpy_4 { } { +return [check_effective_target_fold_memcpy { 4 }] +} + +# Return 1 if the target folds memcpy calls with sizes of 8. +proc check_effective_target_fold_memcpy_8 { } { +return [check_effective_target_fold_memcpy { 8 }] +} + # Return 1 if the target has support for stack probing designed # to avoid stack-clash style attacks. # Index: gcc/testsuite/c-c++-common/Warray-bounds-2.c === --- gcc/testsuite/c-c++-common/Warray-bounds-2.c (revision 268082) +++ gcc/testsuite/c-c++-common/Warray-bounds-2.c (working copy) @@ -8,13 +8,28 @@ { dg-do compile } { dg-options "-O2 -Warray-bounds -Wno-stringop-overflow" } */ -#include -#include +#if __has_include () +# include +#else +/* For cross-compilers. */ +typedef __PTRDIFF_TYPE__ ptrdiff_t; +typedef __SIZE_TYPE__ size_t; +#endif -#undef memcpy -#undef strcpy -#undef strncpy +#if __has_include () +# include +# undef memcpy +# undef strcat +# undef strcpy +#
Re: [PATCH, powerpc] Fix speculation barrier and group nop to emit target register names.
Hi Iain,

On Sat, Jan 12, 2019 at 01:28:05PM +, Iain Sandoe wrote:
> The current implementation of “speculation_barrier” and “group_end_nop” insns
> emit hard-wired register names which causes tests using them to fail on
> Darwin, at least, which uses “rNN” instead of “NN”.
>
> The patch makes the register names for these insns use the operand output
> mechanism to substitute the appropriate variant when needed.

This is fine for trunk and all backports you may need/want. Thanks,

Segher

> * config/rs6000/rs6000.md (group_end_nop): Emit
> insn register names using operand format, rather than
> hard-wired. (speculation_barrier): Likewise.

[ Get your mail client not to mess up changelogs? ;-) ]
[PATCH] Fix gcc.dg/utf-array.c testcase
Hi!

The utf-array.c testcase FAILs e.g. on i686-linux or powerpc-linux, the problem is that wchar_t there isn't int, but long int. grep shows that WCHAR_TYPE is one of

int
short int
long int
unsigned int
short unsigned int
long unsigned int

depending on exact target and options.

The following patch accepts them all, ok for trunk?

2019-01-18  Jakub Jelinek

* gcc.dg/utf-array.c: Allow wchar_t to be printed as
{long ,short ,}{unsigned ,}int.

--- gcc/testsuite/gcc.dg/utf-array.c.jj 2019-01-18 00:33:20.867980701 +0100
+++ gcc/testsuite/gcc.dg/utf-array.c 2019-01-18 23:32:57.086524528 +0100
@@ -12,13 +12,13 @@ typedef __CHAR32_TYPE__ char32_t;
 const char s_0[] = "ab";
 const char s_1[] = u"ab"; /* { dg-error "from a string literal with type array of" } */
 const char s_2[] = U"ab"; /* { dg-error "from a string literal with type array of" } */
-const char s_3[] = L"ab"; /* { dg-error "from a string literal with type array of .int." } */
+const char s_3[] = L"ab"; /* { dg-error "from a string literal with type array of .(long |short )?(unsigned )?int." } */
 const char s_4[] = u8"ab";
 const char16_t s16_0[] = "ab"; /* { dg-error "from a string literal with type array of .char." } */
 const char16_t s16_1[] = u"ab";
 const char16_t s16_2[] = U"ab"; /* { dg-error "from a string literal with type array of" } */
-const char16_t s16_3[] = L"ab"; /* { dg-error "from a string literal with type array of .int." "" { target { ! wchar_t_char16_t_compatible } } } */
+const char16_t s16_3[] = L"ab"; /* { dg-error "from a string literal with type array of .(long |short )?(unsigned )?int." "" { target { ! wchar_t_char16_t_compatible } } } */
 const char16_t s16_4[] = u8"ab"; /* { dg-error "from a string literal with type array of .char." } */
 const char16_t s16_5[0] = u"ab"; /* { dg-warning "chars is too long" } */
@@ -30,7 +30,7 @@ const char16_t s16_9[4] = u"ab";
 const char32_t s32_0[] = "ab"; /* { dg-error "from a string literal with type array of .char." } */
 const char32_t s32_1[] = u"ab"; /* { dg-error "from a string literal with type array of" } */
 const char32_t s32_2[] = U"ab";
-const char32_t s32_3[] = L"ab"; /* { dg-error "from a string literal with type array of .int." "" { target { ! wchar_t_char32_t_compatible } } } */
+const char32_t s32_3[] = L"ab"; /* { dg-error "from a string literal with type array of .(long |short )?(unsigned )?int." "" { target { ! wchar_t_char32_t_compatible } } } */
 const char32_t s32_4[] = u8"ab"; /* { dg-error "from a string literal with type array of .char." } */
 const char32_t s32_5[0] = U"ab"; /* { dg-warning "chars is too long" } */

Jakub
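To see which of the six spellings listed above applies on a given target, a C11 _Generic selection over wchar_t is one option. This is only a sketch for checking a local toolchain, not part of the patch; the macro and function names are made up, and the strings simply mirror the list in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick the spelling of the type underlying wchar_t from the six
   candidates named in the message; "other" if none of them match.
   Requires C11 for _Generic.  */
#define WCHAR_UNDERLYING_NAME                          \
  _Generic ((wchar_t) 0,                               \
            int: "int",                                \
            short: "short int",                        \
            long: "long int",                          \
            unsigned int: "unsigned int",              \
            unsigned short: "short unsigned int",      \
            unsigned long: "long unsigned int",        \
            default: "other")

static const char *
wchar_underlying_name (void)
{
  const char *name = WCHAR_UNDERLYING_NAME;
  assert (name != NULL);
  return name;
}

/* Return 1 if wchar_t is one of the six listed types on this target.  */
static int
wchar_matches_listed_type (void)
{
  return strcmp (wchar_underlying_name (), "other") != 0;
}
```

The regex in the patch, (long |short )?(unsigned )?int, accepts every string this macro can produce except "other".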
[PATCH] Fix transfer_intrinsic_3.f90 miscompilation on ppc*/s390* (PR tree-optimization/88044)
Hi!

As mentioned in the PR, on the transfer_intrinsic_3.f90 testcase at -O3 on a few targets we have in number_of_iterations_cond:

code LE_EXPR
iv0->base 0
iv0->step 0
iv1->base -1
iv1->step 1
every_iteration false

The loop starts with:

  [local count: 8656061039]:
  # n_63 = PHI <0(6), _28(23)>
  _19 = n_63 + -1;

and ends with

  _28 = n_63 + 1;
  if (_28 == 4)
    goto ; [12.36%]
  else
    goto ; [87.64%]

  [local count: 7582748748]:
  goto ; [100.00%]

and besides the exit at the end has also:

  [local count: 3548985018]:
  if (_19 > 0)
    goto ; [0.04%]
  else
    goto ; [99.96%]

  [local count: 1419591]:
  _gfortran_stop_numeric (1, 0);

  [local count: 5106238449]:
  if (_19 < 0)
    goto ; [0.04%]
  else
    goto ; [99.96%]

  [local count: 5104195957]:
  goto ; [100.00%]

  [local count: 2042498]:
  _gfortran_stop_numeric (2, 0);

in the middle, so two other loop exits. But, neither bb16, nor bb18 are executed every iteration, if they were, then because _19 is -1 in the first iteration would always stop 2 and not iterate further.

We have:

  /* If the test is not executed every iteration, wrapping may make the test
     to pass again.
     TODO: the overflow case can be still used as unreliable estimate of upper
     bound.  But we have no API to pass it down to number of iterations code
     and, at present, it will not use it anyway.  */
  if (!every_iteration
      && (!iv0->no_overflow || !iv1->no_overflow
          || code == NE_EXPR || code == EQ_EXPR))
    return false;

at the start, but that doesn't trigger here, because code is not equality comparison and no_overflow is set on both IVs. If there would be an overflow, then maybe it would be right to derive number of iterations from that. But the condition that returns true is that iv0->base code iv1->base is false, if that isn't done in every iteration, it means nothing for the number of iteration analysis.

Fixed thusly, bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64{,le}-linux, ok for trunk?

2019-01-18  Jakub Jelinek

PR tree-optimization/88044
* tree-ssa-loop-niter.c (number_of_iterations_cond): If condition
is false in the first iteration, but !every_iteration, return false
instead of true with niter->niter zero.

--- gcc/tree-ssa-loop-niter.c.jj 2019-01-10 11:43:02.254577008 +0100
+++ gcc/tree-ssa-loop-niter.c 2019-01-18 19:51:00.245504728 +0100
@@ -1824,6 +1824,8 @@ number_of_iterations_cond (struct loop *
       tree tem = fold_binary (code, boolean_type_node, iv0->base, iv1->base);
       if (tem && integer_zerop (tem))
         {
+          if (!every_iteration)
+            return false;
           niter->niter = build_int_cst (unsigned_type_for (type), 0);
           niter->max = 0;
           return true;

Jakub
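The reasoning in this message can be reproduced with a small C loop. The sketch below is a hand-written analogue, not the Fortran testcase: completed_iterations and the guard array are hypothetical, and guard[n] stands in for whatever control flow decides whether the exit test is reached in iteration n.

```c
#include <assert.h>
#include <stddef.h>

/* Count the iterations completed by a loop whose early-exit test is
   only reached when GUARD[n] is nonzero.  The loop continues while
   0 <= t, with t starting at -1 and stepping by 1, which mirrors
   "code LE_EXPR, iv0->base 0, iv1->base -1, iv1->step 1".  */
static int
completed_iterations (const int *guard, int limit)
{
  assert (guard != NULL && limit >= 0);
  int completed = 0;
  for (int n = 0; n != limit; n++)
    {
      int t = n - 1;                 /* like _19 = n_63 + -1 */
      if (guard[n] && !(0 <= t))     /* exit test, not always executed */
        break;
      completed++;
    }
  return completed;
}
```

With guard = { 1, 1, 1, 1 } the test runs in the first iteration, 0 <= -1 is false, and the loop completes zero iterations, which is what the old code concluded unconditionally. With guard = { 0, 1, 1, 1 } the test is skipped while t is -1 and all four iterations complete, so evaluating the condition on the base values says nothing about the iteration count.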
[PATCH] Fix LTO ICEs due to invalid self-referencing fortran character length VAR_DECL chains (PR fortran/88902)
Hi!

As the testcase shows, gfc_get_symbol_decl can be called multiple times and we can add the same length multiple times to current or parent function. As addition of a VAR_DECL to those is done by chaining it into the DECL_CHAIN linked list, adding the same VAR_DECL twice means a loop in the chain (in this testcase DECL_CHAIN referencing the containing VAR_DECL, but it could be longer loop). Any such loop is a bug and e.g. LTO is very upset about that.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-01-18  Jakub Jelinek

PR fortran/88902
* trans-decl.c (gfc_get_symbol_decl): Don't add length to function
or parent function if it has been added there already.

* gfortran.dg/pr88902.f90: New test.

--- gcc/fortran/trans-decl.c.jj 2019-01-16 09:35:08.0 +0100
+++ gcc/fortran/trans-decl.c 2019-01-18 13:03:07.073419557 +0100
@@ -1572,13 +1572,17 @@ gfc_get_symbol_decl (gfc_symbol * sym)
           if (VAR_P (length) && DECL_FILE_SCOPE_P (length))
             {
               /* Add the string length to the same context as the symbol.  */
-              if (DECL_CONTEXT (sym->backend_decl) == current_function_decl)
-                gfc_add_decl_to_function (length);
-              else
-                gfc_add_decl_to_parent_function (length);
+              if (DECL_CONTEXT (length) == NULL_TREE)
+                {
+                  if (DECL_CONTEXT (sym->backend_decl)
+                      == current_function_decl)
+                    gfc_add_decl_to_function (length);
+                  else
+                    gfc_add_decl_to_parent_function (length);
+                }
-              gcc_assert (DECL_CONTEXT (sym->backend_decl) ==
-                          DECL_CONTEXT (length));
+              gcc_assert (DECL_CONTEXT (sym->backend_decl)
+                          == DECL_CONTEXT (length));
               gfc_defer_symbol_init (sym);
             }
--- gcc/testsuite/gfortran.dg/pr88902.f90.jj 2019-01-18 12:58:03.738394429 +0100
+++ gcc/testsuite/gfortran.dg/pr88902.f90 2019-01-18 12:59:06.971357361 +0100
@@ -0,0 +1,6 @@
+! PR fortran/88902
+! { dg-do compile }
+! { dg-require-effective-target lto }
+! { dg-options "-flto --param ggc-min-heapsize=0" }
+
+include 'pr50069_2.f90'

Jakub
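The failure mode described here, where chaining the same node twice makes it point at itself, is easy to reproduce with any intrusive singly linked list. The sketch below uses hypothetical struct decl and struct scope types rather than GCC's tree nodes; the context field plays the role that DECL_CONTEXT plays in the patch as the "already added" marker.

```c
#include <assert.h>
#include <stddef.h>

struct scope;

/* Hypothetical stand-ins for a VAR_DECL and the function owning it.  */
struct decl
{
  struct decl *chain;      /* next declaration, like DECL_CHAIN */
  struct scope *context;   /* owning scope, like DECL_CONTEXT */
};

struct scope
{
  struct decl *decls;      /* head of the chain */
};

/* Push D on the front of S's chain with no check.  Doing this twice
   for the same D leaves D->chain pointing at D itself.  */
static void
add_decl_unchecked (struct scope *s, struct decl *d)
{
  assert (s != NULL && d != NULL);
  d->chain = s->decls;
  s->decls = d;
  d->context = s;
}

/* Push D only if it has no context yet, in the spirit of the
   DECL_CONTEXT (length) == NULL_TREE test added by the patch.  */
static void
add_decl_once (struct scope *s, struct decl *d)
{
  assert (s != NULL && d != NULL);
  if (d->context == NULL)
    add_decl_unchecked (s, d);
}

/* Walk the chain; return its length, or -1 if more than LIMIT nodes
   are visited, which indicates a cycle.  */
static int
chain_length (const struct scope *s, int limit)
{
  int n = 0;
  for (const struct decl *d = s->decls; d != NULL; d = d->chain)
    if (++n > limit)
      return -1;
  return n;
}
```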
[C++ PATCH] Fix -fsanitize=pointer-compare,pointer-subtract ICEs in templates (PR sanitizer/88901)
Hi!

When processing_template_decl, all we care about is diagnostics and the return type if it is not dependent; other spots that add sanitization do nothing if processing_template_decl and the following patch does that for the two recently added ones. Without it, save_expr is called on potentially dependent FE expressions the middle-end doesn't handle.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-01-18  Jakub Jelinek

PR sanitizer/88901
* typeck.c (cp_build_binary_op): Don't instrument
SANITIZE_POINTER_COMPARE if processing_template_decl.
(pointer_diff): Similarly for SANITIZE_POINTER_SUBTRACT.

* g++.dg/asan/pr88901.C: New test.

--- gcc/cp/typeck.c.jj 2019-01-18 09:13:58.580790058 +0100
+++ gcc/cp/typeck.c 2019-01-18 11:53:45.941734135 +0100
@@ -5233,6 +5233,7 @@ cp_build_binary_op (const op_location_t
     }
   if ((code0 == POINTER_TYPE || code1 == POINTER_TYPE)
+      && !processing_template_decl
       && sanitize_flags_p (SANITIZE_POINTER_COMPARE))
     {
       op0 = save_expr (op0);
@@ -5650,7 +5651,8 @@ pointer_diff (location_t loc, tree op0,
   else
     inttype = restype;
-  if (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT))
+  if (!processing_template_decl
+      && sanitize_flags_p (SANITIZE_POINTER_SUBTRACT))
     {
       op0 = save_expr (op0);
       op1 = save_expr (op1);
--- gcc/testsuite/g++.dg/asan/pr88901.C.jj 2019-01-18 11:55:42.398826983 +0100
+++ gcc/testsuite/g++.dg/asan/pr88901.C 2019-01-18 11:55:26.559086374 +0100
@@ -0,0 +1,13 @@
+// PR sanitizer/88901
+// { dg-do compile }
+// { dg-options "-fsanitize=address -fsanitize=pointer-compare" }
+
+template
+struct A {
+  void foo() {
+    auto d = [](char *x, char *y) {
+      for (char *p = x; p + sizeof(T) <= y; p += sizeof(T))
+        reinterpret_cast(p)->~T();
+    };
+  }
+};

Jakub
Re: [PATCH] rs6000: Add missing prototypes for vec_ld/vec_st
On Wed, Jan 16, 2019 at 10:08:46PM +0800, Kewen.Lin wrote:
> * gcc.target/powerpc/altivec_vld_vst_addr.c: New test.

This test fails on powerpc64-linux, both with -m32 and -m64:

/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:35:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:59:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:83:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:107:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:117:7: error: expected ';' before 'double'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:120:10: error: incompatible types when returning type '__vector double' {aka '__vector(2) double'} but 'double' was expected
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:129:7: error: expected ';' before 'double'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:130:26: warning: type defaults to 'int' in declaration of 'vector' [-Wimplicit-int]
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:130:33: error: expected ';', ',' or ')' before 'double'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:156:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:180:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:204:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:205:7: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:229:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:229:1: error: use of 'long long' in AltiVec types is invalid without '-mvsx'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:247:20: error: unknown type name 'vector'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:253:20: error: unknown type name 'vector'
/.../gcc/testsuite/gcc.target/powerpc/altivec_vld_vst_addr.c:253:37: error: unknown type name 'vector'

Jakub
Re: [RS6000] PR88614, output_operand: invalid %z value
Hi Alan, On Mon, Jan 07, 2019 at 09:29:18AM +1030, Alan Modra wrote: > The direct cause of this PR is the fact that tls_gdld_nomark didn't > handle indirect calls. Adding the missing support revealed that most > indirect calls were being optimised back to direct calls anyway, due > to tls_gdld_nomark not checking any of the parallel elements except > the first (plus the extra element that distinguishes this call from > normal calls). Just checking the number of elements is enough to > separate the indirect calls from direct for ABI_ELFv2 and ABI_AIX, > while checking for the LONG_CALL bit in the cookie works for ABI_V4. > Direct calls being substituted for indirect calls is not the only > unwanted substitution. See the tls_nomark_call comment. I also saw a > _GLOBAL_OFFSET_TABLE_ symbol_ref being substituted for the GOT reg, > hence the unspec_tls change. > Bootstrap and regression testing on powerpc64le-linux and > powerpc64-linux in progress. Note that the patch requires > https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00252.html or the > earlier version for the attribute support. (Did you commit that yet?) > +;; Verify that elements of the tls_gdld_nomark call insn parallel past the > +;; second element (added to distinguish this call from normal calls) match > +;; the normal contours of a call insn. This is necessary to prevent > +;; substitutions we don't want, for example, an indirect call being > +;; optimised to a direct call, or (set (reg:r2) (unspec [] UNSPEC_TOCSLOT)) > +;; being cleverly optimised to (set (reg:r2) (reg:r2)) because gcc > +;; "knows" that r2 hasn't changed from a previous call. 
> +(define_predicate "tls_nomark_call" > + (match_code "parallel") > +{ > + int n = XVECLEN (op, 0); > + rtvec v = XVEC (op, 0); > + rtx set = RTVEC_ELT (v, 0); > + if (GET_CODE (set) != SET) > +return 0; > + rtx call = XEXP (set, 1); > + if (GET_CODE (call) != CALL) > +return 0; > + rtx mem = XEXP (call, 0); > + if (GET_CODE (mem) != MEM) > +return 0; > + rtx addr = XEXP (mem, 0); > + if (GET_CODE (addr) == SYMBOL_REF) > +{ > + if (DEFAULT_ABI == ABI_ELFv2 || DEFAULT_ABI == ABI_AIX) > + return (n == 3 && GET_CODE (RTVEC_ELT (v, 2)) == CLOBBER > + && REG_P (XEXP (RTVEC_ELT (v, 2), 0)) > + && REGNO (XEXP (RTVEC_ELT (v, 2), 0)) == LR_REGNO); > + else if (DEFAULT_ABI == ABI_V4) > + return (n >= 4 && n <= 5 && GET_CODE (RTVEC_ELT (v, 2)) == USE > + && CONST_INT_P (XEXP (RTVEC_ELT (v, 2), 0)) > + && (INTVAL (XEXP (RTVEC_ELT (v, 2), 0)) & CALL_LONG) == 0 > + && (n == 4 > + || (GET_CODE (RTVEC_ELT (v, 3)) == USE > + && REG_P (XEXP (RTVEC_ELT (v, 3), 0 > + && GET_CODE (RTVEC_ELT (v, n - 1)) == CLOBBER > + && REG_P (XEXP (RTVEC_ELT (v, n - 1), 0)) > + && REGNO (XEXP (RTVEC_ELT (v, n - 1), 0)) == LR_REGNO); > + else > + gcc_unreachable (); > +} > + else if (indirect_call_operand (addr, mode)) > +{ > + if (DEFAULT_ABI == ABI_ELFv2) > + return (n == 4 && GET_CODE (RTVEC_ELT (v, 2)) == SET > + && REG_P (XEXP (RTVEC_ELT (v, 2), 0)) > + && REGNO (XEXP (RTVEC_ELT (v, 2), 0)) == TOC_REGNUM > + && GET_CODE (XEXP (RTVEC_ELT (v, 2), 1)) == UNSPEC > + && XINT (XEXP (RTVEC_ELT (v, 2), 1), 1) == UNSPEC_TOCSLOT > + && XVECLEN (XEXP (RTVEC_ELT (v, 2), 1), 0) == 1 > + && CONST_INT_P (XVECEXP (XEXP (RTVEC_ELT (v, 2), 1), 0, 0)) > + && GET_CODE (RTVEC_ELT (v, 3)) == CLOBBER > + && REG_P (XEXP (RTVEC_ELT (v, 3), 0)) > + && REGNO (XEXP (RTVEC_ELT (v, 3), 0)) == LR_REGNO); > + else if (DEFAULT_ABI == ABI_AIX) > + return (n == 5 && GET_CODE (RTVEC_ELT (v, 2)) == USE > + && GET_CODE (XEXP (RTVEC_ELT (v, 2), 0)) == MEM > + && GET_CODE (RTVEC_ELT (v, 3)) == SET > + && REG_P (XEXP (RTVEC_ELT (v, 3), 
0)) > + && REGNO (XEXP (RTVEC_ELT (v, 3), 0)) == TOC_REGNUM > + && GET_CODE (XEXP (RTVEC_ELT (v, 3), 1)) == UNSPEC > + && XINT (XEXP (RTVEC_ELT (v, 3), 1), 1) == UNSPEC_TOCSLOT > + && XVECLEN (XEXP (RTVEC_ELT (v, 3), 1), 0) == 1 > + && CONST_INT_P (XVECEXP (XEXP (RTVEC_ELT (v, 3), 1), 0, 0)) > + && GET_CODE (RTVEC_ELT (v, 4)) == CLOBBER > + && REG_P (XEXP (RTVEC_ELT (v, 4), 0)) > + && REGNO (XEXP (RTVEC_ELT (v, 4), 0)) == LR_REGNO); > + else if (DEFAULT_ABI == ABI_V4) > + return (n == 4 && GET_CODE (RTVEC_ELT (v, 2)) == USE > + && CONST_INT_P (XEXP (RTVEC_ELT (v, 2), 0)) > + && GET_CODE (RTVEC_ELT (v, 3)) == CLOBBER > + && REG_P (XEXP (RTVEC_ELT (v, 3), 0)) > + && REGNO (XEXP (RTVEC_ELT (v, 3), 0)) == LR_REGNO); > + else > + gcc_unreachable (); > +} > + else > +return 0; > +}) I find things like this almost
ChangeLog formatting nits
Hi!

On Thu, Jan 17, 2019 at 07:30:34AM +0100, Thomas Koenig wrote:
> 2019-01-17 Thomas Koenig
>
> PR fortran/88871
> * resolve.c (resolve_ref): Fix logic for removal of
> reference.

Just a few ChangeLog formatting nits of what I've seen recently from multiple people, there should be exactly one space after ):, not two, and there should be no space between ) and :, or, if there is just filename, between filename and :, so
* filename.whatever : New test.
is not correct and it should be
* filename.whatever: New test.
Similarly:
* filename.whatever : External declarations for foo and blah.
etc. should be
* filename.whatever (foo, blah): Declare.
or similar. Thanks.

Jakub
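The two rules above are easy to check mechanically. What follows is only a rough sketch, assuming a hypothetical helper named changelog_colon_ok that is not part of any GCC script: it looks at the first colon outside parentheses (so that C++ names such as recording::context are skipped) and verifies there is no space before it and exactly one space after it. It is a heuristic and does not try to cover the whole ChangeLog format.

```c
#include <stddef.h>

/* Hypothetical checker, not an existing GCC tool.
   Returns 1 if the first ':' outside parentheses in LINE has no space
   before it and is followed by exactly one space (or ends the line);
   returns 0 otherwise.  Lines without such a colon are accepted.  */
static int changelog_colon_ok(const char *line)
{
  int depth = 0;  /* parenthesis nesting depth */
  const char *p;

  for (p = line; *p != '\0'; p++) {
    if (*p == '(') {
      depth++;
    } else if (*p == ')' && depth > 0) {
      depth--;
    } else if (*p == ':' && depth == 0) {
      /* No space between the filename or ')' and ':'.  */
      if (p > line && p[-1] == ' ')
        return 0;
      /* A colon ending the line is fine (text continues on the next).  */
      if (p[1] == '\0')
        return 1;
      /* Exactly one space after the colon, not zero and not two.  */
      if (p[1] != ' ')
        return 0;
      return p[2] != ' ';
    }
  }
  return 1;
}
```

Only the first top-level colon is examined, so colons appearing later in the descriptive text are ignored.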
[PATCH] PR libstdc++/88782 avoid ODR problems in std::make_shared
The old version of _Sp_counted_ptr_inplace::_M_get_deleter (up to GCC 8.2.0) expects to be passed a real std::typeinfo object, so mixing that with the new definition of the __shared_ptr constructor (which always passes the fake tag) leads to accessing the fake object as a real std::typeinfo. Instead of trying to make it safe to mix the old and new definitions, just stop using that function. By passing a reference to __shared_ptr::_M_ptr to the __shared_count constructor it can be set directly, without needing to obtain the pointer via the _M_get_deleter back-channel. This avoids a virtual dispatch (which fixes PR 87514). This means that code built against new libstdc++ headers doesn't use _M_get_deleter at all, and so make_shared works the same whether RTTI is enabled or not. Also change _M_get_deleter so that it checks for a real type_info object even when RTTI is disabled, by calling a library function. Unless libstdc++ itself is built without RTTI that library function will be able to test if it's the right type_info. This means the new definition of _M_get_deleter can handle both the fake type_info tag and a real type_info object, even if built without RTTI. If linking to objects built against older versions of libstdc++ then if all objects use -frtti or all use -fno-rtti, then the caller of _M_get_deleter and the definition of _M_get_deleter will be consistent and it will work. If mixing -frtti with -fno-rtti it can still fail if the linker picks an old definition of _M_get_deleter and an old __shared_ptr constructor that are incompatible. In that some or all objects might need to be recompiled. PR libstdc++/87514 PR libstdc++/87520 PR libstdc++/88782 * config/abi/pre/gnu.ver (GLIBCXX_3.4.26): Export new symbol. * include/bits/shared_ptr.h (shared_ptr(_Sp_make_shared_tag, const Alloc&, Args&&...)) (allocate_shared): Change to use new tag type. * include/bits/shared_ptr_base.h (_Sp_make_shared_tag::_S_eq): Declare new member function. 
(_Sp_alloc_shared_tag): Define new type. (_Sp_counted_ptr_inplace): Declare __shared_count<_Lp> as a friend. (_Sp_counted_ptr_inplace::_M_get_deleter) [!__cpp_rtti]: Use _Sp_make_shared_tag::_S_eq to check type_info. (__shared_count(Ptr, Deleter),__shared_count(Ptr, Deleter, Alloc)): Constrain to prevent being called with _Sp_alloc_shared_tag. (__shared_count(_Sp_make_shared_tag, const _Alloc&, Args&&...)): Replace constructor with ... (__shared_count(Tp*&, _Sp_alloc_shared_tag<_Alloc>, Args&&...)): Use reference parameter so address of the new object can be returned to the caller. Obtain the allocator from the tag type. (__shared_ptr(_Sp_make_shared_tag, const Alloc&, Args&&...)): Replace constructor with ... (__shared_ptr(_Sp_alloc_shared_tag, Args&&...)): Pass _M_ptr to the __shared_count constructor. (__allocate_shared): Change to use new tag type. * src/c++11/shared_ptr.cc (_Sp_make_shared_tag::_S_eq): Define. Tested powerpc64le-linux, committed to trunk. I'll backport this to gcc-8-branch without the new symbol in the library. commit f4334034ec92d0a6cdf6dc3244a25108f32fc89a Author: Jonathan Wakely Date: Thu Jan 17 20:46:09 2019 + PR libstdc++/88782 avoid ODR problems in std::make_shared The old version of _Sp_counted_ptr_inplace::_M_get_deleter (up to GCC 8.2.0) expects to be passed a real std::typeinfo object, so mixing that with the new definition of the __shared_ptr constructor (which always passes the fake tag) leads to accessing the fake object as a real std::typeinfo. Instead of trying to make it safe to mix the old and new definitions, just stop using that function. By passing a reference to __shared_ptr::_M_ptr to the __shared_count constructor it can be set directly, without needing to obtain the pointer via the _M_get_deleter back-channel. This avoids a virtual dispatch (which fixes PR 87514). This means that code built against new libstdc++ headers doesn't use _M_get_deleter at all, and so make_shared works the same whether RTTI is enabled or not. 
Also change _M_get_deleter so that it checks for a real type_info object even when RTTI is disabled, by calling a library function. Unless libstdc++ itself is built without RTTI that library function will be able to test if it's the right type_info. This means the new definition of _M_get_deleter can handle both the fake type_info tag and a real type_info object, even if built without RTTI. If linking to objects built against older versions of libstdc++ then if all objects use -frtti or all use -fno-rtti, then the caller of _M_get_deleter and the definition of _M_get_deleter will be consistent and it will work. If mixing -frtti with -fno-rtti it can still fail if the linker
[PATCH] Fix leak in splay-tree
Philippe Waroquiers noticed a memory leak in gdb, which he tracked down to a bug in splay-tree. splay_tree_remove does not call the `delete_key' function when it removes the old node; but it should.

I looked at every splay tree in GCC and there is only one that passes a non-NULL delete function -- the one in lto.c. That file does not call splay_tree_remove. So, I think this is safe to check in. I re-ran the LTO tests to double check.

libiberty/
    * splay-tree.c (splay_tree_remove): Delete the key if necessary.
---
 libiberty/ChangeLog    | 4 ++++
 libiberty/splay-tree.c | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index bcc0227bdd8..1eb25f928f2 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,7 @@
+2019-01-18 Tom Tromey
+
+    * splay-tree.c (splay_tree_remove): Delete the key if necessary.
+
 2019-01-14 Tom Honermann

     * cp-demangle.c (cplus_demangle_builtin_types)
diff --git a/libiberty/splay-tree.c b/libiberty/splay-tree.c
index 920e68db2cb..21d23c38dfc 100644
--- a/libiberty/splay-tree.c
+++ b/libiberty/splay-tree.c
@@ -425,6 +425,8 @@ splay_tree_remove (splay_tree sp, splay_tree_key key)
       right = sp->root->right;

       /* Delete the root node itself. */
+      if (sp->delete_key)
+        (*sp->delete_key) (sp->root->key);
       if (sp->delete_value)
         (*sp->delete_value) (sp->root->value);
       (*sp->deallocate) (sp->root, sp->allocate_data);
--
2.17.2
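The ownership rule behind this fix can be shown with a much simpler container. The sketch below is not libiberty's splay tree; it is a hypothetical singly linked table (struct table, table_insert, table_remove and the helpers are all made-up names) that, like splay_tree, takes optional delete callbacks for keys and values. The point is that removal has to run both callbacks before releasing the node, otherwise an owned key leaks.

```c
#include <stdlib.h>
#include <string.h>

typedef void (*delete_fn)(void *);

/* Hypothetical node and table types; not libiberty's splay_tree.  */
struct node {
  void *key;
  void *value;
  struct node *next;
};

struct table {
  struct node *head;
  int (*compare)(const void *, const void *);
  delete_fn delete_key;    /* may be NULL if keys are not owned */
  delete_fn delete_value;  /* may be NULL if values are not owned */
};

/* Prepend a KEY/VALUE pair.  Returns 1 on success, 0 on allocation failure.  */
static int table_insert(struct table *t, void *key, void *value)
{
  struct node *n = malloc(sizeof *n);
  if (n == NULL)
    return 0;
  n->key = key;
  n->value = value;
  n->next = t->head;
  t->head = n;
  return 1;
}

/* Remove the first entry matching KEY.  Both the key and the value are
   handed to their delete callbacks (when set) before the node is freed.
   Returns 1 if an entry was removed, 0 if KEY was not found.  */
static int table_remove(struct table *t, const void *key)
{
  struct node **link = &t->head;
  while (*link != NULL) {
    struct node *n = *link;
    if (t->compare(n->key, key) == 0) {
      *link = n->next;
      if (t->delete_key)
        t->delete_key(n->key);      /* the step splay_tree_remove lacked */
      if (t->delete_value)
        t->delete_value(n->value);
      free(n);
      return 1;
    }
    link = &n->next;
  }
  return 0;
}

/* Helpers for trying it out with heap-allocated strings.  */
static int compare_strings(const void *a, const void *b)
{
  return strcmp((const char *) a, (const char *) b);
}

static char *dup_string(const char *s)
{
  size_t len = strlen(s) + 1;
  char *copy = malloc(len);
  if (copy != NULL)
    memcpy(copy, s, len);
  return copy;
}

static int freed_count;  /* counts callback invocations */

static void counting_free(void *p)
{
  free(p);
  freed_count++;
}
```

A table set up with counting_free for both callbacks should see freed_count go up by two per removed entry; with the pre-patch behaviour it would only go up by one.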
[C++ PATCH] PR c++/88875 - error with explicit list constructor.
In my patch for CWG issue 2267, I changed reference_binding to clear CONSTRUCTOR_IS_DIRECT_INIT on the argument init-list. But that breaks if there's another candidate for which CONSTRUCTOR_IS_DIRECT_INIT is correct. So instead, let's encode in the conversion that we want to override the flag. Tested x86_64-pc-linux-gnu, applying to trunk. * call.c (reference_binding): Don't modify EXPR. Set need_temporary_p on the ck_user conversion for a temporary. (convert_like_real): Check it. --- gcc/cp/call.c | 11 +++--- .../g++.dg/cpp0x/initlist-explicit2.C | 20 +++ gcc/cp/ChangeLog | 7 +++ 3 files changed, 35 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-explicit2.C diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 499894b353f..16c3706cc5c 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -94,7 +94,7 @@ struct conversion { BOOL_BITFIELD bad_p : 1; /* If KIND is ck_ref_bind ck_base_conv, true to indicate that a temporary should be created to hold the result of the - conversion. If KIND is ck_ambig, true if the context is + conversion. If KIND is ck_ambig or ck_user, true means force copy-initialization. */ BOOL_BITFIELD need_temporary_p : 1; /* If KIND is ck_ptr or ck_pmem, true to indicate that a conversion @@ -1560,6 +1560,7 @@ reference_binding (tree rto, tree rfrom, tree expr, bool c_cast_p, int flags, from = TREE_TYPE (expr); } + bool copy_list_init = false; if (expr && BRACE_ENCLOSED_INITIALIZER_P (expr)) { maybe_warn_cpp0x (CPP0X_INITIALIZER_LISTS); @@ -1582,7 +1583,7 @@ reference_binding (tree rto, tree rfrom, tree expr, bool c_cast_p, int flags, /* Otherwise, if T is a reference type, a prvalue temporary of the type referenced by T is copy-list-initialized, and the reference is bound to that temporary. 
*/ - CONSTRUCTOR_IS_DIRECT_INIT (expr) = false; + copy_list_init = true; skip:; } @@ -1770,6 +1771,10 @@ reference_binding (tree rto, tree rfrom, tree expr, bool c_cast_p, int flags, if (conv->user_conv_p) { + if (copy_list_init) + /* Remember this was copy-list-initialization. */ + conv->need_temporary_p = true; + /* If initializing the temporary used a conversion function, recalculate the second conversion sequence. */ for (conversion *t = conv; t; t = next_conversion (t)) @@ -6941,7 +6946,7 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, if (DECL_NONCONVERTING_P (convfn) && DECL_CONSTRUCTOR_P (convfn) && BRACE_ENCLOSED_INITIALIZER_P (expr) /* Unless this is for direct-list-initialization. */ - && !CONSTRUCTOR_IS_DIRECT_INIT (expr) + && (!CONSTRUCTOR_IS_DIRECT_INIT (expr) || convs->need_temporary_p) /* And in C++98 a default constructor can't be explicit. */ && cxx_dialect >= cxx11) { diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-explicit2.C b/gcc/testsuite/g++.dg/cpp0x/initlist-explicit2.C new file mode 100644 index 000..26a63bf2aa7 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/initlist-explicit2.C @@ -0,0 +1,20 @@ +// PR c++/88875 +// { dg-do compile { target c++11 } } + +#include + +struct X { + X(); + explicit X(const std::initializer_list& init); +}; + +struct Y +{ + X x { 1, 2 }; // error + + Y (int) +: x {1, 2} // ok + { + } + +}; diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog index d224b72c0bb..4292930daf3 100644 --- a/gcc/cp/ChangeLog +++ b/gcc/cp/ChangeLog @@ -1,3 +1,10 @@ +2019-01-18 Jason Merrill + + PR c++/88875 - error with explicit list constructor. + * call.c (reference_binding): Don't modify EXPR. Set + need_temporary_p on the ck_user conversion for a temporary. + (convert_like_real): Check it. + 2019-01-18 H.J. Lu PR c/51628 base-commit: 31975c5ea11cee1a66a59e6d941db4c6b3cc602c -- 2.20.1
Re: [PATCH,Fortran][RFC] PR 87939, 87326 - STAT= and ERRMSG= specifiers in several image control statements; NEW_INDEX= specifier in FORM TEAM statement
Hi Steve, URLs: done Copyright assignment: in progress. Thanks for the heads up regarding the wait. -- Nathan On Fri, Jan 18, 2019 at 1:27 PM Steve Kargl wrote: > > Nathan, > > Can you add URLs in the bug reports to your patch so > that it doesn't get lost? The copyright assignment > can take longer than one might think. > > -- > steve > > On Fri, Jan 18, 2019 at 01:17:03PM -0600, Nathan Weeks wrote: > > I made a mistake in the ChangeLogs: libgfortran.h is in gcc/fortran, > > and libcaf.h is in libgfortran/caf. Also, the additional enumerations > > in those headers don't go all the way in adding support for > > STAT_UNLOCKED_FAILED_IMAGE to ISO_FORTRAN_ENV itself (it looks like > > that would be at least a minor modification to > > gcc/fortran/iso-fortran-env.def, and documentation update in > > gcc/fortran/intrinsic.texi, and perhaps a test). I've updated the > > ChangeLogs to clarify all of this. > > > > -- > > Nathan > > > > frontend: > > > > 2019-01-16 Nathan Weeks > > > > PR fortran/87939 > > PR fortran/87326 > > * gfortran.h: Add an additional gfc_expr member to struct gfc_code. > > * libgfortran.h: Add GFC_STAT_UNLOCKED_FAILED_IMAGE. > > * match.c (gfc_match_critical): Add STAT= and ERRMSG=. > > (gfc_match_change_team): Likewise. > > (gfc_match_end_team): Likewise. > > (gfc_match_sync_team): Likewise. > > (gfc_match_form_team): Add STAT=, ERRMSG=, and NEW_INDEX=. > > * resolve.c (resolve_form_team): New. Type check team-variable > > argument in > > addition to new STAT= and ERRMSG= arguments. > > (resolve_change_sync_team): New. Adds type checking for team-value > > argument. > > (resolve_end_team): New. > > (resolve_critical): Add STAT= and ERRMSG=. > > * trans-decl.c (gfc_build_builtin_function_decls): Additional stat, > > errmsg, and errmsg_len arguments to _gfortran_caf_form_team(), > > _gfortran_caf_change_team(), _gfortran_caf_end_team(), and > > _gfortran_caf_sync_team(), and additional new_index argument to > > _gfortran_caf_form_team(). 
> > * trans-stmt.c (gfc_trans_form_team): Support STAT=, ERRMSG=, and > > NEW_INDEX=. > > (gfc_trans_change_team): Support STAT= and ERRMSG=. > > (gfc_trans_end_team): Likewise. > > (gfc_trans_sync_team): Likewise. > > (gfc_trans_critical): Likewise. Also support assigning > > STAT_FAILED_IMAGE > > to a stat-variable. > > > > libgfortran: > > > > 2019-01-16 Nathan Weeks > > > > PR fortran/87939 > > * caf/libcaf.h: Add CAF_STAT_FAILED_IMAGE. > > > > testsuite: > > > > 2019-01-16 Nathan Weeks > > > > PR fortran/87939 > > PR fortran/87326 > > * gfortran.dg/coarray_critical_2.f90: New test > > * gfortran.dg/coarray_critical_3.f90: New test > > * gfortran.dg/coarray_critical_4.f90: New test > > * gfortran.dg/team_change_2.f90: New test > > * gfortran.dg/team_change_3.f90: New test > > * gfortran.dg/team_end_2.f90: New test > > * gfortran.dg/team_end_3.f90: New test > > * gfortran.dg/team_form_2.f90: New test > > * gfortran.dg/team_form_3.f90: New test > > * gfortran.dg/team_sync_1.f90: New test > > * gfortran.dg/team_sync_2.f90: New test > > > > -- > > Nathan > > > > On Wed, Jan 16, 2019 at 6:16 PM Nathan Weeks wrote: > > > > > > Hi all, > > > > > > To facilitate more complete Fortran 2018 failed images support, I'm > > > particularly interested in interested in seeing PR 87939 eventually > > > resolved (i.e., allow STAT= and ERRMSG= specifiers in FORM TEAM, > > > CHANGE TEAM, SYNC TEAM, END TEAM, and CRITICAL statements). To get the > > > ball rolling (I realize that the boat has been missed for this kind of > > > change in GCC 9 trunk), I've attempted the following patch (which, > > > since it was convenient to do while modifying FORM TEAM-related code, > > > also adds the NEW_INDEX= specifier to the FORM TEAM statement as > > > desired in PR 87326). > > > > > > This is the first gfortran patch I've attempted, and I certainly could > > > have made some noob mistakes, so verbose feedback would be > > > appreciated. 
> > > > > > A few comments: > > > > > > * In resolve.c, the newly-added functions that type check STAT= and > > > ERRMSG= arguments for FORM TEAM, CHANGE TEAM, and SYNC TEAM also add > > > (previously-absent) type checking for their TEAM_TYPE arguments. If > > > it's more appropriate, I could separate this change into its own PR. > > > > > > * The existing -fcoarray=lib implementation of CRITICAL acquires a > > > LOCK on a lock variable on image 1 (in the current team). However, a > > > CRITICAL statement stat-value of STAT_FAILED_IMAGE (i.e., the image > > > that enter the CRITICAL construct failed) is analogous to the LOCK > > > stat-value of STAT_UNLOCKED_FAILED_IMAGE (i.e., the image that > > >
Re: [PATCH,Fortran][RFC] PR 87939, 87326 - STAT= and ERRMSG= specifiers in several image control statements; NEW_INDEX= specifier in FORM TEAM statement
Nathan, Can you add URLs in the bug reports to your patch so that it doesn't get lost? The copyright assignment can take longer than one might think. -- steve On Fri, Jan 18, 2019 at 01:17:03PM -0600, Nathan Weeks wrote: > I made a mistake in the ChangeLogs: libgfortran.h is in gcc/fortran, > and libcaf.h is in libgfortran/caf. Also, the additional enumerations > in those headers don't go all the way in adding support for > STAT_UNLOCKED_FAILED_IMAGE to ISO_FORTRAN_ENV itself (it looks like > that would be at least a minor modification to > gcc/fortran/iso-fortran-env.def, and documentation update in > gcc/fortran/intrinsic.texi, and perhaps a test). I've updated the > ChangeLogs to clarify all of this. > > -- > Nathan > > frontend: > > 2019-01-16 Nathan Weeks > > PR fortran/87939 > PR fortran/87326 > * gfortran.h: Add an additional gfc_expr member to struct gfc_code. > * libgfortran.h: Add GFC_STAT_UNLOCKED_FAILED_IMAGE. > * match.c (gfc_match_critical): Add STAT= and ERRMSG=. > (gfc_match_change_team): Likewise. > (gfc_match_end_team): Likewise. > (gfc_match_sync_team): Likewise. > (gfc_match_form_team): Add STAT=, ERRMSG=, and NEW_INDEX=. > * resolve.c (resolve_form_team): New. Type check team-variable > argument in > addition to new STAT= and ERRMSG= arguments. > (resolve_change_sync_team): New. Adds type checking for team-value > argument. > (resolve_end_team): New. > (resolve_critical): Add STAT= and ERRMSG=. > * trans-decl.c (gfc_build_builtin_function_decls): Additional stat, > errmsg, and errmsg_len arguments to _gfortran_caf_form_team(), > _gfortran_caf_change_team(), _gfortran_caf_end_team(), and > _gfortran_caf_sync_team(), and additional new_index argument to > _gfortran_caf_form_team(). > * trans-stmt.c (gfc_trans_form_team): Support STAT=, ERRMSG=, and > NEW_INDEX=. > (gfc_trans_change_team): Support STAT= and ERRMSG=. > (gfc_trans_end_team): Likewise. > (gfc_trans_sync_team): Likewise. > (gfc_trans_critical): Likewise. 
Also support assigning > STAT_FAILED_IMAGE > to a stat-variable. > > libgfortran: > > 2019-01-16 Nathan Weeks > > PR fortran/87939 > * caf/libcaf.h: Add CAF_STAT_FAILED_IMAGE. > > testsuite: > > 2019-01-16 Nathan Weeks > > PR fortran/87939 > PR fortran/87326 > * gfortran.dg/coarray_critical_2.f90: New test > * gfortran.dg/coarray_critical_3.f90: New test > * gfortran.dg/coarray_critical_4.f90: New test > * gfortran.dg/team_change_2.f90: New test > * gfortran.dg/team_change_3.f90: New test > * gfortran.dg/team_end_2.f90: New test > * gfortran.dg/team_end_3.f90: New test > * gfortran.dg/team_form_2.f90: New test > * gfortran.dg/team_form_3.f90: New test > * gfortran.dg/team_sync_1.f90: New test > * gfortran.dg/team_sync_2.f90: New test > > -- > Nathan > > On Wed, Jan 16, 2019 at 6:16 PM Nathan Weeks wrote: > > > > Hi all, > > > > To facilitate more complete Fortran 2018 failed images support, I'm > > particularly interested in interested in seeing PR 87939 eventually > > resolved (i.e., allow STAT= and ERRMSG= specifiers in FORM TEAM, > > CHANGE TEAM, SYNC TEAM, END TEAM, and CRITICAL statements). To get the > > ball rolling (I realize that the boat has been missed for this kind of > > change in GCC 9 trunk), I've attempted the following patch (which, > > since it was convenient to do while modifying FORM TEAM-related code, > > also adds the NEW_INDEX= specifier to the FORM TEAM statement as > > desired in PR 87326). > > > > This is the first gfortran patch I've attempted, and I certainly could > > have made some noob mistakes, so verbose feedback would be > > appreciated. > > > > A few comments: > > > > * In resolve.c, the newly-added functions that type check STAT= and > > ERRMSG= arguments for FORM TEAM, CHANGE TEAM, and SYNC TEAM also add > > (previously-absent) type checking for their TEAM_TYPE arguments. If > > it's more appropriate, I could separate this change into its own PR. 
> > > > * The existing -fcoarray=lib implementation of CRITICAL acquires a > > LOCK on a lock variable on image 1 (in the current team). However, a > > CRITICAL statement stat-value of STAT_FAILED_IMAGE (i.e., the image > > that enter the CRITICAL construct failed) is analogous to the LOCK > > stat-value of STAT_UNLOCKED_FAILED_IMAGE (i.e., the image that > > acquired the lock failed---see section 11.6.11 (7 & 10) in Fortran > > 2018 draft N2146), whereas a LOCK STAT_FAILED_IMAGE means the image on > > which the lock variable resides has failed (no analog in the CRITICAL > > statement, which is oblivious to this underlying implementation). So > > in addition to adding the stat value STAT_UNLOCKED_FAILED_IMAGE to > > libgfortran.h &
[PATCH][gcc] libgccjit: introduce gcc_jit_context_add_driver_option
Hi all, this patch add gcc_jit_context_add_driver_option to the libgccjit ABI and a testcase for it. Using this interface is now possible to pass options affecting assembler and linker. Does not introduce any new regression running make check-jit. Bests Andrea gcc/jit/ChangeLog 2019-01-16 Andrea Corallo andrea.cora...@arm.com * docs/topics/compatibility.rst (LIBGCCJIT_ABI_11): New ABI tag. * docs/topics/contexts.rst (Additional driver options): New section. * jit-playback.c (invoke_driver): Add call to append_driver_options. * jit-recording.c: Within namespace gcc::jit... (recording::context::~context): Free the optnames within m_driver_options. (recording::context::add_driver_option): New method. (recording::context::append_driver_options): New method. (recording::context::dump_reproducer_to_file): Add driver options. * jit-recording.h: Within namespace gcc::jit... (recording::context::add_driver_option): New method. (recording::context::append_driver_options): New method. (recording::context::m_driver_options): New field. * libgccjit++.h (gccjit::context::add_driver_option): New method. * libgccjit.c (gcc_jit_context_add_driver_option): New API entrypoint. * libgccjit.h (gcc_jit_context_add_driver_option): New API entrypoint. (LIBGCCJIT_HAVE_gcc_jit_context_add_driver_option): New macro. * libgccjit.map (LIBGCCJIT_ABI_11): New ABI tag. gcc/testsuite/ChangeLog 2019-01-16 Andrea Corallo andrea.cora...@arm.com * jit.dg/add-driver-options-testlib.c: Add support file for test-add-driver-options.c testcase. * jit.dg/all-non-failing-tests.h: Add test-add-driver-options.c * jit.dg/jit.exp (jit-dg-test): Update to support add-driver-options-testlib.c compilation. * jit.dg/test-add-driver-options.c: New testcase. 
diff --git a/gcc/jit/docs/topics/compatibility.rst b/gcc/jit/docs/topics/compatibility.rst index 38d338b..abefa56 100644 --- a/gcc/jit/docs/topics/compatibility.rst +++ b/gcc/jit/docs/topics/compatibility.rst @@ -168,6 +168,12 @@ entrypoints: ``LIBGCCJIT_ABI_10`` - ``LIBGCCJIT_ABI_10`` covers the addition of :func:`gcc_jit_context_new_rvalue_from_vector` + +.. _LIBGCCJIT_ABI_11: + +``LIBGCCJIT_ABI_11`` + +``LIBGCCJIT_ABI_11`` covers the addition of +:func:`gcc_jit_context_add_driver_option` diff --git a/gcc/jit/docs/topics/contexts.rst b/gcc/jit/docs/topics/contexts.rst index 95964ca..2f8aeb7 100644 --- a/gcc/jit/docs/topics/contexts.rst +++ b/gcc/jit/docs/topics/contexts.rst @@ -546,3 +546,36 @@ Additional command-line options .. code-block:: c #ifdef LIBGCCJIT_HAVE_gcc_jit_context_add_command_line_option + +.. function:: void gcc_jit_context_add_driver_option (gcc_jit_context *ctxt,\ + const char *optname) + + Add an arbitrary gcc driver option to the context, for use by + :func:`gcc_jit_context_compile` and + :func:`gcc_jit_context_compile_to_file`. + + The parameter ``optname`` must be non-NULL. The underlying buffer is + copied, so that it does not need to outlive the call. + + Extra options added by `gcc_jit_context_add_driver_option` are + applied *after* all other options potentially overriding them. + Options from parent contexts are inherited by child contexts; options + from the parent are applied *before* those from the child. + + For example: + + .. code-block:: c + + gcc_jit_context_add_driver_option (ctxt, "-lm"); + gcc_jit_context_add_driver_option (ctxt, "-fuse-linker-plugin"); + + Note that only some options are likely to be meaningful; there is no + "frontend" within libgccjit, so typically only those affecting + assembler and linker are likely to be useful. + + This entrypoint was added in :ref:`LIBGCCJIT_ABI_11`; you can test for + its presence using + + .. 
code-block:: c + + #ifdef LIBGCCJIT_HAVE_gcc_jit_context_add_driver_option diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c index 86f588d..b74495c 100644 --- a/gcc/jit/jit-playback.c +++ b/gcc/jit/jit-playback.c @@ -2459,6 +2459,10 @@ invoke_driver (const char *ctxt_progname, if (0) ADD_ARG ("-v"); + /* Add any user-provided driver extra options. */ + + m_recording_ctxt->append_driver_options (); + #undef ADD_ARG /* pex_one's error-handling requires pname to be non-NULL. */ diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h index b9c6544..b9f2250 100644 --- a/gcc/jit/jit-recording.h +++ b/gcc/jit/jit-recording.h @@ -218,6 +218,12 @@ public: append_command_line_options (vec *argvec); void + add_driver_option (const char *optname); + + void + append_driver_options (auto_string_vec *argvec); + + void enable_dump (const char *dumpname, char **out_ptr); @@ -317,6 +323,7 @@ private: bool m_bool_options[GCC_JIT_NUM_BOOL_OPTIONS]; bool m_inner_bool_options[NUM_INNER_BOOL_OPTIONS]; auto_vec m_command_line_options; + auto_vec m_driver_options; /* Dumpfiles that were requested via gcc_jit_context_enable_dump. */
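The documentation in the patch states two properties of driver options: the option string is copied, so the caller's buffer need not outlive the call, and options from a parent context are applied before those of the child. The following is only a toy model of those two properties in plain C. It does not use libgccjit, and struct ctx, ctx_add_driver_option, ctx_append_driver_options and ctx_release are hypothetical names chosen for illustration; the real implementation lives in jit-recording.c and may differ.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_OPTS 16

/* Toy context; a hypothetical model, not libgccjit's recording::context.  */
struct ctx {
  struct ctx *parent;      /* NULL for a top-level context */
  char *opts[MAX_OPTS];    /* owned copies of the option strings */
  size_t n_opts;
};

/* Store a private copy of OPTNAME.  Returns 1 on success, 0 on failure.  */
static int ctx_add_driver_option(struct ctx *c, const char *optname)
{
  size_t len;
  char *copy;

  if (optname == NULL || c->n_opts == MAX_OPTS)
    return 0;
  len = strlen(optname) + 1;
  copy = malloc(len);
  if (copy == NULL)
    return 0;
  memcpy(copy, optname, len);
  c->opts[c->n_opts++] = copy;
  return 1;
}

/* Append options to ARGV starting at index N, parent options first,
   never writing past CAP entries.  Returns the new element count.  */
static size_t ctx_append_driver_options(const struct ctx *c,
                                        const char **argv,
                                        size_t n, size_t cap)
{
  size_t i;

  if (c->parent != NULL)
    n = ctx_append_driver_options(c->parent, argv, n, cap);
  for (i = 0; i < c->n_opts && n < cap; i++)
    argv[n++] = c->opts[i];
  return n;
}

/* Free the copies owned by C (but not its parent's).  */
static void ctx_release(struct ctx *c)
{
  size_t i;

  for (i = 0; i < c->n_opts; i++)
    free(c->opts[i]);
  c->n_opts = 0;
}
```

In this model, adding "-lm" to a parent and "-fuse-linker-plugin" to its child yields the argument order "-lm", "-fuse-linker-plugin" when the child is compiled, and modifying the caller's buffer after the add has no effect on the stored option.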
Re: Claw back some of the code size regression in 548.exchange2_r
On January 18, 2019 5:37:46 PM GMT+01:00, Richard Sandiford wrote:
>This patch tries harder to detect cases in which the inner dimension
>of an array access is invariant, such as:
>
>     x(i, :) = 100
>
>It fixes some of the code size regression in 548.exchange2_r, with
>size improving by 5% compared to before the patch.  Of the two other
>SPEC 2017 tests affected by loop versioning, 554.roms_r improved by a
>trivial amount (0.3%) and 549.fotonik3d_r didn't change.  All three
>results are with -Ofast -flto.
>
>Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
>OK to install?

OK.

Richard.

>Richard
>
>
>2019-01-18  Richard Sandiford
>
>gcc/
>  * gimple-loop-versioning.cc (loop_versioning::dump_inner_likelihood):
>  New function, split out from...
>  (loop_versioning::analyze_stride): ...here.
>  (loop_versioning::find_per_loop_multiplication): Use gassign.
>  (loop_versioning::analyze_term_using_scevs): Return a success code.
>  (loop_versioning::analyze_arbitrary_term): New function.
>  (loop_versioning::analyze_address_fragment): Use
>  analyze_arbitrary_term if all else fails.
>
>gcc/testsuite/
>  * gfortran.dg/loop_versioning_1.f90: Bump the number of identified
>  inner strides.
>  * gfortran.dg/loop_versioning_9.f90: New test.
>  * gfortran.dg/loop_versioning_10.f90: Likewise.
>
>Index: gcc/gimple-loop-versioning.cc
>===================================================================
>--- gcc/gimple-loop-versioning.cc 2019-01-04 11:39:25.918257505 +
>+++ gcc/gimple-loop-versioning.cc 2019-01-18 16:36:13.172064883 +
>@@ -294,10 +294,12 @@ private:
>   bool acceptable_type_p (tree, unsigned HOST_WIDE_INT *);
>   bool multiply_term_by (address_term_info &, tree);
>   inner_likelihood get_inner_likelihood (tree, unsigned HOST_WIDE_INT);
>+  void dump_inner_likelihood (address_info &, address_term_info &);
>   void analyze_stride (address_info &, address_term_info &,
>                        tree, struct loop *);
>   bool find_per_loop_multiplication (address_info &, address_term_info &);
>-  void analyze_term_using_scevs (address_info &, address_term_info &);
>+  bool analyze_term_using_scevs (address_info &, address_term_info &);
>+  void analyze_arbitrary_term (address_info &, address_term_info &);
>   void analyze_address_fragment (address_info &);
>   void record_address_fragment (gimple *, unsigned HOST_WIDE_INT,
>                                 tree, unsigned HOST_WIDE_INT, HOST_WIDE_INT);
>@@ -803,6 +805,24 @@ loop_versioning::get_inner_likelihood (t
>   return unlikely_p ? INNER_UNLIKELY : INNER_DONT_KNOW;
> }
>
>+/* Dump the likelihood that TERM's stride is for the innermost dimension.
>+   ADDRESS is the address that contains TERM.  */
>+
>+void
>+loop_versioning::dump_inner_likelihood (address_info &address,
>+                                        address_term_info &term)
>+{
>+  if (term.inner_likelihood == INNER_LIKELY)
>+    dump_printf_loc (MSG_NOTE, address.stmt, "%T is likely to be the"
>+                     " innermost dimension\n", term.stride);
>+  else if (term.inner_likelihood == INNER_UNLIKELY)
>+    dump_printf_loc (MSG_NOTE, address.stmt, "%T is probably not the"
>+                     " innermost dimension\n", term.stride);
>+  else
>+    dump_printf_loc (MSG_NOTE, address.stmt, "cannot tell whether %T"
>+                     " is the innermost dimension\n", term.stride);
>+}
>+
> /* The caller has identified that STRIDE is the stride of interest
>    in TERM, and that the stride is applied in OP_LOOP.  Record this
>    information in TERM, deciding whether STRIDE is likely to be for
>@@ -818,17 +838,7 @@ loop_versioning::analyze_stride (address
>
>   term.inner_likelihood = get_inner_likelihood (stride, term.multiplier);
>   if (dump_enabled_p ())
>-    {
>-      if (term.inner_likelihood == INNER_LIKELY)
>-        dump_printf_loc (MSG_NOTE, address.stmt, "%T is likely to be the"
>-                         " innermost dimension\n", stride);
>-      else if (term.inner_likelihood == INNER_UNLIKELY)
>-        dump_printf_loc (MSG_NOTE, address.stmt, "%T is probably not the"
>-                         " innermost dimension\n", stride);
>-      else
>-        dump_printf_loc (MSG_NOTE, address.stmt, "cannot tell whether %T"
>-                         " is the innermost dimension\n", stride);
>-    }
>+    dump_inner_likelihood (address, term);
>
>   /* To be a versioning opportunity we require:
>
>@@ -879,7 +889,7 @@ bool
> loop_versioning::find_per_loop_multiplication (address_info &address,
>                                                address_term_info &term)
> {
>-  gimple *mult = maybe_get_assign (term.expr);
>+  gassign *mult = maybe_get_assign (term.expr);
>   if (!mult || gimple_assign_rhs_code (mult) != MULT_EXPR)
>     return false;
>
>@@ -909,7 +919,7 @@ loop_versioning::find_per_loop_multiplic
> }
>
> /* Try to use scalar evolutions to find an address stride for TERM,
>-   which belongs to ADDRESS.
>+   which
Re: [PATCH,Fortran][RFC] PR 87939, 87326 - STAT= and ERRMSG= specifiers in several image control statements; NEW_INDEX= specifier in FORM TEAM statement
I made a mistake in the ChangeLogs: libgfortran.h is in gcc/fortran, and
libcaf.h is in libgfortran/caf.

Also, the additional enumerations in those headers don't go all the way
in adding support for STAT_UNLOCKED_FAILED_IMAGE to ISO_FORTRAN_ENV
itself (it looks like that would be at least a minor modification to
gcc/fortran/iso-fortran-env.def, and documentation update in
gcc/fortran/intrinsic.texi, and perhaps a test).

I've updated the ChangeLogs to clarify all of this.

--
Nathan

frontend:

2019-01-16  Nathan Weeks

  PR fortran/87939
  PR fortran/87326
  * gfortran.h: Add an additional gfc_expr member to struct gfc_code.
  * libgfortran.h: Add GFC_STAT_UNLOCKED_FAILED_IMAGE.
  * match.c (gfc_match_critical): Add STAT= and ERRMSG=.
  (gfc_match_change_team): Likewise.
  (gfc_match_end_team): Likewise.
  (gfc_match_sync_team): Likewise.
  (gfc_match_form_team): Add STAT=, ERRMSG=, and NEW_INDEX=.
  * resolve.c (resolve_form_team): New.  Type check team-variable
  argument in addition to new STAT= and ERRMSG= arguments.
  (resolve_change_sync_team): New.  Adds type checking for team-value
  argument.
  (resolve_end_team): New.
  (resolve_critical): Add STAT= and ERRMSG=.
  * trans-decl.c (gfc_build_builtin_function_decls): Additional stat,
  errmsg, and errmsg_len arguments to _gfortran_caf_form_team(),
  _gfortran_caf_change_team(), _gfortran_caf_end_team(), and
  _gfortran_caf_sync_team(), and additional new_index argument to
  _gfortran_caf_form_team().
  * trans-stmt.c (gfc_trans_form_team): Support STAT=, ERRMSG=, and
  NEW_INDEX=.
  (gfc_trans_change_team): Support STAT= and ERRMSG=.
  (gfc_trans_end_team): Likewise.
  (gfc_trans_sync_team): Likewise.
  (gfc_trans_critical): Likewise.  Also support assigning
  STAT_FAILED_IMAGE to a stat-variable.

libgfortran:

2019-01-16  Nathan Weeks

  PR fortran/87939
  * caf/libcaf.h: Add CAF_STAT_FAILED_IMAGE.

testsuite:

2019-01-16  Nathan Weeks

  PR fortran/87939
  PR fortran/87326
  * gfortran.dg/coarray_critical_2.f90: New test
  * gfortran.dg/coarray_critical_3.f90: New test
  * gfortran.dg/coarray_critical_4.f90: New test
  * gfortran.dg/team_change_2.f90: New test
  * gfortran.dg/team_change_3.f90: New test
  * gfortran.dg/team_end_2.f90: New test
  * gfortran.dg/team_end_3.f90: New test
  * gfortran.dg/team_form_2.f90: New test
  * gfortran.dg/team_form_3.f90: New test
  * gfortran.dg/team_sync_1.f90: New test
  * gfortran.dg/team_sync_2.f90: New test

--
Nathan

On Wed, Jan 16, 2019 at 6:16 PM Nathan Weeks wrote:
>
> Hi all,
>
> To facilitate more complete Fortran 2018 failed images support, I'm
> particularly interested in seeing PR 87939 eventually
> resolved (i.e., allow STAT= and ERRMSG= specifiers in FORM TEAM,
> CHANGE TEAM, SYNC TEAM, END TEAM, and CRITICAL statements). To get the
> ball rolling (I realize that the boat has been missed for this kind of
> change in GCC 9 trunk), I've attempted the following patch (which,
> since it was convenient to do while modifying FORM TEAM-related code,
> also adds the NEW_INDEX= specifier to the FORM TEAM statement as
> desired in PR 87326).
>
> This is the first gfortran patch I've attempted, and I certainly could
> have made some noob mistakes, so verbose feedback would be
> appreciated.
>
> A few comments:
>
> * In resolve.c, the newly-added functions that type check STAT= and
> ERRMSG= arguments for FORM TEAM, CHANGE TEAM, and SYNC TEAM also add
> (previously-absent) type checking for their TEAM_TYPE arguments. If
> it's more appropriate, I could separate this change into its own PR.
>
> * The existing -fcoarray=lib implementation of CRITICAL acquires a
> LOCK on a lock variable on image 1 (in the current team). However, a
> CRITICAL statement stat-value of STAT_FAILED_IMAGE (i.e., the image
> that entered the CRITICAL construct failed) is analogous to the LOCK
> stat-value of STAT_UNLOCKED_FAILED_IMAGE (i.e., the image that
> acquired the lock failed; see section 11.6.11 (7 & 10) in Fortran
> 2018 draft N2146), whereas a LOCK STAT_FAILED_IMAGE means the image on
> which the lock variable resides has failed (no analog in the CRITICAL
> statement, which is oblivious to this underlying implementation). So
> in addition to adding the stat value STAT_UNLOCKED_FAILED_IMAGE to
> libgfortran.h & libcaf.h, I had CRITICAL swap a LOCK
> STAT_UNLOCKED_FAILED_IMAGE for STAT_FAILED_IMAGE, and (perhaps
> unimaginatively) a LOCK STAT_FAILED_IMAGE for
> STAT_UNLOCKED_FAILED_IMAGE (which, while it has no defined meaning for
> a CRITICAL statement, fits the definition of a "processor-dependent
> value other than STAT_FAILED_IMAGE").
>
> * A couple negative tests for syntax errors (coarray_critical_2.f90 &
> team_end_2.f90) fail due to spurious "Error:
libgo patch committed: Update to Go1.12beta2
I have committed a patch to update libgo to the Go 1.12beta2 release.
As usual this sort of update is too large to include all changes in
this e-mail.  I've included changes to gccgo-specific files below.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian

gotools/

2019-01-18  Ian Lance Taylor

  * Makefile.am (go_cmd_vet_files): Update for Go1.12beta2 release.
  (GOTOOLS_TEST_TIMEOUT): Increase to 600.
  (check-runtime): Export LD_LIBRARY_PATH before computing GOARCH
  and GOOS.
  (check-vet): Copy golang.org/x/tools into check-vet-dir.
  * Makefile.in: Regenerate.

gcc/testsuite/

2019-01-18  Ian Lance Taylor

  * go.go-torture/execute/names-1.go: Stop using debug/xcoff, which is
  no longer externally visible.

Index: gcc/go/gofrontend/MERGE
===================================================================
--- gcc/go/gofrontend/MERGE (revision 268078)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-d16e9181a760796802c067730bb030b92b63fb2c
+c76ba3014e42cc6adc3d43709bba28c5ad7a6ba2

 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gotools/Makefile.am
===================================================================
--- gotools/Makefile.am (revision 268078)
+++ gotools/Makefile.am (working copy)
@@ -70,31 +70,8 @@ go_cmd_cgo_files = \
   $(cmdsrcdir)/cgo/util.go

 go_cmd_vet_files = \
-  $(cmdsrcdir)/vet/asmdecl.go \
-  $(cmdsrcdir)/vet/assign.go \
-  $(cmdsrcdir)/vet/atomic.go \
-  $(cmdsrcdir)/vet/bool.go \
-  $(cmdsrcdir)/vet/buildtag.go \
-  $(cmdsrcdir)/vet/cgo.go \
-  $(cmdsrcdir)/vet/composite.go \
-  $(cmdsrcdir)/vet/copylock.go \
-  $(cmdsrcdir)/vet/deadcode.go \
-  $(cmdsrcdir)/vet/dead.go \
   $(cmdsrcdir)/vet/doc.go \
-  $(cmdsrcdir)/vet/httpresponse.go \
-  $(cmdsrcdir)/vet/lostcancel.go \
-  $(cmdsrcdir)/vet/main.go \
-  $(cmdsrcdir)/vet/method.go \
-  $(cmdsrcdir)/vet/nilfunc.go \
-  $(cmdsrcdir)/vet/print.go \
-  $(cmdsrcdir)/vet/rangeloop.go \
-  $(cmdsrcdir)/vet/shadow.go \
-  $(cmdsrcdir)/vet/shift.go \
-  $(cmdsrcdir)/vet/structtag.go \
-  $(cmdsrcdir)/vet/tests.go \
-  $(cmdsrcdir)/vet/types.go \
-  $(cmdsrcdir)/vet/unsafeptr.go \
-  $(cmdsrcdir)/vet/unused.go
+  $(cmdsrcdir)/vet/main.go

 go_cmd_buildid_files = \
   $(cmdsrcdir)/buildid/buildid.go \
@@ -163,7 +140,7 @@ uninstall-local:
 GOTESTFLAGS =

 # Number of seconds before tests time out.
-GOTOOLS_TEST_TIMEOUT = 480
+GOTOOLS_TEST_TIMEOUT = 600

 # Run tests using the go tool, and frob the output to look like that
 # generated by DejaGNU.  The main output of this is two files:
@@ -256,6 +233,7 @@ check-runtime: go$(EXEEXT) $(noinst_PROG
   $(MKDIR_P) check-runtime-dir
   @abs_libgodir=`cd $(libgodir) && $(PWD_COMMAND)`; \
   LD_LIBRARY_PATH=`echo $${abs_libgodir}/.libs:$${LD_LIBRARY_PATH} | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; \
+  export LD_LIBRARY_PATH; \
   GOARCH=`$(abs_builddir)/go$(EXEEXT) env GOARCH`; \
   GOOS=`$(abs_builddir)/go$(EXEEXT) env GOOS`; \
   files=`$(SHELL) $(libgosrcdir)/../match.sh --goarch=$${GOARCH} --goos=$${GOOS} --srcdir=$(libgosrcdir)/runtime --extrafiles="$(libgodir)/runtime_sysinfo.go $(libgodir)/sigtab.go" --tag=libffi`; \
@@ -299,10 +277,11 @@ check-carchive-test: go$(EXEEXT) $(noins
 # check-vet runs `go test cmd/vet` in our environment.
 check-vet: go$(EXEEXT) $(noinst_PROGRAMS) check-head check-gccgo check-gcc
   rm -rf check-vet-dir cmd_vet-testlog
-  $(MKDIR_P) check-vet-dir/src/cmd/internal
+  $(MKDIR_P) check-vet-dir/src/cmd/internal check-vet-dir/src/cmd/vendor/golang.org/x
   cp -r $(cmdsrcdir)/vet check-vet-dir/src/cmd/
   cp -r $(cmdsrcdir)/internal/objabi check-vet-dir/src/cmd/internal
   cp $(libgodir)/objabi.go check-vet-dir/src/cmd/internal/objabi/
+  cp -r $(libgosrcdir)/golang.org/x/tools check-vet-dir/src/cmd/vendor/golang.org/x/
   @abs_libgodir=`cd $(libgodir) && $(PWD_COMMAND)`; \
   abs_checkdir=`cd check-vet-dir && $(PWD_COMMAND)`; \
   echo "cd check-vet-dir/src/cmd/vet && $(ECHO_ENV) GOPATH=$${abs_checkdir} $(abs_builddir)/go$(EXEEXT) test -test.short -test.timeout=$(GOTOOLS_TEST_TIMEOUT)s -test.v" > cmd_vet-testlog
Index: gcc/testsuite/go.go-torture/execute/names-1.go
===================================================================
--- gcc/testsuite/go.go-torture/execute/names-1.go (revision 268078)
+++ gcc/testsuite/go.go-torture/execute/names-1.go (working copy)
@@ -7,9 +7,9 @@ import (
   "debug/elf"
   "debug/macho"
   "debug/pe"
-  "debug/xcoff"
   "fmt"
   "os"
+  "runtime"
   "strings"
 )

@@ -61,6 +61,12 @@ func Function3(out *bytes.Buffer) {
 }

 func main() {
+  if runtime.GOOS == "aix" {
+    // Not supported on AIX until there is an
[PATCH] String contents hash map key example
I thought it would be useful to others who are new to the GCC codebase
to have an example of how to hash keys based on string contents rather
than pointer addresses (the fact that some hash maps key based on
semi-reliable pointers (due to ggc_mark_stringpool) into the symtab
gives a false sense of security).

- Michael

From 3433efe4ac558de05410a9b185f4ff0a01e7e5df Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov
Date: Fri, 11 Jan 2019 09:22:14 -0500
Subject: [PATCH] Document how to hash based on key string contents.

gcc:

2019-01-18  Michael Ploujnikov

  * hash-map-tests.c (test_map_of_strings_to_int): Document how to
  hash based on key string contents.
---
 gcc/hash-map-tests.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git gcc/hash-map-tests.c gcc/hash-map-tests.c
index 98b5830497..61da8233c4 100644
--- gcc/hash-map-tests.c
+++ gcc/hash-map-tests.c
@@ -77,6 +77,26 @@ test_map_of_strings_to_int ()
   m.remove (eric);
   ASSERT_EQ (5, m.elements ());
   ASSERT_EQ (NULL, m.get (eric));
+
+  /* A plain char * key is hashed based on its value (address), rather
+     than the string it points to.  */
+  char *another_ant = static_cast <char *> (xcalloc (4, 1));
+  another_ant[0] = 'a';
+  another_ant[1] = 'n';
+  another_ant[2] = 't';
+  another_ant[3] = 0;
+  ASSERT_NE (ant, another_ant);
+  unsigned prev_size = m.elements ();
+  ASSERT_EQ (false, m.put (another_ant, 7));
+  ASSERT_EQ (prev_size + 1, m.elements ());
+
+  /* Need to use string_hash or nofree_string_hash key types to hash
+     based on the string contents.  */
+  hash_map <nofree_string_hash, int> string_map;
+  ASSERT_EQ (false, string_map.put (ant, 1));
+  ASSERT_EQ (1, string_map.elements ());
+  ASSERT_EQ (true, string_map.put (another_ant, 5));
+  ASSERT_EQ (1, string_map.elements ());
 }

 /* Run all of the selftests within this file.  */
--
2.19.1
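[Editorial sketch] The same pitfall exists outside GCC's own hash_map, so the point of the selftest can be reproduced with the standard library. The following is a minimal sketch, assuming std::unordered_map as a stand-in for GCC's hash_map; cstr_hash, cstr_equal and the two map aliases are hypothetical names chosen for this example and have nothing to do with GCC's string_hash or nofree_string_hash.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <unordered_map>

// Hash a C string by its characters.  The default std::hash<const char *>
// hashes the pointer value (the address) instead.
struct cstr_hash
{
  std::size_t operator() (const char *s) const
  {
    return std::hash<std::string> () (s);
  }
};

// Compare C strings by contents rather than by address.
struct cstr_equal
{
  bool operator() (const char *a, const char *b) const
  {
    return std::strcmp (a, b) == 0;
  }
};

// Keys compare equal only when the pointers are identical.
using pointer_keyed_map = std::unordered_map<const char *, int>;

// Keys compare equal when the pointed-to strings match.
using content_keyed_map
  = std::unordered_map<const char *, int, cstr_hash, cstr_equal>;
```

Inserting two distinct buffers that both spell "ant" gives two entries in a pointer_keyed_map but a single entry in a content_keyed_map, which matches what the ASSERT_EQ lines in the patch check for hash_map.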
Re: [EXT] Re: [Patch 2/4][Aarch64] v2: Implement Aarch64 SIMD ABI
On Fri, 2019-01-18 at 15:35 +0100, Christophe Lyon wrote:
>
> Hi Steve,
>
> I've noticed that
> FAIL: g++.dg/vect/simd-clone-7.cc  -std=c++14  (test for warnings, line 7)
> (and for c++17 and c++98)
> when forcing -mabi=ilp32.
>
> I suspect you want to skip the test in this case?
>
> Christophe

Actually, I think we can compile that test, it just would not generate a
warning in ILP32 mode because int, floats and pointers would now all be
the same size.  So I think the fix is:

% git diff simd-clone-7.cc
diff --git a/gcc/testsuite/g++.dg/vect/simd-clone-7.cc b/gcc/testsuite/g++.dg/vect/simd-clone-7.cc
index c2a63cd5f8e..3617f0ab6a7 100644
--- a/gcc/testsuite/g++.dg/vect/simd-clone-7.cc
+++ b/gcc/testsuite/g++.dg/vect/simd-clone-7.cc
@@ -8,4 +8,4 @@ bar (float x, float *y, int)
 {
   return y[0] + y[1] * x;
 }
-// { dg-warning "GCC does not currently support mixed size types for 'simd' functions" "" { target aarch64-*-* } .-4 }
+// { dg-warning "GCC does not currently support mixed size types for 'simd' functions" "" { target { { aarch64-*-* } && lp64 } } .-4 }

I haven't tested this, I don't have an ILP32 build sitting around right
now.  Does it work for you?  I can build a toolchain, test it, and
submit a patch if you want.

Steve Ellcey
sell...@marvell.com
[Patch, Fortran] PR 37835 -fno-automatic does not work for derived types with default initializer
Hi all!

The patch for gcc/fortran/resolve.c is the modernized version of Paul's
patch in comment 4.  It causes some regressions due to "Duplicate SAVE"
warnings.  They are silenced by the patch for gcc/fortran/symbol.c
unless -pedantic is used, as documented in the change for
gcc/fortran/invoke.texi.

Is it OK for trunk?

TIA

Dominique

2019-01-18  Dominique d'Humieres

  PR fortran/37835
  * resolve.c (resolve_types): Add !flag_automatic.
  * symbol.c (gfc_add_save): Silence warnings.

2019-01-18  Dominique d'Humieres

  PR fortran/37835
  * gfortran.dg/no-automatic.f90: New test.

Attachment: patch-37835 (binary data)
[PATCH] [ARC] atomics: Add operand to DMB instruction
Atomics use DMB instruction to enforce ordering of loads/stores.
Currently gcc generates DMB w/o any arg which is a no-op.  Fix that by
generating DMB 3 which enforces R+W ordering.  It is stricter than what
acq/rel expect, but there's no other way.

gcc/

2019-01-18  Vineet Gupta

  * config/arc/atomic.md: Add operand to DMB instruction

Signed-off-by: Vineet Gupta
---
 gcc/ChangeLog            | 4 ++++
 gcc/config/arc/atomic.md | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 13890776cc08..09051b816cae 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2019-01-18  Vineet Gupta
+
+  * config/arc/atomic.md: Add operand to DMB instruction
+
 2019-01-18  Richard Biener

   PR tree-optimization/88903
diff --git a/gcc/config/arc/atomic.md b/gcc/config/arc/atomic.md
index 562c79a6578e..fe767dfedd5c 100644
--- a/gcc/config/arc/atomic.md
+++ b/gcc/config/arc/atomic.md
@@ -44,7 +44,7 @@
 {
   if (TARGET_HS)
     {
-      return "dmb";
+      return "dmb\\t3";
     }
   else
     {
--
2.7.4
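[Editorial sketch] For readers less familiar with why a no-op barrier matters: acquire/release code relies on the fence actually ordering the surrounding loads and stores. The following is a generic C++ sketch of that pattern, not ARC-specific code; payload, ready, publish and try_consume are hypothetical names. Which instruction a given compiler emits for each fence is target-dependent; per the patch above, the ARC HS barrier pattern would presumably print "dmb 3" rather than a bare "dmb".

```cpp
#include <atomic>

int payload = 0;                   // ordinary, non-atomic data
std::atomic<bool> ready{false};    // flag that publishes the data

// Writer: store the data, issue a release fence, then set the flag.
// The fence keeps the payload store from being ordered after the flag store.
void publish (int value)
{
  payload = value;
  std::atomic_thread_fence (std::memory_order_release);
  ready.store (true, std::memory_order_relaxed);
}

// Reader: if the flag is set, issue an acquire fence and read the data.
// Returns false, leaving *out untouched, if nothing has been published yet.
bool try_consume (int *out)
{
  if (!ready.load (std::memory_order_relaxed))
    return false;
  std::atomic_thread_fence (std::memory_order_acquire);
  *out = payload;
  return true;
}
```

If either fence compiled to an instruction with no ordering effect, a reader on another core could see the flag set and still read a stale payload, which is the kind of failure the patch is meant to rule out.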
Re: libbacktrace patch RFC: check size passed to backtrace_get_view
On Fri, Jan 18, 2019 at 8:18 AM Tom de Vries wrote:
>
> On 18-01-19 16:40, Ian Lance Taylor wrote:
> >  int
> >  backtrace_get_view (struct backtrace_state *state ATTRIBUTE_UNUSED,
> > -                    int descriptor, off_t offset, size_t size,
> > +                    int descriptor, off_t offset, uint64_t size,
> >                      backtrace_error_callback error_callback,
> >                      void *data, struct backtrace_view *view)
> >  {
> > @@ -60,6 +60,12 @@ backtrace_get_view (struct backtrace_sta
> >    off_t pageoff;
> >    void *map;
> >
> > +  if ((uint64_t) (size_t) size != size)
> > +    {
> > +      error_callback (data, "file size too large", 0);
> > +      return 0;
> > +    }
> > +
>
> Agreed, this will fix the PR.

Thanks.  Committed to mainline.

> There's a corner case I'm not sure is worth bothering about, but given
> that this is an RFC: in the case of 32-bit systems with 32-bit
> filesystem, there will be a range of numbers that fit in size_t, but are
> too large for off_t (both 32-bit but size_t unsigned and off_t signed),
> so in that case, the file size is too large, but we're not detecting
> that here.  Though I think that should be handled in the subsequent mmap
> (or, in the case of read.c, in the subsequent read(), though I'm
> guessing the earlier backtrace_alloc > 2GB will already fail).

Yeah, I'm not worried about that case.  A system with a signed 32-bit
off_t can't really support files larger than 2G anyhow, since for larger
files the struct stat st_size field will be negative.

Ian
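[Editorial sketch] The committed check works by casting the 64-bit size down to size_t and back, and rejecting the value if the round trip changes it. The sketch below isolates that idiom; fits_in_size_t and fits_in are hypothetical helper names, and the template form exists only so that 32-bit behaviour can be observed on a 64-bit host. The conversion to a signed narrow type is modular on the usual compilers (and guaranteed so from C++20), which the signed case relies on.

```cpp
#include <cstddef>
#include <cstdint>

// True if SIZE survives a round trip through size_t, i.e. it can be
// passed to an interface taking size_t without being truncated.
bool fits_in_size_t (std::uint64_t size)
{
  return static_cast<std::uint64_t> (static_cast<std::size_t> (size)) == size;
}

// The same test against an arbitrary narrower integer type.
template <typename Narrow>
bool fits_in (std::uint64_t size)
{
  return static_cast<std::uint64_t> (static_cast<Narrow> (size)) == size;
}
```

This also illustrates the corner case raised in the thread: a size of 0xC0000000 (3 GiB) passes fits_in<std::uint32_t>, modelling an unsigned 32-bit size_t, but fails fits_in<std::int32_t>, modelling a signed 32-bit off_t.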
[PATCH] rs6000: Fix *movsi_from_df (PR88892)
The memory store instructions (stfs[u][x], stxssp[x]) can result in
garbage if the value to be stored isn't already a valid single
precision floating point number.  So we cannot use this here.

This needs backporting to 8, according to the PR.

Tested etc.; committing to trunk.

Segher


2019-01-18  Segher Boessenkool

  PR target/88892
  * config/rs6000/rs6000.md (*movsi_from_df): Allow only register
  operands.

---
 gcc/config/rs6000/rs6000.md | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 13970f3..d59b46f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7015,24 +7015,19 @@ (define_insn_and_split "*movdi_from_sf_zero_ext"
 8, 4")])

 ;; Like movsi_from_sf, but combine a convert from DFmode to SFmode before
-;; moving it to SImode.  We can do a SFmode store without having to do the
-;; conversion explicitly.  If we are doing a register->register conversion, use
-;; XSCVDPSP instead of XSCVDPSPN, since the former handles cases where the
-;; input will not fit in a SFmode, and the later assumes the value has already
-;; been rounded.
+;; moving it to SImode.  We cannot do a SFmode store without having to do the
+;; conversion explicitly since that doesn't work in most cases if the input
+;; isn't representable as SF.  Use XSCVDPSP instead of XSCVDPSPN, since the
+;; former handles cases where the input will not fit in a SFmode, and the
+;; latter assumes the value has already been rounded.
 (define_insn "*movsi_from_df"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=wa,m,wY,Z")
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=wa")
         (unspec:SI [(float_truncate:SF
-                     (match_operand:DF 1 "gpc_reg_operand" "wa, f,wb,wa"))]
+                     (match_operand:DF 1 "gpc_reg_operand" "wa"))]
                    UNSPEC_SI_FROM_SF))]
-  "TARGET_NO_SF_SUBREG"
-  "@
-   xscvdpsp %x0,%x1
-   stfs%U0%X0 %1,%0
-   stxssp %1,%0
-   stxsspx %x1,%y0"
-  [(set_attr "type" "fp,fpstore,fpstore,fpstore")])
+  "xscvdpsp %x0,%x1"
+  [(set_attr "type" "fp")])

 ;; Split a load of a large constant into the appropriate two-insn
 ;; sequence.
--
1.8.3.1
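[Editorial sketch] What the pattern computes can be stated in portable terms: take a double, round it to single precision explicitly, and only then read out the 32-bit pattern. The C++ sketch below shows that two-step shape; float_bits_of is a hypothetical name, and this is an analogue of the intended semantics, not a model of the rs6000 store instructions. It assumes float is the IEEE 754 32-bit format.

```cpp
#include <cstdint>
#include <cstring>

static_assert (sizeof (float) == sizeof (std::uint32_t),
               "this sketch assumes a 32-bit float");

// Round D to single precision explicitly (the DF->SF step), then return
// the resulting 32-bit pattern (the SF->SI step).
std::uint32_t float_bits_of (double d)
{
  float f = static_cast<float> (d);   // explicit conversion, with rounding
  std::uint32_t bits;
  std::memcpy (&bits, &f, sizeof bits);
  return bits;
}
```

A value such as 0.1 is not exactly representable in single precision, so the explicit conversion has to round it; skipping that step and storing the double's register contents as if they were already single precision is the shortcut the patch removes.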
Re: [patch, fortran] Fix PR 88871
On 17.01.19 at 07:30, Thomas Koenig wrote:

> No test case because, well - it did show up on a few systems, so we will
> notice if it regresses.
>
> OK for trunk?

Both Dominique and Jürgen confirmed that it fixes the PR.  I will commit
this tomorrow unless there are any objections - I'd like to get this out
of the way.

Regards

Thomas
Re: [PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)
On Fri, 2019-01-18 at 12:32 -0500, David Malcolm wrote:

[CCing Abel]

> PR rtl-optimization/88423 reports an ICE within sched-ebb.c's
> begin_move_insn, failing the assertion at line 175, where there's
> no fall-through edge:
>
> 171         rtx_insn *x = NEXT_INSN (insn);
> 172         if (e)
> 173           gcc_checking_assert (NOTE_P (x) || LABEL_P (x));
> 174         else
> 175           gcc_checking_assert (BARRIER_P (x));
>
> "insn" is a jump_insn for a table jump, and its NEXT_INSN is the
> placeholder code_label, followed by the jump_table_data.
>
> It's not clear to me if such a jump_insn can be repositioned within
> the insn stream, or if the scheduler is able to do so.  I believe a
> tablejump is always at the end of such a head/tail insn sub-stream.
> Is it a requirement that the placeholder code_label for the jump_insn
> is always its NEXT_INSN?
>
> The loop at the start of schedule_ebb adjusts the head and tail
> of the insns to be scheduled so that it skips leading and trailing
> notes and debug insns.
>
> This patch adjusts that loop to also skip trailing jump_insn instances
> that are table jumps, so that we don't attempt to move such table
> jumps.
>
> This fixes the ICE, but I'm not very familiar with this part of the
> compiler - so I have two questions:
>
> (1) What does GCC mean by "ebb" in this context?
>
> My understanding is that the normal definition of an "extended basic
> block" (e.g. Muchnick's book pp175-177) is that it's a maximal grouping
> of basic blocks where only one BB in each group has multiple in-edges
> and all other BBs in the group have a single in-edge (and thus e.g.
> there's a tree-like structure of BBs within each EBB).
>
> From what I can tell, schedule_ebbs is iterating over BBs, looking for
> runs of BBs joined by next_bb that are connected by fallthrough edges
> and don't have labels (and aren't flagged with BB_DISABLE_SCHEDULE).
> It uses this run of BBs to generate a run of instructions within the
> NEXT_INSN/PREV_INSN doubly-linked list, which it passes as "head"
> and "tail" to schedule_ebb.
>
> This sounds like it will be a group of basic blocks with single in-edges
> internally, but it isn't a *maximal* group of such BBs - but perhaps
> it's "maximal" in the sense of what the NEXT_INSN/PREV_INSN
> representation can cope with?
>
> There (presumably) can't be a fallthrough edge after a table jump, so
> a table jump could only ever be at the end of such a chain, never in
> the middle.
>
> (2) Is it OK to omit "tail" from consideration here, from a dataflow
> and insn-dependency point-of-view?  Presumably the scheduler is written
> to ensure that data used by subsequent basic blocks will still be
> available after the insns within an "EBB" are reordered, so presumably
> any data uses *within* the jump_insn are still going to be available -
> but, as I said, I'm not very familiar with this part of the code.
> (I think I'm also assuming that the jump_insn can't clobber data, just
> the PC)
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk?
>
> gcc/ChangeLog:
>   PR rtl-optimization/88423
>   * sched-ebb.c (schedule_ebb): Don't move the jump_insn for a table
>   jump.
>
> gcc/testsuite/ChangeLog:
>   PR rtl-optimization/88423
>   * gcc.c-torture/compile/pr88423.c: New test.
> ---
>  gcc/sched-ebb.c                               | 4 ++++
>  gcc/testsuite/gcc.c-torture/compile/pr88423.c | 5 +++++
>  2 files changed, 9 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr88423.c
>
> diff --git a/gcc/sched-ebb.c b/gcc/sched-ebb.c
> index d459e09..1fe0b76 100644
> --- a/gcc/sched-ebb.c
> +++ b/gcc/sched-ebb.c
> @@ -485,6 +485,10 @@ schedule_ebb (rtx_insn *head, rtx_insn *tail, bool modulo_scheduling)
>          tail = PREV_INSN (tail);
>        else if (LABEL_P (head))
>          head = NEXT_INSN (head);
> +      else if (tablejump_p (tail, NULL, NULL))
> +        /* Don't move a jump_insn for a tablejump, to avoid having
> +           to move the placeholder code_label and jump_table_data.  */
> +        tail = PREV_INSN (tail);
>        else
>          break;
>      }
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr88423.c b/gcc/testsuite/gcc.c-torture/compile/pr88423.c
> new file mode 100644
> index 000..4948817
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr88423.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
> +/* { dg-options "-march=skylake -fPIC -fsched2-use-superblocks -fno-if-conversion" } */
> +/* { dg-require-effective-target fpic } */
> +
> +#include "../../gcc.dg/20030309-1.c"
C++ PATCH to add test for c++/86926
This test got fixed by r267859, so let's add it.  Unfortunately, it
still ICEs on the gcc-8 branch.

Tested on x86_64-linux, applying to trunk.

2019-01-18  Marek Polacek

  PR c++/86926
  * g++.dg/cpp1z/constexpr-lambda23.C: New test.

diff --git gcc/testsuite/g++.dg/cpp1z/constexpr-lambda23.C gcc/testsuite/g++.dg/cpp1z/constexpr-lambda23.C
new file mode 100644
index 000..4ff866b5d94
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp1z/constexpr-lambda23.C
@@ -0,0 +1,16 @@
+// PR c++/86926
+// { dg-do compile { target c++17 } }
+
+int
+main()
+{
+    constexpr auto f = [](auto self, auto n) {
+        if(n < 2)
+          return n;
+        return self(self, n - 1) + self(self, n - 2);
+    };
+
+    constexpr auto fibonacci = [=](auto n) { return f(f, n); };
+
+    static_assert(fibonacci(7) == 13);
+}
[PATCH] sched-ebb.c: avoid moving table jumps (PR rtl-optimization/88423)
PR rtl-optimization/88423 reports an ICE within sched-ebb.c's
begin_move_insn, failing the assertion at line 175, where there's
no fall-through edge:

171         rtx_insn *x = NEXT_INSN (insn);
172         if (e)
173           gcc_checking_assert (NOTE_P (x) || LABEL_P (x));
174         else
175           gcc_checking_assert (BARRIER_P (x));

"insn" is a jump_insn for a table jump, and its NEXT_INSN is the
placeholder code_label, followed by the jump_table_data.

It's not clear to me if such a jump_insn can be repositioned within
the insn stream, or if the scheduler is able to do so.  I believe a
tablejump is always at the end of such a head/tail insn sub-stream.
Is it a requirement that the placeholder code_label for the jump_insn
is always its NEXT_INSN?

The loop at the start of schedule_ebb adjusts the head and tail
of the insns to be scheduled so that it skips leading and trailing notes
and debug insns.

This patch adjusts that loop to also skip trailing jump_insn instances
that are table jumps, so that we don't attempt to move such table jumps.

This fixes the ICE, but I'm not very familiar with this part of the
compiler - so I have two questions:

(1) What does GCC mean by "ebb" in this context?

My understanding is that the normal definition of an "extended basic
block" (e.g. Muchnick's book pp175-177) is that it's a maximal grouping
of basic blocks where only one BB in each group has multiple in-edges
and all other BBs in the group have a single in-edge (and thus e.g.
there's a tree-like structure of BBs within each EBB).

From what I can tell, schedule_ebbs is iterating over BBs, looking for
runs of BBs joined by next_bb that are connected by fallthrough edges
and don't have labels (and aren't flagged with BB_DISABLE_SCHEDULE).
It uses this run of BBs to generate a run of instructions within the
NEXT_INSN/PREV_INSN doubly-linked list, which it passes as "head"
and "tail" to schedule_ebb.

This sounds like it will be a group of basic blocks with single in-edges
internally, but it isn't a *maximal* group of such BBs - but perhaps
it's "maximal" in the sense of what the NEXT_INSN/PREV_INSN
representation can cope with?

There (presumably) can't be a fallthrough edge after a table jump, so
a table jump could only ever be at the end of such a chain, never in the
middle.

(2) Is it OK to omit "tail" from consideration here, from a dataflow
and insn-dependency point-of-view?  Presumably the scheduler is written
to ensure that data used by subsequent basic blocks will still be available
after the insns within an "EBB" are reordered, so presumably any data
uses *within* the jump_insn are still going to be available - but, as I
said, I'm not very familiar with this part of the code.  (I think I'm also
assuming that the jump_insn can't clobber data, just the PC)

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
  PR rtl-optimization/88423
  * sched-ebb.c (schedule_ebb): Don't move the jump_insn for a table
  jump.

gcc/testsuite/ChangeLog:
  PR rtl-optimization/88423
  * gcc.c-torture/compile/pr88423.c: New test.
---
 gcc/sched-ebb.c                               | 4 ++++
 gcc/testsuite/gcc.c-torture/compile/pr88423.c | 5 +++++
 2 files changed, 9 insertions(+)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr88423.c

diff --git a/gcc/sched-ebb.c b/gcc/sched-ebb.c
index d459e09..1fe0b76 100644
--- a/gcc/sched-ebb.c
+++ b/gcc/sched-ebb.c
@@ -485,6 +485,10 @@ schedule_ebb (rtx_insn *head, rtx_insn *tail, bool modulo_scheduling)
         tail = PREV_INSN (tail);
       else if (LABEL_P (head))
         head = NEXT_INSN (head);
+      else if (tablejump_p (tail, NULL, NULL))
+        /* Don't move a jump_insn for a tablejump, to avoid having
+           to move the placeholder code_label and jump_table_data.  */
+        tail = PREV_INSN (tail);
       else
         break;
     }
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr88423.c b/gcc/testsuite/gcc.c-torture/compile/pr88423.c
new file mode 100644
index 000..4948817
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr88423.c
@@ -0,0 +1,5 @@
+/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-march=skylake -fPIC -fsched2-use-superblocks -fno-if-conversion" } */
+/* { dg-require-effective-target fpic } */
+
+#include "../../gcc.dg/20030309-1.c"
--
1.8.5.3
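[Editorial sketch] The grouping described under question (1) can be written down as a small toy model: walk the blocks in layout order and extend a run while the current block falls through into a successor that has no label and is not excluded from scheduling. This follows only the description in the email above; it is not the code of schedule_ebbs, and toy_bb and find_runs are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy basic block carrying only the properties the description mentions.
struct toy_bb
{
  bool starts_with_label;   // the block begins with a code label
  bool falls_through;       // has a fallthrough edge to the next block
  bool disable_schedule;    // analogue of the BB_DISABLE_SCHEDULE flag
};

// Return a [first, last] index pair for each run of blocks that would be
// handed to the scheduler together.
std::vector<std::pair<std::size_t, std::size_t>>
find_runs (const std::vector<toy_bb> &bbs)
{
  std::vector<std::pair<std::size_t, std::size_t>> runs;
  for (std::size_t i = 0; i < bbs.size (); ++i)
    {
      if (bbs[i].disable_schedule)
        continue;
      std::size_t last = i;
      // Extend the run while the current block falls through into an
      // unlabelled, schedulable successor.
      while (last + 1 < bbs.size ()
             && bbs[last].falls_through
             && !bbs[last + 1].starts_with_label
             && !bbs[last + 1].disable_schedule)
        ++last;
      runs.emplace_back (i, last);
      i = last;
    }
  return runs;
}
```

In this model a block that ends in a table jump has falls_through set to false, so it can only ever be the last block of a run, which is the property the patch relies on when it trims the table jump off the tail.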
Claw back some of the code size regression in 548.exchange2_r
This patch tries harder to detect cases in which the inner dimension of an array access is invariant, such as: x(i, :) = 100 It fixes some of the code size regression in 548.exchange2_r, with size improving by 5% compared to before the patch. Of the two other SPEC 2017 tests affected by loop versioning, 554.roms_r improved by a trivial amount (0.3%) and 549.fotonik3d_r didn't change. All three results are with -Ofast -flto. Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu. OK to install? Richard 2019-01-18 Richard Sandiford gcc/ * gimple-loop-versioning.cc (loop_versioning::dump_inner_likelihood): New function, split out from... (loop_versioning::analyze_stride): ...here. (loop_versioning::find_per_loop_multiplication): Use gassign. (loop_versioning::analyze_term_using_scevs): Return a success code. (loop_versioning::analyze_arbitrary_term): New function. (loop_versioning::analyze_address_fragment): Use analyze_arbitrary_term if all else fails. gcc/testsuite/ * gfortran.dg/loop_versioning_1.f90: Bump the number of identified inner strides. * gfortran.dg/loop_versioning_9.f90: New test. * gfortran.dg/loop_versioning_10.f90: Likewise. 
Index: gcc/gimple-loop-versioning.cc === --- gcc/gimple-loop-versioning.cc 2019-01-04 11:39:25.918257505 + +++ gcc/gimple-loop-versioning.cc 2019-01-18 16:36:13.172064883 + @@ -294,10 +294,12 @@ private: bool acceptable_type_p (tree, unsigned HOST_WIDE_INT *); bool multiply_term_by (address_term_info &, tree); inner_likelihood get_inner_likelihood (tree, unsigned HOST_WIDE_INT); + void dump_inner_likelihood (address_info &, address_term_info &); void analyze_stride (address_info &, address_term_info &, tree, struct loop *); bool find_per_loop_multiplication (address_info &, address_term_info &); - void analyze_term_using_scevs (address_info &, address_term_info &); + bool analyze_term_using_scevs (address_info &, address_term_info &); + void analyze_arbitrary_term (address_info &, address_term_info &); void analyze_address_fragment (address_info &); void record_address_fragment (gimple *, unsigned HOST_WIDE_INT, tree, unsigned HOST_WIDE_INT, HOST_WIDE_INT); @@ -803,6 +805,24 @@ loop_versioning::get_inner_likelihood (t return unlikely_p ? INNER_UNLIKELY : INNER_DONT_KNOW; } +/* Dump the likelihood that TERM's stride is for the innermost dimension. + ADDRESS is the address that contains TERM. */ + +void +loop_versioning::dump_inner_likelihood (address_info , + address_term_info ) +{ + if (term.inner_likelihood == INNER_LIKELY) +dump_printf_loc (MSG_NOTE, address.stmt, "%T is likely to be the" +" innermost dimension\n", term.stride); + else if (term.inner_likelihood == INNER_UNLIKELY) +dump_printf_loc (MSG_NOTE, address.stmt, "%T is probably not the" +" innermost dimension\n", term.stride); + else +dump_printf_loc (MSG_NOTE, address.stmt, "cannot tell whether %T" +" is the innermost dimension\n", term.stride); +} + /* The caller has identified that STRIDE is the stride of interest in TERM, and that the stride is applied in OP_LOOP. 
Record this information in TERM, deciding whether STRIDE is likely to be for @@ -818,17 +838,7 @@ loop_versioning::analyze_stride (address term.inner_likelihood = get_inner_likelihood (stride, term.multiplier); if (dump_enabled_p ()) -{ - if (term.inner_likelihood == INNER_LIKELY) - dump_printf_loc (MSG_NOTE, address.stmt, "%T is likely to be the" -" innermost dimension\n", stride); - else if (term.inner_likelihood == INNER_UNLIKELY) - dump_printf_loc (MSG_NOTE, address.stmt, "%T is probably not the" -" innermost dimension\n", stride); - else - dump_printf_loc (MSG_NOTE, address.stmt, "cannot tell whether %T" -" is the innermost dimension\n", stride); -} +dump_inner_likelihood (address, term); /* To be a versioning opportunity we require: @@ -879,7 +889,7 @@ bool loop_versioning::find_per_loop_multiplication (address_info , address_term_info ) { - gimple *mult = maybe_get_assign (term.expr); + gassign *mult = maybe_get_assign (term.expr); if (!mult || gimple_assign_rhs_code (mult) != MULT_EXPR) return false; @@ -909,7 +919,7 @@ loop_versioning::find_per_loop_multiplic } /* Try to use scalar evolutions to find an address stride for TERM, - which belongs to ADDRESS. + which belongs to ADDRESS. Return true and update TERM if so. Here we are interested in any evolution information we can find, not just evolutions wrt
Re: libbacktrace patch RFC: check size passed to backtrace_get_view
On 18-01-19 16:40, Ian Lance Taylor wrote:
>  int
>  backtrace_get_view (struct backtrace_state *state ATTRIBUTE_UNUSED,
> -                    int descriptor, off_t offset, size_t size,
> +                    int descriptor, off_t offset, uint64_t size,
>                      backtrace_error_callback error_callback,
>                      void *data, struct backtrace_view *view)
>  {
> @@ -60,6 +60,12 @@ backtrace_get_view (struct backtrace_sta
>    off_t pageoff;
>    void *map;
>
> +  if ((uint64_t) (size_t) size != size)
> +    {
> +      error_callback (data, "file size too large", 0);
> +      return 0;
> +    }
> +

Agreed, this will fix the PR.

There's a corner case I'm not sure is worth bothering about, but given that this is an RFC: on 32-bit systems with a 32-bit filesystem, there is a range of numbers that fit in size_t but are too large for off_t (both are 32-bit, but size_t is unsigned and off_t is signed). In that case the file size is too large, but we're not detecting that here. I think that should be handled in the subsequent mmap, though (or, in the case of read.c, in the subsequent read(), though I'm guessing the earlier backtrace_alloc > 2GB will already fail).

Thanks,
- Tom
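For readers following along, the corner case above can be sketched in isolation. This is only an illustration, not libbacktrace code: the helper names are hypothetical, and a 32-bit target is modelled with uint32_t standing in for size_t and int32_t standing in for off_t so that the example behaves the same on any host.

```c
#include <stdint.h>

/* Hypothetical helper: the same round-trip test as in the patch,
   with uint32_t standing in for a 32-bit size_t.  */
static int
fits_in_size32 (uint64_t size)
{
  return (uint64_t) (uint32_t) size == size;
}

/* Hypothetical helper: whether SIZE is representable in a signed
   32-bit off_t (modelled here as int32_t).  */
static int
fits_in_off32 (uint64_t size)
{
  return size <= (uint64_t) INT32_MAX;
}
```

Under these assumptions a size such as 0x90000000 passes the size_t-style check but is out of range for the signed offset type, which is the window described above; whether it matters in practice depends on the later mmap or read failing first.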
PING^1: V3 [PATCH] i386: Add pass_remove_partial_avx_dependency
On Mon, Jan 7, 2019 at 5:55 AM H.J. Lu wrote: > > On Sun, Dec 30, 2018 at 8:40 AM H.J. Lu wrote: > > > > On Wed, Nov 28, 2018 at 12:17 PM Jeff Law wrote: > > > > > > On 11/28/18 12:48 PM, H.J. Lu wrote: > > > > On Mon, Nov 5, 2018 at 7:29 AM Jan Hubicka wrote: > > > >> > > > >>> On 11/5/18 7:21 AM, Jan Hubicka wrote: > > > > > > > > Did you mean "the nearest common dominator"? > > > > > > If the nearest common dominator appears in the loop while all uses > > > are > > > out of loops, this will result in suboptimal xor placement. > > > In this case you want to split edges out of the loop. > > > > > > In general this is what the LCM framework will do for you if the > > > problem > > > is modelled in a similar way as in mode_switching. At entry function > > > mode is > > > "no zero register needed" and all conversions need mode "zero > > > register > > > needed". Mode switching should then do the correct placement > > > decisions > > > (reaching minimal number of executions of xor). > > > > > > Jeff, what is your opinion on the approach taken by the patch? > > > It seems like a special case of more general issue, but I do not see > > > very elegant way to solve it at least in the GCC 9 horizon, so if > > > the placement is correct we can probably go either with new pass or > > > making this part of mode switching (which is anyway run by x86 > > > backend) > > > >>> So I haven't followed this discussion at all, but did touch on this > > > >>> issue with some patch a month or two ago with a target patch that was > > > >>> trying to avoid the partial stalls. > > > >>> > > > >>> My assumption is that we're trying to find one or more places to > > > >>> initialize the upper half of an avx register so as to avoid partial > > > >>> register stall at existing sites that set the upper half. > > > >>> > > > >>> This sounds like a classic PRE/LCM style problem (of which mode > > > >>> switching is just another variant). 
A common-dominator approach is > > > >>> closer to a classic GCSE and is going to result in more > > > >>> initializations > > > >>> at sub-optimal points than a PRE/LCM style. > > > >> > > > >> yes, it is usual code placement problem. It is special case because the > > > >> zero register is not modified by the conversion (just we need to have > > > >> zero somewhere). So basically we do not have kills to the zero except > > > >> for entry block. > > > >> > > > > > > > > Do you have testcase to show that the nearest common dominator > > > > in the loop, while all uses are out of loops, leads to suboptimal xor > > > > placement? > > > I don't have a testcase, but it's all but certain nearest common > > > dominator is going to be a suboptimal placement. That's going to create > > > paths where you're going to emit the xor when it's not used. > > > > > > The whole point of the LCM algorithms is they are optimal in terms of > > > expression evaluations. > > > > We tried LCM and it didn't work well for this case. LCM places a single > > VXOR close to the location where it is needed, which can be inside a > > loop. There is nothing wrong with the LCM algorithms. But this doesn't > > solve > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87007 > > > > where VXOR is executed multiple times inside of a function, instead of > > just once. 
We are investigating generating a single VXOR at entry of the > > nearest dominator for basic blocks with SF/DF conversions, which is in > > the fake loop that contains the whole function: > > > > bb = nearest_common_dominator_for_set (CDI_DOMINATORS, > > convert_bbs); > > while (bb->loop_father->latch > > != EXIT_BLOCK_PTR_FOR_FN (cfun)) > > bb = get_immediate_dominator (CDI_DOMINATORS, > > bb->loop_father->header); > > > > insn = BB_HEAD (bb); > > if (!NONDEBUG_INSN_P (insn)) > > insn = next_nonnote_nondebug_insn (insn); > > set = gen_rtx_SET (v4sf_const0, CONST0_RTX (V4SFmode)); > > set_insn = emit_insn_before (set, insn); > > > > Here is the updated patch. OK for trunk? > This is a GCC 8/9 regression: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87007 PING: https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00298.html -- H.J.
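As a rough illustration of the hoisting loop quoted above, here is a toy model that does not use any GCC data structures. Everything in it is an assumption made for the sketch: struct toy_block, its fields and hoist_out_of_loops are hypothetical, blocks are plain array indices, and "not in any real loop" is encoded as loop_header == -1 rather than by comparing the loop latch with the exit block.

```c
/* Toy stand-in for a basic block; not a GCC type.  */
struct toy_block
{
  int idom;         /* Index of the immediate dominator, -1 for entry.  */
  int loop_header;  /* Header of the innermost real loop containing
                       this block, or -1 if it is in no real loop.  */
};

/* Starting from BB, repeatedly step to the immediate dominator of the
   enclosing loop's header until the block lies outside every real
   loop.  Assumes the entry block is never inside a loop.  */
static int
hoist_out_of_loops (const struct toy_block *blocks, int bb)
{
  while (blocks[bb].loop_header >= 0)
    bb = blocks[blocks[bb].loop_header].idom;
  return bb;
}
```

The point of the sketch is only that each step leaves one loop level, so the walk ends at a block that executes once per function invocation rather than once per iteration.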
Re: [RFC] [Patch] [Debug] Add new FUNCTION_BEG NOTE to be used for debugging.
Ping. (note -- when running the GDB testsuite ensuring that -fstack-protector-all is used for compiling each testcase, this patch fixes over 1500 FAILs) On 10/01/19 13:28, Matthew Malcomson wrote: > At the moment NOTE_INSN_FUNCTION_BEG is used for three different purposes. > The first is as a marker just before the first insn coming from a > "source code statement" of the function. > Bug 88432 is due to the fact that the note does not accurately point to > this logical position in a function -- in that case the stack protect > prologue is directly after NOTE_INSN_FUNCTION_BEG. > > The second is (I believe) to make assumptions about what values are in the > parameter passing registers (in alias.c and calls.c). > (I'm not sure about this second use, if I am correctly reading this code then > it seems like a bug -- e.g. asan_emit_stack_protect inserts insns in the > stream > that break the assumption that seems to be made.) > > The third is as a marker to determine where to put extra code later in > sjlj_emit_function_enter from except.c, where to insert profiling code for a > function in final.c, and where to insert variable expansion code in > pass_expand::execute from cfgexpand.c. > > These three uses seem to be at odds with each other -- insns that change the > values in the parameter passing registers store can come from automatically > inserted code like stack protection, and some requirements on where > instructions > should get inserted have moved the position of this NOTE (e.g. see bugzilla > bug > 81186). > > This patch splits the current note into two different notes, one to retain > uses > 2 and 3 above, and one for use in generating debug information. > > The first two uses are still attached to NOTE_INSN_FUNCTION_BEG, while the > debugging use is now implemented with NOTE_INSN_DEBUG_FUNCTION_BEG. > > These two notes are put into the functions' insn chain in different > places during the expand pass, and can hence satisfy their respective > uses. 
> > Bootstrapped and regtested on aarch64. > TODO -- Manual tests done on resulting debug information -- yet to be > automated. > > gcc/ChangeLog: > > 2019-01-10 Matthew Malcomson > > PR debug/88432 > * cfgexpand.c (pass_expand::execute): Insert > NOTE_INSN_DEBUG_FUNCTION_BEG. > * function.c (thread_prologue_and_epilogue_insns): Account > for NOTE_INSN_DEBUG_FUNCTION_BEG. > * cfgrtl.c (duplicate_insn_chain): Account for new NOTE. > * doc/rtl.texi: Document new NOTE. > * dwarf2out.c (dwarf2out_source_line): Change comment to > reference new NOTE. > * final.c (asm_show_source): As above. > (final_scan_insn_1): Split action on NOTE_INSN_FUNCTION_BEG into > two, and move debugging info action to trigger on > NOTE_INSN_DEBUG_FUNCTION_BEG. > * insn-notes.def (INSN_NOTE): Add new NOTE. > > > > ### Attachment also inlined for ease of reply > ### > > > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c > index > 60c1cfb4556e1a659db19f6719adccc1dab0fe46..491f441d01de226ba5aff2af8c71680b78648a12 > 100644 > --- a/gcc/cfgexpand.c > +++ b/gcc/cfgexpand.c > @@ -6476,6 +6476,12 @@ pass_expand::execute (function *fun) > if (crtl->stack_protect_guard && targetm.stack_protect_runtime_enabled_p > ()) > stack_protect_prologue (); > > + /* Insert a NOTE that marks the end of "generated code" and the start of > code > + that comes from the user. This is the point which dwarf2out.c will > treat > + as the beginning of the users code in this function. e.g. GDB will stop > + just after this note when breaking on entry to the function. */ > + emit_note (NOTE_INSN_DEBUG_FUNCTION_BEG); > + > expand_phi_nodes (); > > /* Release any stale SSA redirection data. */ > diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c > index > 172bdf585d036e27bcf53dba89c1ffc1b6cb84c7..d0cbca84aa3f14002a568a65e70016c3e15d6b9c > 100644 > --- a/gcc/cfgrtl.c > +++ b/gcc/cfgrtl.c > @@ -4215,6 +4215,7 @@ duplicate_insn_chain (rtx_insn *from, rtx_insn *to) > case NOTE_INSN_DELETED_DEBUG_LABEL: > /* No problem to strip these. 
*/ > case NOTE_INSN_FUNCTION_BEG: > + case NOTE_INSN_DEBUG_FUNCTION_BEG: > /* There is always just single entry to function. */ > case NOTE_INSN_BASIC_BLOCK: > /* We should only switch text sections once. */ > diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi > index > 583291018538722a19a9baf8c46c87cbdfe34216..a50d08483de0db84378e48c9334b48ff12548190 > 100644 > --- a/gcc/doc/rtl.texi > +++ b/gcc/doc/rtl.texi > @@ -3954,6 +3954,18 @@ identifies which region is associated with these notes. > Appears at the start of the function body, after the function > prologue. > > +@findex NOTE_INSN_DEBUG_FUNCTION_BEG > +@item NOTE_INSN_DEBUG_FUNCTION_BEG > +This NOTE is inserted at the start of a function during RTL expansion. > +It is
libbacktrace patch RFC: check size passed to backtrace_get_view
I agree that checking the size passed to backtrace_get_view seems like the most reliable approach to avoid problems with large files on 32-bit systems. How does this patch look? Ian 2019-01-18 Ian Lance Taylor PR libbacktrace/88890 * mmapio.c (backtrace_get_view): Change size parameter to uint64_t. Check that value fits in size_t. * read.c (backtrace_get_view): Likewise. * internal.h (backtrace_get_view): Update declaration. * elf.c (elf_add): Pass shstrhdr->sh_size to backtrace_get_view. Index: elf.c === --- elf.c (revision 268078) +++ elf.c (working copy) @@ -2813,7 +2813,7 @@ elf_add (struct backtrace_state *state, shstr_size = shstrhdr->sh_size; shstr_off = shstrhdr->sh_offset; - if (!backtrace_get_view (state, descriptor, shstr_off, shstr_size, + if (!backtrace_get_view (state, descriptor, shstr_off, shstrhdr->sh_size, error_callback, data, _view)) goto fail; names_view_valid = 1; Index: internal.h === --- internal.h (revision 268078) +++ internal.h (working copy) @@ -179,7 +179,7 @@ struct backtrace_view /* Create a view of SIZE bytes from DESCRIPTOR at OFFSET. Store the result in *VIEW. Returns 1 on success, 0 on error. */ extern int backtrace_get_view (struct backtrace_state *state, int descriptor, - off_t offset, size_t size, + off_t offset, uint64_t size, backtrace_error_callback error_callback, void *data, struct backtrace_view *view); Index: mmapio.c === --- mmapio.c(revision 268078) +++ mmapio.c(working copy) @@ -51,7 +51,7 @@ POSSIBILITY OF SUCH DAMAGE. 
*/ int backtrace_get_view (struct backtrace_state *state ATTRIBUTE_UNUSED, - int descriptor, off_t offset, size_t size, + int descriptor, off_t offset, uint64_t size, backtrace_error_callback error_callback, void *data, struct backtrace_view *view) { @@ -60,6 +60,12 @@ backtrace_get_view (struct backtrace_sta off_t pageoff; void *map; + if ((uint64_t) (size_t) size != size) +{ + error_callback (data, "file size too large", 0); + return 0; +} + pagesize = getpagesize (); inpage = offset % pagesize; pageoff = offset - inpage; Index: read.c === --- read.c (revision 268078) +++ read.c (working copy) @@ -46,12 +46,18 @@ POSSIBILITY OF SUCH DAMAGE. */ int backtrace_get_view (struct backtrace_state *state, int descriptor, - off_t offset, size_t size, + off_t offset, uint64_t size, backtrace_error_callback error_callback, void *data, struct backtrace_view *view) { ssize_t got; + if ((uint64_t) (size_t) size != size) +{ + error_callback (data, "file size too large", 0); + return 0; +} + if (lseek (descriptor, offset, SEEK_SET) < 0) { error_callback (data, "lseek", errno);
Re: PATCH: Updated error messages for ill-formed cases of array initialization by string literal
On Fri, 18 Jan 2019 at 13:52, Rainer Orth wrote: > > Hi Jason, > > > On 1/15/19 12:59 PM, Joseph Myers wrote: > >> On Tue, 15 Jan 2019, Jason Merrill wrote: > >> > >>> I actually incorporated the C++ part of these changes into yesterday's > >>> commit, > >>> using Martin's first suggestion. Here's the adjusted C patch, which I'd > >>> like > >>> a C maintainer to approve. > >> > >> The front-end changes are OK. However, in the testcase changes, some of > >> the new expected diagnostics are hardcoding that "unsigned int" is the > >> type of char32_t, which isn't correct for all platforms (for example, it's > >> definitely not the type when int is 16-bit). In principle the same > >> applies to diagnostics hardcoding the choice of char16_t, although > >> variations are at least less likely there. > > > > This updated patch removes {short ,}unsigned int from the expected > > diagnostics. And also improves error_init to accept additional arguments, > > like pedwarn_init already does. > > > > Tested x86_64-pc-linux-gnu. > > there are now a couple of failures on several (32-bit?) targets: > > +FAIL: gcc.dg/utf-array.c (test for errors, line 15) > +FAIL: gcc.dg/utf-array.c (test for errors, line 21) > +FAIL: gcc.dg/utf-array.c (test for errors, line 33) > > I'm seeing it on i386-pc-solaris2.11 (32-bit only), and there are also > gcc-testresults reports on i686-pc-linux-gnu, m68k-unknown-linux-gnu, > and x86_64-pc-linux-gnu. > Seeing similar errors on arm and aarch64 too. > Rainer > > -- > - > Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [EXT] Re: [Patch 2/4][Aarch64] v2: Implement Aarch64 SIMD ABI
On Thu, 17 Jan 2019 at 20:11, Steve Ellcey wrote: > > On Thu, 2019-01-17 at 09:10 +, Richard Sandiford wrote: > > > > > +static bool supported_simd_type (tree t) > > > > Missing line break after "static bool". > > Fixed. > > > > +static bool currently_supported_simd_type (tree t, tree b) > > > > Same here. > > Fixed. > > > > + return 0; > > > > The return should use tab indentation. > > Fixed. > > > > OK otherwise, thanks. > > > > Richard > > Thanks for the reviews Richard, I made those changes and checked in the > patch. That is the last of the Aarch64 SIMD / Vector ABI patches I > have so everything should be checked in and working now. > Hi Steve, I've noticed a failure when forcing -mabi=ilp32: FAIL: g++.dg/vect/simd-clone-7.cc -std=c++14 (test for warnings, line 7) (and likewise for c++17 and c++98). I suspect you want to skip the test in this case? Christophe > Steve Ellcey > sell...@marvell.com
Re: [PATCH] Reset proper type on vector types (PR middle-end/88587).
On Fri, Jan 18, 2019 at 6:25 AM H.J. Lu wrote: > > On Thu, Jan 17, 2019 at 4:51 AM Richard Biener > wrote: > > > > On Thu, Jan 17, 2019 at 12:21 PM Martin Liška wrote: > > > > > > On 1/16/19 1:06 PM, Richard Biener wrote: > > > > On Wed, Jan 16, 2019 at 10:20 AM Martin Liška wrote: > > > >> > > > >> Hi. > > > >> > > > >> The patch is about resetting TYPE_MODE of vector types. This is > > > >> problematic > > > >> when an inlining among different ISAs happen. Then we end up with a > > > >> different > > > >> mode than when it's expected from debug info. > > > >> > > > >> When creating a new function decl in target_clones, we must > > > >> valid_attribute_p early > > > >> so that the declaration has a proper cl_target_.. node and so that > > > >> inliner can > > > >> fix modes. > > > >> > > > >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > > >> > > > >> Ready to be installed? > > > > > > > > I don't like the new failure mode too much. It looks like > > > > create_version_clone_with_body > > > > can fail so why not simply return NULL when > > > > targetm.target_option.valid_attribute_p > > > > returns false and handle that case in multi-versioning? > > > > > > > > That is, > > > > > > > > + return !seen_error (); > > > > > > > > that looks very wrong to me. > > > > > > Yep, update patch should be better. > > > > > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > > > > > Ready to be installed? > > > > OK. > > > > Thanks, > > Richard. > > > > > Thanks, > > > Martin > > > > > > > > > > > Richard. > > > > > > > >> Thanks, > > > >> Martin > > > >> > > > >> gcc/ChangeLog: > > > >> > > > >> 2019-01-16 Martin Liska > > > >> Richard Biener > > > >> > > > >> PR middle-end/88587 > > > >> * cgraph.h (create_version_clone_with_body): Add new argument > > > >> with attributes. > > > >> * cgraphclones.c (cgraph_node::create_version_clone): Add > > > >> DECL_ATTRIBUTES to a newly created decl. 
And call > > > >> valid_attribute_p so that proper cl_target_optimization_node > > > >> is set for the newly created declaration. > > > >> * multiple_target.c (create_target_clone): Set DECL_ATTRIBUTES > > > >> for declaration. > > > >> (expand_target_clones): Do not call valid_attribute_p, it must > > > >> be already done. > > > >> * tree-inline.c (copy_decl_for_dup_finish): Reset mode for > > > >> vector types. > > > >> > > > >> gcc/testsuite/ChangeLog: > > > >> > > > >> 2019-01-16 Martin Liska > > > >> > > > >> PR middle-end/88587 > > > >> * g++.target/i386/pr88587.C: New test. > > > >> * gcc.target/i386/mvc13.c: New test. > > > >> --- > > > >> gcc/cgraph.h| 7 +- > > > >> gcc/cgraphclones.c | 18 +- > > > >> gcc/multiple_target.c | 32 - > > > >> gcc/testsuite/g++.target/i386/pr88587.C | 15 > > > >> gcc/testsuite/gcc.target/i386/mvc13.c | 9 +++ > > > >> gcc/tree-inline.c | 4 > > > >> 6 files changed, 61 insertions(+), 24 deletions(-) > > > >> create mode 100644 gcc/testsuite/g++.target/i386/pr88587.C > > > >> create mode 100644 gcc/testsuite/gcc.target/i386/mvc13.c > > > >> > > > >> > > > > > It is wrong to use -m32 in dg-options: > > /* { dg-do compile } */ > /* { dg-require-ifunc "" } */ > /* { dg-options "-O -m32 -g -mno-sse" } */ > > You should use > > /* { dg-do compile { target ia32 } } */ > > Since g++.target/i386/pr88587.C doesn't support -fPIC, > > [hjl@gnu-cfl-1 gcc]$ ./xgcc -B./ > /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C > -mx32 -fpic -S > /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C:6:6: > warning: always_inline function might not be inlinable [-Wattributes] > 6 | void a() > | ^ > /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C: > In function ‘void a2()’: > /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C:6:6: > error: inlining failed in call to always_inline ‘void a()’: > function body can be overwritten at link time 
> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C:13:5: > note: called from here >13 | a (); > | ~~^~ > [hjl@gnu-cfl-1 gcc]$ > > you should add -fno-pic. > I am checking in this patch as an obvious fix. -- H.J. --- diff --git a/gcc/testsuite/g++.target/i386/pr88587.C b/gcc/testsuite/g++.target/i386/pr88587.C index 6808ab68cbb..e7488e68743 100644 --- a/gcc/testsuite/g++.target/i386/pr88587.C +++ b/gcc/testsuite/g++.target/i386/pr88587.C @@ -1,6 +1,6 @@ -/* { dg-do compile } */ +/* { dg-do compile { target ia32 } } */ /* { dg-require-ifunc "" } */ -/* { dg-options "-O -m32 -g -mno-sse -Wno-attributes" } */ +/* { dg-options "-O -fno-pic -g
Re: [PATCH 7/9] [libbacktrace] Handle DW_FORM_GNU_ref_alt
On Thu, Jan 17, 2019 at 6:14 AM Tom de Vries wrote: > > On 17-01-19 01:35, Ian Lance Taylor wrote: > > On Wed, Jan 16, 2019 at 4:17 PM Tom de Vries wrote: > >> > >> this handles DW_FORM_GNU_ref_alt which references the .debug_info > >> section in the .gnu_debugaltlink file. > >> > >> OK for trunk? > >> > >> Thanks, > >> - Tom > >> > >> On 11-12-18 11:14, Tom de Vries wrote: > >>> 2018-12-10 Tom de Vries > >>> > >>> * dwarf.c (enum attr_val_encoding): Add ATTR_VAL_REF_ALT_INFO. > >>> (struct unit): Add low and high fields. > >>> (struct unit_vector): New type. > >>> (struct dwarf_data): Add units and units_counts fields. > >>> (read_attribute): Handle DW_FORM_GNU_ref_alt using > >>> ATTR_VAL_REF_ALT_INFO. > >>> (find_unit): New function. > >>> (find_address_ranges): Add and handle unit_tag parameter. > >>> (build_address_map): Add and handle units_vec parameter. > >>> (read_referenced_name_1): Handle DW_FORM_GNU_ref_alt. > >>> (build_dwarf_data): Pass units_vec to build_address_map. Store > >>> resulting > >>> units vector. > > > > > >>> @@ -281,6 +283,10 @@ struct unit > >>>/* The offset of UNIT_DATA from the start of the information for > >>> this compilation unit. */ > >>>size_t unit_data_offset; > >>> + /* Start of the compilation unit. */ > >>> + size_t low; > >>> + /* End of the compilation unit. */ > >>> + size_t high; > > > > The comments should say what low and high are measured in, which I > > guess is file offset. Or is it offset from the start of UNIT_DATA? > > Either way, if that is right, then the fields should be named > > low_offset and high_offset. Otherwise it seems easy to confuse with > > function_addrs, where low and high refer to PC values. > > > > Done. > > > Also if they are offsets from UNIT_DATA then size_t is OK, but if they > > are file offsets they should be off_t. > > > > AFAIU, in the case where off_t vs size_t would make a difference, we're > running into trouble much earlier. 
I've filed PR 88890 - "libbacktrace > on 32-bit system with _FILE_OFFSET_BITS == 64" ( > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88890 ) about this. > > Anyway, I've made the conservative choice of using off_t for now (but I > could argue that it's a memory offset, given that the assumption is that > the entire debug section is read into memory). Since the entire debug section is read into memory, if they are offsets from UNIT_DATA, they should be size_t, not off_t. I wasn't trying to say that we should make a conservative choice here, I was trying to say that we should make a correct choice. An offset from UNIT_DATA should be size_t, a file offset should be off_t. Ian
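To make the distinction concrete, here is a small sketch of a unit lookup keyed on in-memory offsets, which is why size_t is used for the bounds. It is not the dwarf.c implementation: struct unit_range, unit_range_search and find_unit_range are hypothetical names, and the table is assumed to be sorted by low_offset with non-overlapping ranges.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: where one compilation unit lives inside a
   debug section that has been read into memory.  Both bounds are
   offsets into that buffer, hence size_t rather than off_t.  */
struct unit_range
{
  size_t low_offset;   /* Inclusive.  */
  size_t high_offset;  /* Exclusive.  */
};

/* bsearch comparison: VKEY points to a size_t offset, VENTRY to a
   struct unit_range.  */
static int
unit_range_search (const void *vkey, const void *ventry)
{
  const size_t *key = (const size_t *) vkey;
  const struct unit_range *entry = (const struct unit_range *) ventry;

  if (*key < entry->low_offset)
    return -1;
  if (*key >= entry->high_offset)
    return 1;
  return 0;
}

/* Return the unit containing OFFSET, or NULL if there is none.  */
static const struct unit_range *
find_unit_range (const struct unit_range *units, size_t count,
                 size_t offset)
{
  return bsearch (&offset, units, count, sizeof (struct unit_range),
                  unit_range_search);
}
```

If the bounds were positions in the file rather than in the loaded buffer, the same structure would want off_t fields instead, which is the choice being discussed.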
Re: [PATCH] Reset proper type on vector types (PR middle-end/88587).
On Thu, Jan 17, 2019 at 4:51 AM Richard Biener wrote: > > On Thu, Jan 17, 2019 at 12:21 PM Martin Liška wrote: > > > > On 1/16/19 1:06 PM, Richard Biener wrote: > > > On Wed, Jan 16, 2019 at 10:20 AM Martin Liška wrote: > > >> > > >> Hi. > > >> > > >> The patch is about resetting TYPE_MODE of vector types. This is > > >> problematic > > >> when an inlining among different ISAs happen. Then we end up with a > > >> different > > >> mode than when it's expected from debug info. > > >> > > >> When creating a new function decl in target_clones, we must > > >> valid_attribute_p early > > >> so that the declaration has a proper cl_target_.. node and so that > > >> inliner can > > >> fix modes. > > >> > > >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > >> > > >> Ready to be installed? > > > > > > I don't like the new failure mode too much. It looks like > > > create_version_clone_with_body > > > can fail so why not simply return NULL when > > > targetm.target_option.valid_attribute_p > > > returns false and handle that case in multi-versioning? > > > > > > That is, > > > > > > + return !seen_error (); > > > > > > that looks very wrong to me. > > > > Yep, update patch should be better. > > > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > > > Ready to be installed? > > OK. > > Thanks, > Richard. > > > Thanks, > > Martin > > > > > > > > Richard. > > > > > >> Thanks, > > >> Martin > > >> > > >> gcc/ChangeLog: > > >> > > >> 2019-01-16 Martin Liska > > >> Richard Biener > > >> > > >> PR middle-end/88587 > > >> * cgraph.h (create_version_clone_with_body): Add new argument > > >> with attributes. > > >> * cgraphclones.c (cgraph_node::create_version_clone): Add > > >> DECL_ATTRIBUTES to a newly created decl. And call > > >> valid_attribute_p so that proper cl_target_optimization_node > > >> is set for the newly created declaration. 
> > >> * multiple_target.c (create_target_clone): Set DECL_ATTRIBUTES > > >> for declaration. > > >> (expand_target_clones): Do not call valid_attribute_p, it must > > >> be already done. > > >> * tree-inline.c (copy_decl_for_dup_finish): Reset mode for > > >> vector types. > > >> > > >> gcc/testsuite/ChangeLog: > > >> > > >> 2019-01-16 Martin Liska > > >> > > >> PR middle-end/88587 > > >> * g++.target/i386/pr88587.C: New test. > > >> * gcc.target/i386/mvc13.c: New test. > > >> --- > > >> gcc/cgraph.h| 7 +- > > >> gcc/cgraphclones.c | 18 +- > > >> gcc/multiple_target.c | 32 - > > >> gcc/testsuite/g++.target/i386/pr88587.C | 15 > > >> gcc/testsuite/gcc.target/i386/mvc13.c | 9 +++ > > >> gcc/tree-inline.c | 4 > > >> 6 files changed, 61 insertions(+), 24 deletions(-) > > >> create mode 100644 gcc/testsuite/g++.target/i386/pr88587.C > > >> create mode 100644 gcc/testsuite/gcc.target/i386/mvc13.c > > >> > > >> > > It is wrong to use -m32 in dg-options: /* { dg-do compile } */ /* { dg-require-ifunc "" } */ /* { dg-options "-O -m32 -g -mno-sse" } */ You should use /* { dg-do compile { target ia32 } } */ Since g++.target/i386/pr88587.C doesn't support -fPIC, [hjl@gnu-cfl-1 gcc]$ ./xgcc -B./ /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C -mx32 -fpic -S /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C:6:6: warning: always_inline function might not be inlinable [-Wattributes] 6 | void a() | ^ /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C: In function ‘void a2()’: /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C:6:6: error: inlining failed in call to always_inline ‘void a()’: function body can be overwritten at link time /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.target/i386/pr88587.C:13:5: note: called from here 13 | a (); | ~~^~ [hjl@gnu-cfl-1 gcc]$ you should add -fno-pic. -- H.J.
Re: [PATCH 9/9] [libbacktrace] Add printdwarftest_dwz_cmp.sh test-case
On Thu, Jan 17, 2019 at 5:59 AM Tom de Vries wrote: > > now that the rest of the patch series has been committed, here's an > updated version of this patch that applies to trunk. I would much rather put dwarf_data into internal.h than #include "dwarf.c" from a different file. Using #include with a .c file is just a bad path to walk down. Ian
Re: C++ PATCH for c++/78244 - narrowing conversion in template not detected, part 2
On Thu, Jan 17, 2019 at 04:17:29PM -0500, Jason Merrill wrote: > On 1/17/19 2:09 PM, Marek Polacek wrote: > > This patch ought to fix the rest of 78244, a missing narrowing warning in > > decltype. > > > > As I explained in Bugzilla, there can be three scenarios: > > > > 1) decltype is in a template and it has no dependent expressions, which > > is the problematical case. finish_compound_literal just returns the > > compound literal without checking narrowing if processing_template_decl. > > This is the sort of thing that we've been gradually fixing: if the compound > literal isn't dependent at all, we want to do the normal processing. And > then usually return a result based on the original trees rather than the > result of processing. For instance, finish_call_expr. Something like that > ought to work here, too, and be more generally applicable; this shouldn't be > limited to casting to a scalar type, casting to a known class type can also > involve narrowing. Great, that works just fine. I also had to check if the type is type-dependent, otherwise complete_type could fail. > The check in the other patch that changes instantiation_dependent_r should > be more similar to the one in finish_compound_literal. Or perhaps you could > set a flag here in finish_compound_literal to indicate that it's > instantiation-dependent, and just check that in instantiation_dependent_r. Done, but I feel bad about adding another flag. But I guess it's cheaper this way. Thanks! Bootstrapped/regtested on x86_64-linux, ok for trunk? 2019-01-18 Marek Polacek PR c++/88815 - narrowing conversion lost in decltype. PR c++/78244 - narrowing conversion in template not detected. * cp-tree.h (CONSTRUCTOR_IS_DEPENDENT): New. * pt.c (instantiation_dependent_r): Consider a CONSTRUCTOR with CONSTRUCTOR_IS_DEPENDENT instantiation-dependent. 
* semantics.c (finish_compound_literal): When the compound literal isn't instantiation-dependent and the type isn't type-dependent, fall back to the normal processing. Don't only call check_narrowing for scalar types. Set CONSTRUCTOR_IS_DEPENDENT. * g++.dg/cpp0x/Wnarrowing15.C: New test. * g++.dg/cpp0x/constexpr-decltype3.C: New test. * g++.dg/cpp1y/Wnarrowing1.C: New test. diff --git gcc/cp/cp-tree.h gcc/cp/cp-tree.h index 5cc8f88d522..778874cccd6 100644 --- gcc/cp/cp-tree.h +++ gcc/cp/cp-tree.h @@ -424,6 +424,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX]; DECL_FINAL_P (in FUNCTION_DECL) QUALIFIED_NAME_IS_TEMPLATE (in SCOPE_REF) DECLTYPE_FOR_INIT_CAPTURE (in DECLTYPE_TYPE) + CONSTRUCTOR_IS_DEPENDENT (in CONSTRUCTOR) TINFO_USED_TEMPLATE_ID (in TEMPLATE_INFO) PACK_EXPANSION_SIZEOF_P (in *_PACK_EXPANSION) OVL_USING_P (in OVERLOAD) @@ -4202,6 +4203,11 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter) B b{1,2}, not B b({1,2}) or B b = {1,2}. */ #define CONSTRUCTOR_IS_DIRECT_INIT(NODE) (TREE_LANG_FLAG_0 (CONSTRUCTOR_CHECK (NODE))) +/* True if this CONSTRUCTOR is instantiation-dependent and needs to be + substituted. */ +#define CONSTRUCTOR_IS_DEPENDENT(NODE) \ + (TREE_LANG_FLAG_1 (CONSTRUCTOR_CHECK (NODE))) + /* True if this CONSTRUCTOR should not be used as a variable initializer because it was loaded from a constexpr variable with mutable fields. 
*/ #define CONSTRUCTOR_MUTABLE_POISON(NODE) \ diff --git gcc/cp/pt.c gcc/cp/pt.c index e4f76478f54..ae77bae6b29 100644 --- gcc/cp/pt.c +++ gcc/cp/pt.c @@ -25800,6 +25800,11 @@ instantiation_dependent_r (tree *tp, int *walk_subtrees, return *tp; break; +case CONSTRUCTOR: + if (CONSTRUCTOR_IS_DEPENDENT (*tp)) + return *tp; + break; + default: break; } diff --git gcc/cp/semantics.c gcc/cp/semantics.c index e654750d249..4ff09ad3fb7 100644 --- gcc/cp/semantics.c +++ gcc/cp/semantics.c @@ -2795,11 +2795,14 @@ finish_compound_literal (tree type, tree compound_literal, return error_mark_node; } - if (processing_template_decl) + if (instantiation_dependent_expression_p (compound_literal) + || dependent_type_p (type)) { TREE_TYPE (compound_literal) = type; /* Mark the expression as a compound literal. */ TREE_HAS_CONSTRUCTOR (compound_literal) = 1; + /* And as instantiation-dependent. */ + CONSTRUCTOR_IS_DEPENDENT (compound_literal) = true; if (fcl_context == fcl_c99) CONSTRUCTOR_C99_COMPOUND_LITERAL (compound_literal) = 1; return compound_literal; @@ -2822,8 +2825,7 @@ finish_compound_literal (tree type, tree compound_literal, && check_array_initializer (NULL_TREE, type, compound_literal)) return error_mark_node; compound_literal = reshape_init (type, compound_literal, complain); - if (SCALAR_TYPE_P (type) - && !BRACE_ENCLOSED_INITIALIZER_P (compound_literal) + if (!BRACE_ENCLOSED_INITIALIZER_P
Re: [arm] PR target/88799 Add +mp and +sec extensions to ARMv7-a
On 18/01/2019 11:52, Richard Earnshaw (lists) wrote: > Most armv7-a implementations support a number of basic extensions to the > architecture which are not particularly important to the compiler, but > can matter if code contains inline assembly. This patch adds support > for these extensions, based on the capabilities that GAS already > provides for the appropriate CPUs. For the purposes of multilib > selection we ignore these extensions entirely and map the extended > architecture versions down to the base versions we have already support for. > > gcc: > PR target/88799 > * config/arm/arm-cpus.in (mp): New feature. > (sec): New feature. > (fgroup ARMv7ve): Add mp and sec features. > (arch armv7-a): Add options to allow mp and sec extensions. > (cpu generic-armv7-a): Add options to allow mp and sec extensions. > (cpu cortex-a5, cpu cortex-7, cpu cortex-a9): Add mp and sec > extenstions to the base architecture. > (cpu cortex-a8): Add sec extension to the base architecture. > (cpu marvell-pj4): Add mp and sec extensions to the base architecture. > * config/arm/t-aprofile (MULTILIB_MATCHES): Map all armv7-a arch > variants down to the base v7-a varaint. > * config/arm/t-multilib (v7_a_arch_variants): New variable. > * doc/invoke.texi (ARM Options): Add +mp and +sec to the list > of permitted extensions for -march=armv7-a and for > -mcpu=generic-armv7-a. > > testsuite: > * gcc.target/arm/multilib.exp (config "aprofile"): Add tests for > mp and sec extensions to armv7-a. > > Applied to trunk. There are a couple of minor tweaks needed for the > backport to gcc-8. I'll post those shortly. > And this is the GCC-8 backport (ChangeLog identical). diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in index 8ba89ae..ba194a8 100644 --- a/gcc/config/arm/arm-cpus.in +++ b/gcc/config/arm/arm-cpus.in @@ -96,6 +96,12 @@ define feature armv7em # Architecture rel 7. 
define feature armv7 +# MP extension to ArmV7-A +define feature mp + +# SEC extension to ArmV7-A +define feature sec + # ARM division instructions. define feature adiv @@ -240,7 +246,7 @@ define fgroup ARMv6m mode32 armv3m armv4 thumb armv5 armv5e armv6 be8 define fgroup ARMv7 ARMv6m thumb2 armv7 define fgroup ARMv7a ARMv7 notm armv6k -define fgroup ARMv7ve ARMv7a adiv tdiv lpae +define fgroup ARMv7ve ARMv7a adiv tdiv lpae mp sec define fgroup ARMv7r ARMv7a tdiv define fgroup ARMv7m ARMv7 tdiv define fgroup ARMv7em ARMv7m armv7em @@ -474,6 +480,8 @@ begin arch armv7-a base 7A profile A isa ARMv7a + option mp add mp + option sec add sec # fp => VFPv3-d16, simd => neon-vfpv3 option fp add VFPv3 FP_DBL optalias vfpv3-d16fp @@ -1224,6 +1232,8 @@ begin cpu generic-armv7-a cname genericv7a tune flags LDSCHED architecture armv7-a + option mp add mp + option sec add sec fpu vfpv3-d16 option vfpv3-d16 add VFPv3 FP_DBL option vfpv3 add VFPv3 FP_D32 @@ -1244,7 +1254,7 @@ end cpu generic-armv7-a begin cpu cortex-a5 cname cortexa5 tune flags LDSCHED - architecture armv7-a + architecture armv7-a+mp+sec fpu neon-fp16 option nosimd remove ALL_SIMD option nofp remove ALL_FP @@ -1264,7 +1274,7 @@ end cpu cortex-a7 begin cpu cortex-a8 cname cortexa8 tune flags LDSCHED - architecture armv7-a + architecture armv7-a+sec fpu neon-vfpv3 option nofp remove ALL_FP costs cortex_a8 @@ -1273,7 +1283,7 @@ end cpu cortex-a8 begin cpu cortex-a9 cname cortexa9 tune flags LDSCHED - architecture armv7-a + architecture armv7-a+mp+sec fpu neon-fp16 option nosimd remove ALL_SIMD option nofp remove ALL_FP @@ -1384,7 +1394,7 @@ end cpu cortex-m3 begin cpu marvell-pj4 tune flags LDSCHED - architecture armv7-a + architecture armv7-a+mp+sec costs marvell_pj4 end cpu marvell-pj4 diff --git a/gcc/config/arm/t-aprofile b/gcc/config/arm/t-aprofile index 7b55599..6f319c4 100644 --- a/gcc/config/arm/t-aprofile +++ b/gcc/config/arm/t-aprofile @@ -49,14 +49,26 @@ MULTILIB_REQUIRED += 
mthumb/march=armv8-a+simd/mfloat-abi=softfp # Matches # Arch Matches +# Map all basic v7-a arch extensions to v7-a +MULTILIB_MATCHES += $(foreach ARCH, $(v7_a_arch_variants), \ + march?armv7-a=march?armv7-a$(ARCH)) + # Map all v7-a FP variants to vfpv3-d16 (+fp) MULTILIB_MATCHES += $(foreach ARCH, $(filter-out +fp, $(v7_a_nosimd_variants)), \ march?armv7-a+fp=march?armv7-a$(ARCH)) +MULTILIB_MATCHES += $(foreach ARCHVAR, $(v7_a_arch_variants), \ + $(foreach ARCH, $(v7_a_nosimd_variants), \ + march?armv7-a+fp=march?armv7-a$(ARCHVAR)$(ARCH))) + # Map all v7-a SIMD variants to neon-vfpv3 (+simd) MULTILIB_MATCHES += $(foreach ARCH, $(filter-out +simd, $(v7_a_simd_variants)), \ march?armv7-a+simd=march?armv7-a$(ARCH)) +MULTILIB_MATCHES += $(foreach ARCHVAR, $(v7_a_arch_variants), \ + $(foreach ARCH,
[PATCH] Fix PR88903
The following fixes wrong-code when SLP vectorizing shifts where we may end up detecting the shift amount as scalar even though it really isn't. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied on trunk sofar. Richard. 2019-01-18 Richard Biener PR tree-optimization/88903 * tree-vect-stmts.c (vectorizable_shift): Verify we see all scalar stmts a SLP shift amount is composed of when detecting shifts by scalars. * gcc.dg/vect/pr88903-1.c: New testcase. * gcc.dg/vect/pr88903-2.c: Likewise. Index: gcc/testsuite/gcc.dg/vect/pr88903-1.c === --- gcc/testsuite/gcc.dg/vect/pr88903-1.c (nonexistent) +++ gcc/testsuite/gcc.dg/vect/pr88903-1.c (working copy) @@ -0,0 +1,26 @@ +#include "tree-vect.h" + +int x[1024]; + +void __attribute__((noinline)) +foo() +{ + for (int i = 0; i < 512; ++i) +{ + x[2*i] = x[2*i] << (i+1); + x[2*i+1] = x[2*i+1] << (i+1); +} +} + +int +main() +{ + check_vect (); + for (int i = 0; i < 1024; ++i) +x[i] = i; + foo (); + for (int i = 0; i < 1024; ++i) +if (x[i] != i << (i/2+1)) + __builtin_abort (); + return 0; +} Index: gcc/testsuite/gcc.dg/vect/pr88903-2.c === --- gcc/testsuite/gcc.dg/vect/pr88903-2.c (nonexistent) +++ gcc/testsuite/gcc.dg/vect/pr88903-2.c (working copy) @@ -0,0 +1,28 @@ +#include "tree-vect.h" + +int x[1024]; +int y[1024]; +int z[1024]; + +void __attribute__((noinline)) foo() +{ + for (int i = 0; i < 512; ++i) +{ + x[2*i] = x[2*i] << y[2*i]; + x[2*i+1] = x[2*i+1] << y[2*i]; + z[2*i] = y[2*i]; + z[2*i+1] = y[2*i+1]; +} +} + +int main() +{ + check_vect (); + for (int i = 0; i < 1024; ++i) +x[i] = i, y[i] = i % 8; + foo (); + for (int i = 0; i < 1024; ++i) +if (x[i] != i << ((i & ~1) % 8)) + __builtin_abort (); + return 0; +} Index: gcc/tree-vect-stmts.c === --- gcc/tree-vect-stmts.c (revision 268068) +++ gcc/tree-vect-stmts.c (working copy) @@ -5540,6 +5540,15 @@ vectorizable_shift (stmt_vec_info stmt_i if (!operand_equal_p (gimple_assign_rhs2 (slpstmt), op1, 0)) scalar_shift_arg = false; } + + /* For internal SLP defs we 
have to make sure we see scalar stmts +for all vector elements. +??? For different vectors we could resort to a different +scalar shift operand but code-generation below simply always +takes the first. */ + if (dt[1] == vect_internal_def + && maybe_ne (nunits_out * SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node), stmts.length ())) + scalar_shift_arg = false; } /* If the shift amount is computed by a pattern stmt we cannot
Re: PATCH: Updated error messages for ill-formed cases of array initialization by string literal
Hi Jason, > On 1/15/19 12:59 PM, Joseph Myers wrote: >> On Tue, 15 Jan 2019, Jason Merrill wrote: >> >>> I actually incorporated the C++ part of these changes into yesterday's >>> commit, >>> using Martin's first suggestion. Here's the adjusted C patch, which I'd >>> like >>> a C maintainer to approve. >> >> The front-end changes are OK. However, in the testcase changes, some of >> the new expected diagnostics are hardcoding that "unsigned int" is the >> type of char32_t, which isn't correct for all platforms (for example, it's >> definitely not the type when int is 16-bit). In principle the same >> applies to diagnostics hardcoding the choice of char16_t, although >> variations are at least less likely there. > > This updated patch removes {short ,}unsigned int from the expected > diagnostics. And also improves error_init to accept additional arguments, > like pedwarn_init already does. > > Tested x86_64-pc-linux-gnu. There are now a couple of failures on several (32-bit?) targets: +FAIL: gcc.dg/utf-array.c (test for errors, line 15) +FAIL: gcc.dg/utf-array.c (test for errors, line 21) +FAIL: gcc.dg/utf-array.c (test for errors, line 33) I'm seeing it on i386-pc-solaris2.11 (32-bit only), and there are also gcc-testresults reports on i686-pc-linux-gnu, m68k-unknown-linux-gnu, and x86_64-pc-linux-gnu. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
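For readers following the point about char32_t above, here is a minimal C11 sketch, not taken from the patch or from gcc.dg/utf-array.c. It relies on the C11 rule that char32_t and char16_t are typedefs for uint_least32_t and uint_least16_t, and uses those names directly so that <uchar.h> is not needed. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C11: char16_t/char32_t (from <uchar.h>) are the same types as
   uint_least16_t/uint_least32_t, so arrays of those types can be
   initialized from u"" and U"" string literals.  */
static const uint_least16_t s16[] = u"abc";
static const uint_least32_t s32[] = U"abc";

/* Element counts, including the terminating zero element.  */
size_t utf16_count (void) { return sizeof s16 / sizeof s16[0]; }
size_t utf32_count (void) { return sizeof s32 / sizeof s32[0]; }

/* Hypothetical probe: report which builtin type backs char32_t on
   this target.  1 = unsigned int, 2 = unsigned long, 0 = other.
   The answer is target-dependent, which is why a testsuite should
   not hardcode "unsigned int" in expected diagnostics.  */
int
char32_underlying_kind (void)
{
  assert (sizeof (uint_least32_t) * 8 >= 32);
  return _Generic ((uint_least32_t) 0,
		   unsigned int: 1,
		   unsigned long: 2,
		   default: 0);
}
```

On a target with 16-bit int the probe would not return 1, which is the case Joseph describes.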
Re: [PATCH] avoid issuing -Warray-bounds during folding (PR 88800)
Hi Christophe, > After your commit (r268037), I'm seeing excess errors on some arm targets: > FAIL: c-c++-common/Wrestrict.c -Wc++-compat (test for excess errors) > Excess errors: > /gcc/testsuite/c-c++-common/Wrestrict.c:195:3: warning: 'memcpy' > accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 > bytes at offset [2, 3] [-Wrestrict] > /gcc/testsuite/c-c++-common/Wrestrict.c:202:3: warning: 'memcpy' > accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 > bytes at offset [2, 3] [-Wrestrict] > /gcc/testsuite/c-c++-common/Wrestrict.c:207:3: warning: 'memcpy' > accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 > bytes at offset [2, 3] [-Wrestrict] I'm seeing the same on sparc-sun-solaris2.*, both 32 and 64-bit. Test results for x86_64-w64-mingw32 and ia64-suse-linux-gnu show the same failure. Besides (and probably caused by the same revision), I now get +XPASS: c-c++-common/Warray-bounds-3.c -std=gnu++14 bug (test for warnings, line 161) +XPASS: c-c++-common/Warray-bounds-3.c -std=gnu++17 bug (test for warnings, line 161) +XPASS: c-c++-common/Warray-bounds-3.c -std=gnu++98 bug (test for warnings, line 161) +XPASS: c-c++-common/Warray-bounds-3.c -Wc++-compat bug (test for warnings, line 161) which is also seen on ia64-suse-linux-gnu. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [arm] PR target/88799 Add +mp and +sec extensions to ARMv7-a
And this time with the patch... On 18/01/2019 11:52, Richard Earnshaw (lists) wrote: > Most armv7-a implementations support a number of basic extensions to the > architecture which are not particularly important to the compiler, but > can matter if code contains inline assembly. This patch adds support > for these extensions, based on the capabilities that GAS already > provides for the appropriate CPUs. For the purposes of multilib > selection we ignore these extensions entirely and map the extended > architecture versions down to the base versions we have already support for. > > gcc: > PR target/88799 > * config/arm/arm-cpus.in (mp): New feature. > (sec): New feature. > (fgroup ARMv7ve): Add mp and sec features. > (arch armv7-a): Add options to allow mp and sec extensions. > (cpu generic-armv7-a): Add options to allow mp and sec extensions. > (cpu cortex-a5, cpu cortex-7, cpu cortex-a9): Add mp and sec > extenstions to the base architecture. > (cpu cortex-a8): Add sec extension to the base architecture. > (cpu marvell-pj4): Add mp and sec extensions to the base architecture. > * config/arm/t-aprofile (MULTILIB_MATCHES): Map all armv7-a arch > variants down to the base v7-a varaint. > * config/arm/t-multilib (v7_a_arch_variants): New variable. > * doc/invoke.texi (ARM Options): Add +mp and +sec to the list > of permitted extensions for -march=armv7-a and for > -mcpu=generic-armv7-a. > > testsuite: > * gcc.target/arm/multilib.exp (config "aprofile"): Add tests for > mp and sec extensions to armv7-a. > > Applied to trunk. There are a couple of minor tweaks needed for the > backport to gcc-8. I'll post those shortly. > diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in index 7880c4ae347..f53bdab8ac9 100644 --- a/gcc/config/arm/arm-cpus.in +++ b/gcc/config/arm/arm-cpus.in @@ -87,6 +87,12 @@ define feature armv7em # Architecture rel 7. 
define feature armv7 +# MP extension to ArmV7-A +define feature mp + +# SEC extension to ArmV7-A +define feature sec + # ARM division instructions. define feature adiv @@ -237,7 +243,7 @@ define fgroup ARMv6m armv4 thumb armv5t armv5te armv6 be8 define fgroup ARMv7 ARMv6m thumb2 armv7 define fgroup ARMv7a ARMv7 notm armv6k -define fgroup ARMv7ve ARMv7a adiv tdiv lpae +define fgroup ARMv7ve ARMv7a adiv tdiv lpae mp sec define fgroup ARMv7r ARMv7a tdiv define fgroup ARMv7m ARMv7 tdiv define fgroup ARMv7em ARMv7m armv7em @@ -425,6 +431,8 @@ begin arch armv7-a base 7A profile A isa ARMv7a + option mp add mp + option sec add sec # fp => VFPv3-d16, simd => neon-vfpv3 option fp add VFPv3 FP_DBL optalias vfpv3-d16fp @@ -968,6 +976,8 @@ begin cpu generic-armv7-a cname genericv7a tune flags LDSCHED architecture armv7-a+fp + option mp add mp + option sec add sec option vfpv3-d16 add VFPv3 FP_DBL option vfpv3 add VFPv3 FP_D32 option vfpv3-d16-fp16 add VFPv3 FP_DBL fp16conv @@ -987,7 +997,7 @@ end cpu generic-armv7-a begin cpu cortex-a5 cname cortexa5 tune flags LDSCHED - architecture armv7-a+neon-fp16 + architecture armv7-a+mp+sec+neon-fp16 option nosimd remove ALL_SIMD option nofp remove ALL_FP costs cortex_a5 @@ -1009,7 +1019,7 @@ end cpu cortex-a7 begin cpu cortex-a8 cname cortexa8 tune flags LDSCHED - architecture armv7-a+simd + architecture armv7-a+sec+simd option nofp remove ALL_FP costs cortex_a8 vendor 41 @@ -1019,7 +1029,7 @@ end cpu cortex-a8 begin cpu cortex-a9 cname cortexa9 tune flags LDSCHED - architecture armv7-a+neon-fp16 + architecture armv7-a+mp+sec+neon-fp16 option nosimd remove ALL_SIMD option nofp remove ALL_FP costs cortex_a9 @@ -1140,7 +1150,7 @@ end cpu cortex-m3 begin cpu marvell-pj4 tune flags LDSCHED - architecture armv7-a + architecture armv7-a+mp+sec costs marvell_pj4 end cpu marvell-pj4 diff --git a/gcc/config/arm/t-aprofile b/gcc/config/arm/t-aprofile index 1de5f296942..1556f1b23e3 100644 --- a/gcc/config/arm/t-aprofile +++ 
b/gcc/config/arm/t-aprofile @@ -49,14 +49,26 @@ MULTILIB_REQUIRED += mthumb/march=armv8-a+simd/mfloat-abi=softfp # Matches # Arch Matches +# Map all basic v7-a arch extensions to v7-a +MULTILIB_MATCHES += $(foreach ARCH, $(v7_a_arch_variants), \ + march?armv7-a=march?armv7-a$(ARCH)) + # Map all v7-a FP variants to vfpv3-d16 (+fp) MULTILIB_MATCHES += $(foreach ARCH, $(filter-out +fp, $(v7_a_nosimd_variants)), \ march?armv7-a+fp=march?armv7-a$(ARCH)) +MULTILIB_MATCHES += $(foreach ARCHVAR, $(v7_a_arch_variants), \ + $(foreach ARCH, $(v7_a_nosimd_variants), \ + march?armv7-a+fp=march?armv7-a$(ARCHVAR)$(ARCH))) + # Map all v7-a SIMD variants to neon-vfpv3 (+simd) MULTILIB_MATCHES += $(foreach ARCH, $(filter-out +simd, $(v7_a_simd_variants)), \ march?armv7-a+simd=march?armv7-a$(ARCH)) +MULTILIB_MATCHES += $(foreach
[arm] PR target/88799 Add +mp and +sec extensions to ARMv7-a
Most armv7-a implementations support a number of basic extensions to the architecture which are not particularly important to the compiler, but can matter if code contains inline assembly. This patch adds support for these extensions, based on the capabilities that GAS already provides for the appropriate CPUs. For the purposes of multilib selection we ignore these extensions entirely and map the extended architecture versions down to the base versions we already have support for. gcc: PR target/88799 * config/arm/arm-cpus.in (mp): New feature. (sec): New feature. (fgroup ARMv7ve): Add mp and sec features. (arch armv7-a): Add options to allow mp and sec extensions. (cpu generic-armv7-a): Add options to allow mp and sec extensions. (cpu cortex-a5, cpu cortex-7, cpu cortex-a9): Add mp and sec extensions to the base architecture. (cpu cortex-a8): Add sec extension to the base architecture. (cpu marvell-pj4): Add mp and sec extensions to the base architecture. * config/arm/t-aprofile (MULTILIB_MATCHES): Map all armv7-a arch variants down to the base v7-a variant. * config/arm/t-multilib (v7_a_arch_variants): New variable. * doc/invoke.texi (ARM Options): Add +mp and +sec to the list of permitted extensions for -march=armv7-a and for -mcpu=generic-armv7-a. testsuite: * gcc.target/arm/multilib.exp (config "aprofile"): Add tests for mp and sec extensions to armv7-a. Applied to trunk. There are a couple of minor tweaks needed for the backport to gcc-8. I'll post those shortly.
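To illustrate the "can matter if code contains inline assembly" remark, here is a hedged sketch rather than anything from the patch. PLDW (preload with intent to write) is part of the ARMv7-A Multiprocessing Extensions, so the assembler is expected to accept it only when the selected architecture includes +mp. The macro USE_ARM_MP_EXTENSION and both function names are hypothetical; on non-Arm hosts the code falls back to __builtin_prefetch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: hint that *p is about to be written.  */
static inline void
prefetch_for_write (const void *p)
{
#if defined (__arm__) && defined (USE_ARM_MP_EXTENSION)
  /* Assumption: built with something like -march=armv7-a+mp, so
     that the assembler accepts the MP-extension instruction.  */
  __asm__ volatile ("pldw [%0]" : : "r" (p) : "memory");
#else
  /* Portable fallback; second argument 1 means "prepare for write".  */
  __builtin_prefetch (p, 1);
#endif
}

/* Sum the buffer and zero it, prefetching each element for write.  */
int
sum_and_clear (int *buf, size_t n)
{
  assert (buf != NULL || n == 0);
  int sum = 0;
  for (size_t i = 0; i < n; ++i)
    {
      prefetch_for_write (&buf[i]);
      sum += buf[i];
      buf[i] = 0;
    }
  return sum;
}
```

Since the extensions are ignored for multilib selection, a build with -march=armv7-a+mp should link against the same libraries as plain -march=armv7-a.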
[patch] Document AMD GCN features.
Hi, This patch adds the documentation needed for the newly-added AMD GCN back end. OK to commit? Andrew Document AMD GCN. 2019-01-18 Andrew Stubbs gcc/ * doc/extend.texi (AMD GCN Function Attributes): New section. * doc/install.texi (amdgcn-unknown-amdhsa): New instructions. * doc/invoke.texi (AMD GCN Options): New section. * doc/md.texi (Constraints for Particular Machines): Add AMD GCN. diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index ebd5648..465de30 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -2393,6 +2393,7 @@ GCC plugins may provide their own attributes. @menu * Common Function Attributes:: * AArch64 Function Attributes:: +* AMD GCN Function Attributes:: * ARC Function Attributes:: * ARM Function Attributes:: * AVR Function Attributes:: @@ -3954,6 +3955,96 @@ Note that CPU tuning options and attributes such as the @option{-mcpu=}, @option{-mcpu=} option or the @code{cpu=} attribute conflicts with the architectural feature rules specified above. +@node AMD GCN Function Attributes +@subsection AMD GCN Function Attributes + +These function attributes are supported by the AMD GCN back end: + +@table @code +@item amdgpu_hsa_kernel +@cindex @code{amdgpu_hsa_kernel} function attribute, AMD GCN +This attribute indicates that the corresponding function should be compiled as +a kernel function, that is an entry point that can be invoked from the host +via the HSA runtime library. By default functions are callable only from +other GCN functions. + +This attribute is implicitly applied to any function named @code{main}, using +default parameters. + +Kernel functions may return an integer value, which will be written to a +conventional place within the HSA "kernargs" region. + +The attribute parameters configure what values are passed into the kernel +function by the GPU drivers, via the initial register state. Some values are +used by the compiler, and therefore forced on.
Enabling other options may +break assumptions in the compiler and/or run-time libraries. + +@table @code +@item private_segment_buffer +Set @code{enable_sgpr_private_segment_buffer} flag. Always on (required to +locate the stack). + +@item dispatch_ptr +Set @code{enable_sgpr_dispatch_ptr} flag. Always on (required to locate the +launch dimensions). + +@item queue_ptr +Set @code{enable_sgpr_queue_ptr} flag. Always on (required to convert address +spaces). + +@item kernarg_segment_ptr +Set @code{enable_sgpr_kernarg_segment_ptr} flag. Always on (required to +locate the kernel arguments, "kernargs"). + +@item dispatch_id +Set @code{enable_sgpr_dispatch_id} flag. + +@item flat_scratch_init +Set @code{enable_sgpr_flat_scratch_init} flag. + +@item private_segment_size +Set @code{enable_sgpr_private_segment_size} flag. + +@item grid_workgroup_count_X +Set @code{enable_sgpr_grid_workgroup_count_x} flag. Always on (required to +use OpenACC/OpenMP). + +@item grid_workgroup_count_Y +Set @code{enable_sgpr_grid_workgroup_count_y} flag. + +@item grid_workgroup_count_Z +Set @code{enable_sgpr_grid_workgroup_count_z} flag. + +@item workgroup_id_X +Set @code{enable_sgpr_workgroup_id_x} flag. + +@item workgroup_id_Y +Set @code{enable_sgpr_workgroup_id_y} flag. + +@item workgroup_id_Z +Set @code{enable_sgpr_workgroup_id_z} flag. + +@item workgroup_info +Set @code{enable_sgpr_workgroup_info} flag. + +@item private_segment_wave_offset +Set @code{enable_sgpr_private_segment_wave_byte_offset} flag. Always on +(required to locate the stack). + +@item work_item_id_X +Set @code{enable_vgpr_workitem_id} parameter. Always on (can't be disabled). + +@item work_item_id_Y +Set @code{enable_vgpr_workitem_id} parameter. Always on (required to enable +vectorization.) + +@item work_item_id_Z +Set @code{enable_vgpr_workitem_id} parameter. Always on (required to use +OpenACC/OpenMP). 
+ +@end table +@end table + @node ARC Function Attributes @subsection ARC Function Attributes diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index d5e1edb..81a15a0 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -3447,6 +3447,27 @@ This is a synonym for @samp{x86_64-*-solaris2.1[0-9]*}. @html @end html +@anchor{amdgcn-unknown-amdhsa} +@heading amdgcn-unknown-amdhsa +AMD GCN GPU target. + +Instead of GNU Binutils, you will need to install LLVM 6, or later, and copy +@file{bin/llvm-mc} to @file{amdgcn-unknown-amdhsa/bin/as}, +@file{bin/lld} to @file{amdgcn-unknown-amdhsa/bin/ld}, +@file{bin/llvm-nm} to @file{amdgcn-unknown-amdhsa/bin/nm}, and +@file{bin/llvm-ar} to both @file{bin/amdgcn-unknown-amdhsa-ar} and +@file{bin/amdgcn-unknown-amdhsa-ranlib}. + +Use Newlib (2019-01-16, or newer). + +To run the binaries, install the HSA Runtime from the +@uref{https://rocm.github.io,,ROCm Platform}, and use +@file{libexec/gcc/amdhsa-unknown-amdhsa/@var{version}/gcn-run} to launch them +on the GPU. + +@html + +@end html @anchor{arc-x-elf32} @heading arc-*-elf32 diff --git a/gcc/doc/invoke.texi
[SVE ACLE] Implement svlsl_wide
Hi, I committed the attached patch to aarch64/sve-acle-branch that implements svlsl_wide. Thanks, Prathamesh diff --git a/gcc/config/aarch64/aarch64-sve-builtins.c b/gcc/config/aarch64/aarch64-sve-builtins.c index f080a67ef00..0e3db669422 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.c +++ b/gcc/config/aarch64/aarch64-sve-builtins.c @@ -123,7 +123,10 @@ enum function_shape { The final argument must be an integer constant expression in the range [1, _BITS]. */ - SHAPE_shift_right_imm + SHAPE_shift_right_imm, + + /* sv_t svfoo_wide[_t0](sv_t, svuint64_t). */ + SHAPE_binary_wide }; /* Classifies an operation into "modes"; for example, to distinguish @@ -169,6 +172,7 @@ enum function { FUNC_svdup, FUNC_sveor, FUNC_svindex, + FUNC_svlsl_wide, FUNC_svmax, FUNC_svmad, FUNC_svmin, @@ -331,6 +335,7 @@ private: void sig_qq_ (const function_instance &, vec &); void sig_n_ (const function_instance &, vec &); void sig_qq_n_ (const function_instance &, vec &); + void sig_00i (const function_instance &, vec &); void sig_n_00i (const function_instance &, vec &); void apply_predication (const function_instance &, vec &); @@ -371,6 +376,7 @@ public: private: tree resolve_uniform (unsigned int); tree resolve_dot (); + tree resolve_binary_wide (); tree resolve_uniform_imm (unsigned int, unsigned int); bool check_first_vector_argument (unsigned int, unsigned int &, @@ -473,6 +479,7 @@ private: rtx expand_dup (); rtx expand_eor (); rtx expand_index (); + rtx expand_lsl_wide (); rtx expand_max (); rtx expand_min (); rtx expand_mad (unsigned int); @@ -581,6 +588,12 @@ static const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { #define TYPES_all_unsigned(S, D) \ S (u8), S (u16), S (u32), S (u64) +/* _s8 _s16 _s32 + _u8 _u16 _u32. */ +#define TYPES_all_bhsi(S, D) \ + S (s8), S (s16), S (s32), \ + S (u8), S (u16), S (u32) + /* _s8 _s16 _s32 _s64 _u8 _u16 _u32 _u64. 
*/ #define TYPES_all_integer(S, D) \ @@ -625,6 +638,7 @@ DEF_SVE_TYPES_ARRAY (all_pred); DEF_SVE_TYPES_ARRAY (all_unsigned); DEF_SVE_TYPES_ARRAY (all_signed); DEF_SVE_TYPES_ARRAY (all_float); +DEF_SVE_TYPES_ARRAY (all_bhsi); DEF_SVE_TYPES_ARRAY (all_integer); DEF_SVE_TYPES_ARRAY (all_data); DEF_SVE_TYPES_ARRAY (all_sdi_and_float); @@ -891,6 +905,11 @@ arm_sve_h_builder::build (const function_group ) add_overloaded_functions (group, MODE_n); build_all (_sve_h_builder::sig_n_00i, group, MODE_n); break; + +case SHAPE_binary_wide: + add_overloaded_functions (group, MODE_none); + build_all (_sve_h_builder::sig_00i, group, MODE_none); + break; } } @@ -1028,6 +1047,17 @@ arm_sve_h_builder::sig_qq_n_ (const function_instance , types.quick_push (instance.quarter_scalar_type (0)); } +/* Describe the signature "sv_t svfoo[_t0](sv_t, svuint64_t)" + for INSTANCE in TYPES. */ +void +arm_sve_h_builder::sig_00i (const function_instance& instance, + vec ) +{ + for (unsigned i = 0; i < 2; ++i) +types.quick_push (instance.vector_type (0)); + types.quick_push (acle_vector_types[VECTOR_TYPE_svuint64_t]); +} + /* Describe the signature "sv_t svfoo[_n_t0](sv_t, uint64_t)" for INSTANCE in TYPES. */ void @@ -1190,6 +1220,7 @@ arm_sve_h_builder::get_attributes (const function_instance ) case FUNC_svdup: case FUNC_sveor: case FUNC_svindex: +case FUNC_svlsl_wide: case FUNC_svmax: case FUNC_svmad: case FUNC_svmin: @@ -1246,6 +1277,7 @@ arm_sve_h_builder::get_explicit_types (function_shape shape) case SHAPE_ternary_opt_n: case SHAPE_ternary_qq_opt_n: case SHAPE_shift_right_imm: +case SHAPE_binary_wide: return 0; } gcc_unreachable (); @@ -1325,6 +1357,8 @@ function_resolver::resolve () case SHAPE_binary_scalar: case SHAPE_inherent: break; +case SHAPE_binary_wide: + return resolve_binary_wide (); } gcc_unreachable (); } @@ -1385,6 +1419,22 @@ function_resolver::resolve_dot () return require_form (m_rfn.instance.mode, get_type_suffix (type)); } +/* Resolve a function that has SHAPE_binary_wide. 
*/ + +tree +function_resolver::resolve_binary_wide () +{ + unsigned i, nargs; + vector_type type; + + if (!check_first_vector_argument (2, i, nargs, type) + || !require_matching_type (i, type) + || !check_argument (i + 1, VECTOR_TYPE_svuint64_t)) +return error_mark_node; + + return require_form (m_rfn.instance.mode, get_type_suffix (type)); +} + /* Like resolve_uniform, except that the final NIMM arguments have type uint64_t and must be integer constant expressions. */ tree @@ -1653,6 +1703,7 @@ function_checker::check () case SHAPE_binary_scalar: case SHAPE_ternary_opt_n: case SHAPE_ternary_qq_opt_n: +case SHAPE_binary_wide: return true; } gcc_unreachable (); @@ -1842,6 +1893,7 @@ gimple_folder::fold () case
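As a reading aid for the signature comment "sv_t svfoo_wide[_t0](sv_t, svuint64_t)" in the patch above, the following scalar C sketch models what a wide left shift on 8-bit elements is understood to compute. It is an assumption-laden emulation, not the branch's implementation: it fixes the vector length at 128 bits, ignores predication, and assumes that each narrow element is shifted by the 64-bit element that overlaps it, with amounts of 8 or more producing zero. The name lsl_wide_u8 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed vector length for illustration only; real SVE vectors are
   a hardware-chosen multiple of 128 bits.  */
enum { VL_BYTES = 16 };

/* Hypothetical scalar model of a "wide" left shift on 8-bit elements:
   element I of OP1 is shifted by the 64-bit element of OP2 that
   covers the same bytes, i.e. OP2[I / 8].  */
void
lsl_wide_u8 (uint8_t dst[VL_BYTES], const uint8_t op1[VL_BYTES],
	     const uint64_t op2[VL_BYTES / 8])
{
  assert (dst && op1 && op2);
  for (int i = 0; i < VL_BYTES; ++i)
    {
      uint64_t amount = op2[i / 8];
      /* Assumption: the amount is not taken modulo the element size,
	 so shifting by 8 or more clears the element.  */
      dst[i] = amount >= 8 ? 0 : (uint8_t) (op1[i] << amount);
    }
}
```

The restriction of the new TYPES_all_bhsi list to 8, 16 and 32-bit elements fits this model, since a 64-bit element would have nothing wider to be shifted by.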
[C++ Patch] Locations related grokdeclarator tweak
Hi, a tweak to typespec_loc, where the existing conditional turns out to be just a special case of the full min_location that we want in order to do the right thing for testcases like diagnostic/trailing1.C. Tested x86_64-linux. Thanks, Paolo. // /cp 2018-01-18 Paolo Carlini * decl.c (grokdeclarator): Fix value assigned to typespec_loc, use min_location. /testsuite 2018-01-18 Paolo Carlini * g++.dg/diagnostic/trailing1.C: New. Index: cp/decl.c === --- cp/decl.c (revision 268062) +++ cp/decl.c (working copy) @@ -10341,9 +10341,9 @@ grokdeclarator (const cp_declarator *declarator, location_t typespec_loc = smallest_type_quals_location (type_quals, declspecs->locations); + typespec_loc = min_location (typespec_loc, + declspecs->locations[ds_type_spec]); if (typespec_loc == UNKNOWN_LOCATION) -typespec_loc = declspecs->locations[ds_type_spec]; - if (typespec_loc == UNKNOWN_LOCATION) typespec_loc = input_location; /* Look inside a declarator for the name being declared Index: testsuite/g++.dg/diagnostic/trailing1.C === --- testsuite/g++.dg/diagnostic/trailing1.C (nonexistent) +++ testsuite/g++.dg/diagnostic/trailing1.C (working copy) @@ -0,0 +1,5 @@ +// { dg-do compile { target c++11 } } + +int const foo1() -> double; // { dg-error "1:.foo1. function with trailing return type" } +int volatile foo2() -> double; // { dg-error "1:.foo2. function with trailing return type" } +int const volatile foo3() -> double; // { dg-error "1:.foo3. function with trailing return type" }
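The selection logic described above can be sketched outside the compiler as follows. This is a simplified model under stated assumptions: locations are plain integers where a smaller value means "earlier in the source", 0 stands in for UNKNOWN_LOCATION, and min_loc is a stand-in for GCC's min_location (which compares through the line map rather than numerically). All names here are illustrative.

```c
#include <assert.h>

typedef unsigned int loc_t;	/* stand-in for location_t */
#define UNKNOWN_LOC 0u		/* stand-in for UNKNOWN_LOCATION */

/* Return the earlier of two locations, ignoring unknown ones.  */
loc_t
min_loc (loc_t a, loc_t b)
{
  if (a == UNKNOWN_LOC)
    return b;
  if (b == UNKNOWN_LOC)
    return a;
  return b < a ? b : a;
}

/* Model of the new typespec_loc computation: take the earliest of the
   qualifier location and the type-specifier location, and fall back
   to the current input location only if both are unknown.  */
loc_t
pick_typespec_loc (loc_t quals_loc, loc_t type_spec_loc, loc_t input_loc)
{
  loc_t loc = min_loc (quals_loc, type_spec_loc);
  if (loc == UNKNOWN_LOC)
    loc = input_loc;
  assert (loc != UNKNOWN_LOC || input_loc == UNKNOWN_LOC);
  return loc;
}
```

For `int const foo1() -> double;` the type specifier sits at column 1 and `const` at column 5, so the model picks column 1, matching the "1:" prefix in the new test's dg-error directives. The old conditional would have kept the qualifier location whenever it was known.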
Re: V3 [PATCH] c-family: Update unaligned address of packed member check
On Thu, Jan 17, 2019 at 04:00:47PM -0800, H.J. Lu wrote: > gcc/c-family/ > > PR c/51628 > PR c/88664 > * c-common.h (warn_for_address_or_pointer_of_packed_member): > Remove the boolean argument. > * c-warn.c (check_address_of_packed_member): Renamed to ... > (check_address_or_pointer_of_packed_member): This. Also > warn pointer conversion. > (check_and_warn_address_of_packed_member): Renamed to ... > (check_and_warn_address_or_pointer_of_packed_member): This. > Also warn pointer conversion. > (warn_for_address_or_pointer_of_packed_member): Remove the > boolean argument. Don't check pointer conversion here. > > gcc/c > > PR c/51628 > PR c/88664 > * c-typeck.c (convert_for_assignment): Update the > warn_for_address_or_pointer_of_packed_member call. > > gcc/cp > > PR c/51628 > PR c/88664 > * call.c (convert_for_arg_passing): Update the > warn_for_address_or_pointer_of_packed_member call. > * typeck.c (convert_for_assignment): Likewise. > > gcc/testsuite/ > > PR c/51628 > PR c/88664 > * c-c++-common/pr51628-33.c: New test. > * c-c++-common/pr51628-35.c: New test. > * c-c++-common/pr88664-1.c: Likewise. > * c-c++-common/pr88664-2.c: Likewise. > * gcc.dg/pr51628-34.c: Likewise. Ok, thanks. Jakub
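For context on what this warning guards against, here is a small C sketch that is not one of the new tests. It shows a packed struct whose int member lands at offset 1, and a way to read that member without forming an `int *` to it. The struct and function names are invented for illustration; the exact set of expressions the updated check diagnoses is defined by the patch, not by this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* With the packed attribute there is no padding after TAG, so VALUE
   starts at offset 1 and is generally not int-aligned.  */
struct __attribute__ ((packed)) record
{
  char tag;
  int value;
};

/* Writing "int *p = &r->value;" is the kind of code the
   address-of-packed-member check is meant to flag, because
   dereferencing P assumes int alignment.  Copying the bytes out
   avoids creating such a pointer.  */
int
read_value (const struct record *r)
{
  assert (r != NULL);
  int v;
  memcpy (&v, (const char *) r + offsetof (struct record, value),
	  sizeof v);
  return v;
}

/* Expose the member offset so the layout can be checked.  */
size_t
value_offset (void)
{
  return offsetof (struct record, value);
}
```

Direct member access such as `r->value` is also fine, since the compiler knows the member is packed and emits a suitable load.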
Re: [PATCH] Bump version of __gcov_indirect_call_profiler function as there was ABI change.
> Hi. > > Last GCOV patch renames __gcov_indirect_call_profiler_v2 to > __gcov_indirect_call_profiler_v3 > as we changed ABI and one should see a linker error instead of strange > run-time error. > That can happen when somebody mixes objects built with a different version of > compiler. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? OK, thanks!
[PATCH] Bump version of __gcov_indirect_call_profiler function as there was ABI change.
Hi. Last GCOV patch renames __gcov_indirect_call_profiler_v2 to __gcov_indirect_call_profiler_v3 as we changed ABI and one should see a linker error instead of strange run-time error. That can happen when somebody mixes objects built with a different version of compiler. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin gcc/ChangeLog: 2019-01-18 Martin Liska * params.def: Fix comment. * tree-profile.c (gimple_init_gcov_profiler): Bump function name. (gimple_gen_ic_func_profiler): Likewise. gcc/testsuite/ChangeLog: 2019-01-18 Martin Liska * gcc.dg/no_profile_instrument_function-attr-1.c: Update expected function name. libgcc/ChangeLog: 2019-01-18 Martin Liska * libgcov-profiler.c (__gcov_indirect_call_profiler_v2): Rename to ... (__gcov_indirect_call_profiler_v3): ... this. * libgcov.h (__gcov_indirect_call_profiler_v2): Likewise. (__gcov_indirect_call_profiler_v3): Likewise. * Makefile.in: Bump function name. --- gcc/params.def | 2 +- .../gcc.dg/no_profile_instrument_function-attr-1.c | 2 +- gcc/tree-profile.c | 6 +++--- libgcc/Makefile.in | 2 +- libgcc/libgcov-profiler.c | 4 ++-- libgcc/libgcov.h| 2 +- 6 files changed, 9 insertions(+), 9 deletions(-) diff --git a/gcc/params.def b/gcc/params.def index 1a2af2c80bb..e5553af63c4 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -995,7 +995,7 @@ DEFPARAM (PARAM_PROFILE_FUNC_INTERNAL_ID, /* When the parameter is 1, track the most frequent N target addresses in indirect-call profile. This disables - indirect_call_profiler_v2 which tracks single target. */ + indirect_call_profiler_v3 which tracks single target. 
*/ DEFPARAM (PARAM_INDIR_CALL_TOPN_PROFILE, "indir-call-topn-profile", "Track top N target addresses in indirect-call profile.", diff --git a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c index 0f04fb1eedc..41d745532fa 100644 --- a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c +++ b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c @@ -19,6 +19,6 @@ int main () } /* { dg-final { scan-tree-dump-times "__gcov0\\.main.* = PROF_edge_counter" 1 "optimized"} } */ -/* { dg-final { scan-tree-dump-times "__gcov_indirect_call_profiler_v2" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "__gcov_indirect_call_profiler_v3" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_time_profiler_counter = " 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_init" 1 "optimized" } } */ diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c index 5860e7c3f19..1c3034aac10 100644 --- a/gcc/tree-profile.c +++ b/gcc/tree-profile.c @@ -186,7 +186,7 @@ gimple_init_gcov_profiler (void) gcov_type_node, ptr_type_node, NULL_TREE); - profiler_fn_name = "__gcov_indirect_call_profiler_v2"; + profiler_fn_name = "__gcov_indirect_call_profiler_v3"; if (PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE)) profiler_fn_name = "__gcov_indirect_call_topn_profiler"; @@ -459,9 +459,9 @@ gimple_gen_ic_func_profiler (void) /* Insert code: if (__gcov_indirect_call_callee != NULL) - __gcov_indirect_call_profiler_v2 (profile_id, _function_decl); + __gcov_indirect_call_profiler_v3 (profile_id, _function_decl); - The function __gcov_indirect_call_profiler_v2 is responsible for + The function __gcov_indirect_call_profiler_v3 is responsible for resetting __gcov_indirect_call_callee to NULL. 
*/ gimple_stmt_iterator gsi = gsi_start_bb (cond_bb); diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in index 8b0f0cf042b..ea390a5bbea 100644 --- a/libgcc/Makefile.in +++ b/libgcc/Makefile.in @@ -899,7 +899,7 @@ LIBGCOV_PROFILER = _gcov_interval_profiler\ _gcov_average_profiler_atomic \ _gcov_ior_profiler \ _gcov_ior_profiler_atomic \ - _gcov_indirect_call_profiler_v2 \ + _gcov_indirect_call_profiler_v3 \ _gcov_time_profiler \ _gcov_indirect_call_topn_profiler LIBGCOV_INTERFACE = _gcov_dump _gcov_flush _gcov_fork \ diff --git a/libgcc/libgcov-profiler.c b/libgcc/libgcov-profiler.c index 4cacf894174..7116330252b 100644 --- a/libgcc/libgcov-profiler.c +++ b/libgcc/libgcov-profiler.c @@ -296,7 +296,7 @@ __gcov_indirect_call_topn_profiler (gcov_type value, void* cur_func) } #endif -#ifdef L_gcov_indirect_call_profiler_v2 +#ifdef L_gcov_indirect_call_profiler_v3 /* These two variables are used to actually track caller and callee. Keep them in TLS memory so races are not common (they are written to often). @@ -318,7 +318,7 @@
[PATCH] Update error message prefix in libgcov profiling.
Hi. The patch is about more explicit error message in libgcov. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. It's approved by Honza. Thanks, Martin libgcc/ChangeLog: 2019-01-17 Martin Liska * libgcov-driver.c (GCOV_PROF_PREFIX): Define. (gcov_version): Use in gcov_error. (merge_one_data): Likewise. (dump_one_gcov): Likewise. --- libgcc/libgcov-driver.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c index 41d28ace926..5dc51df914f 100644 --- a/libgcc/libgcov-driver.c +++ b/libgcc/libgcov-driver.c @@ -53,6 +53,8 @@ static void gcov_error_exit (void); #include "gcov-io.c" +#define GCOV_PROF_PREFIX "libgcov profiling error:%s:" + struct gcov_fn_buffer { struct gcov_fn_buffer *next; @@ -151,7 +153,7 @@ buffer_fn_data (const char *filename, const struct gcov_info *gi_ptr, return _buffer->next; fail: - gcov_error ("profiling:%s:Function %u %s %u \n", filename, fn_ix, + gcov_error (GCOV_PROF_PREFIX "Function %u %s %u \n", filename, fn_ix, len ? "cannot allocate" : "counter mismatch", len ? len : ix); return (struct gcov_fn_buffer **)free_fn_data (gi_ptr, fn_buffer, ix); @@ -195,7 +197,7 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned_t version, GCOV_UNSIGNED2STRING (v, version); GCOV_UNSIGNED2STRING (e, GCOV_VERSION); - gcov_error ("profiling:%s:Version mismatch - expected %s (%.4s) " + gcov_error (GCOV_PROF_PREFIX "Version mismatch - expected %s (%.4s) " "got %s (%.4s)\n", filename? filename : ptr->filename, gcov_version_string (expected_string, e), e, @@ -234,7 +236,7 @@ merge_one_data (const char *filename, if (length != gi_ptr->stamp) { /* Read from a different compilation. Overwrite the file. 
*/ - gcov_error ("profiling:%s:overwriting an existing profile data " + gcov_error (GCOV_PROF_PREFIX "overwriting an existing profile data " "with a different timestamp\n", filename); return 0; } @@ -314,7 +316,7 @@ merge_one_data (const char *filename, if (tag) { read_mismatch:; - gcov_error ("profiling:%s:Merge mismatch for %s %u\n", + gcov_error (GCOV_PROF_PREFIX "Merge mismatch for %s %u\n", filename, f_ix >= 0 ? "function" : "summary", f_ix < 0 ? -1 - f_ix : f_ix); return -1; @@ -322,7 +324,7 @@ merge_one_data (const char *filename, return 0; read_error: - gcov_error ("profiling:%s:%s merging\n", filename, + gcov_error (GCOV_PROF_PREFIX "%s merging\n", filename, error < 0 ? "Overflow": "Error"); return -1; } @@ -520,7 +522,8 @@ dump_one_gcov (struct gcov_info *gi_ptr, struct gcov_filename *gf, /* Merge data from file. */ if (tag != GCOV_DATA_MAGIC) { - gcov_error ("profiling:%s:Not a gcov data file\n", gf->filename); + gcov_error (GCOV_PROF_PREFIX "Not a gcov data file\n", + gf->filename); goto read_fatal; } error = merge_one_data (gf->filename, gi_ptr, ); @@ -541,8 +544,8 @@ read_fatal:; if ((error = gcov_close ())) gcov_error (error < 0 ? -"profiling:%s:Overflow writing\n" : -"profiling:%s:Error writing\n", + GCOV_PROF_PREFIX "Overflow writing\n" : + GCOV_PROF_PREFIX "Error writing\n", gf->filename); }
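The prefix macro works because adjacent string literals are concatenated at compile time, so the "%s" for the file name travels with the prefix and every call site keeps passing the file name as its first variadic argument. A minimal standalone sketch of the same pattern follows; PROF_PREFIX mirrors the macro in the patch, while format_merge_mismatch is a hypothetical helper that writes into a buffer instead of calling gcov_error.

```c
#include <stdio.h>

/* Same shape as GCOV_PROF_PREFIX in the patch: the prefix already
   contains the "%s" that consumes the file name argument.  */
#define PROF_PREFIX "libgcov profiling error:%s:"

/* Hypothetical helper for illustration only: format one message into
   BUF (of SIZE bytes) rather than printing it.  Returns what snprintf
   returns.  */
static int
format_merge_mismatch (char *buf, size_t size, const char *filename,
                       unsigned f_ix)
{
  /* PROF_PREFIX and the literal after it are merged into a single
     format string by the compiler.  */
  return snprintf (buf, size, PROF_PREFIX "Merge mismatch for %s %u\n",
                   filename, "function", f_ix);
}
```

Called with "/tmp/main.gcda" and 3, the buffer holds "libgcov profiling error:/tmp/main.gcda:Merge mismatch for function 3" followed by a newline.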
[PATCH] Describe better version mismatch in libgcov driver.
Hi. The patch is about better explanation of version mismatch in libgcov driver. Now we'll print: profiling:/tmp/main.gcda:Version mismatch - expected 9.0 (experimental) (A90e) got 8.2 (release) (A82*) Patch can bootstrap on x86_64-linux-gnu and survives regression tests. It's approved by Honza. Thanks, Martin libgcc/ChangeLog: 2019-01-17 Martin Liska * libgcov-driver.c (gcov_version_string): New function. (gcov_version): Convert version integer into string. --- libgcc/libgcov-driver.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c index 44f43e6e9af..41d28ace926 100644 --- a/libgcc/libgcov-driver.c +++ b/libgcc/libgcov-driver.c @@ -157,6 +157,27 @@ fail: return (struct gcov_fn_buffer **)free_fn_data (gi_ptr, fn_buffer, ix); } +/* Convert VERSION into a string description and return the it. + BUFFER is used for storage of the string. The code should be + aligned wit gcov-iov.c. */ + +static char * +gcov_version_string (char *buffer, char version[4]) +{ + if (version[0] < 'A' || version[0] > 'Z' + || version[1] < '0' || version[1] > '9' + || version[2] < '0' || version[2] > '9') +sprintf (buffer, "(unknown)"); + else +{ + unsigned major = 10 * (version[0] - 'A') + (version[1] - '0'); + unsigned minor = version[2] - '0'; + sprintf (buffer, "%u.%u (%s)", major, minor, + version[3] == '*' ? "release" : "experimental"); +} + return buffer; +} + /* Check if VERSION of the info block PTR matches libgcov one. Return 1 on success, or zero in case of versions mismatch. If FILENAME is not NULL, its value used for reporting purposes @@ -169,12 +190,16 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned_t version, if (version != GCOV_VERSION) { char v[4], e[4]; + char version_string[128], expected_string[128]; GCOV_UNSIGNED2STRING (v, version); GCOV_UNSIGNED2STRING (e, GCOV_VERSION); - gcov_error ("profiling:%s:Version mismatch - expected %.4s got %.4s\n", - filename? 
filename : ptr->filename, e, v); + gcov_error ("profiling:%s:Version mismatch - expected %s (%.4s) " + "got %s (%.4s)\n", + filename? filename : ptr->filename, + gcov_version_string (expected_string, e), e, + gcov_version_string (version_string, v), v); return 0; } return 1;
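To try the decoding outside libgcov, here is a standalone sketch of the same conversion. It follows the logic of gcov_version_string from the patch; the name version_string and the added size parameter (so that snprintf can be used instead of sprintf) are my own changes for the example and are not part of the patch.

```c
#include <stdio.h>

/* Decode a 4-character gcov version tag such as "A90e" or "A82*" into
   a readable string stored in BUFFER (of SIZE bytes).  The first
   character encodes the tens of the major version ('A' is 0), the
   second the units, the third the minor version, and the fourth is
   '*' for a release.  Returns BUFFER.  */
static char *
version_string (char *buffer, size_t size, const char version[4])
{
  if (version[0] < 'A' || version[0] > 'Z'
      || version[1] < '0' || version[1] > '9'
      || version[2] < '0' || version[2] > '9')
    snprintf (buffer, size, "(unknown)");
  else
    {
      unsigned major = 10 * (version[0] - 'A') + (version[1] - '0');
      unsigned minor = version[2] - '0';
      snprintf (buffer, size, "%u.%u (%s)", major, minor,
                version[3] == '*' ? "release" : "experimental");
    }
  return buffer;
}
```

With the two tags from the message above, "A90e" decodes to "9.0 (experimental)" and "A82*" to "8.2 (release)".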
Re: [PATCH] [RFC] PR target/52813 and target/11807
Christophe Lyon writes: > On Fri, 11 Jan 2019 at 23:59, Jeff Law wrote: >> >> On 1/8/19 5:03 AM, Richard Sandiford wrote: >> > Bernd Edlinger writes: >> >> On 1/7/19 10:23 AM, Jakub Jelinek wrote: >> >>> On Sun, Dec 16, 2018 at 06:13:57PM +0200, Dimitar Dimitrov wrote: >> - /* Clobbering the STACK POINTER register is an error. */ >> + /* Clobbered STACK POINTER register is not saved/restored by GCC, >> + which is often unexpected by users. See PR52813. */ >> if (overlaps_hard_reg_set_p (regset, Pmode, STACK_POINTER_REGNUM)) >> { >> - error ("Stack Pointer register clobbered by %qs in %", >> regname); >> + warning (0, "Stack Pointer register clobbered by %qs in %", >> + regname); >> + warning (0, "GCC has always ignored Stack Pointer % >> clobbers"); >> >>> >> >>> Why do we write Stack Pointer rather than stack pointer? That is really >> >>> weird. The second warning would be a note based on the first one, i.e. >> >>> if (warning ()) note (); >> >>> and better have some -W* option to silence the warning. >> >>> >> >> >> >> Yes, thanks for this suggestion. >> >> >> >> Meanwhile I found out, that the stack clobber has only been ignored up to >> >> gcc-5 (at least with lra targets, not really sure about reload targets). >> >> From gcc-6 on, with the exception of PR arm/77904 which was a regression >> >> due >> >> to the underlying lra change, but fixed later, and back-ported to >> >> gcc-6.3.0, >> >> this works for all targets I tried so far. >> >> >> >> To me, it starts to look like a rather unique and useful feature, that I >> >> would >> >> like to keep working. >> > >> > Not sure what you mean by "unique". But forcing a frame is a bit of >> > a slippery concept. Force it where? For the asm only, or the whole >> > function? This depends on optimisation and hasn't been consistent >> > across GCC versions, since it depends on the shrink-wrapping >> > optimisation. 
(There was a similar controversy a while ago about >> > to what extent -fno-omit-frame-pointer should "force a frame".) >> > >> > The effect on the redzone seems like something that should be specified >> > explicitly rather than as an (accidental?) side effect of listing the >> > sp in the clobber list. Maybe this would be another use for the "asm >> > attributes" proposal. "noreturn" was another attribute suggested on >> > IRC yesterday. >> > >> > But either way, the general feeling seems to be that going straight to a >> > hard error is too harsh, since there's quite a bit of existing code that >> > has the clobber. This patch implements the compromise discussed on IRC >> > yesterday of making it a -Wdeprecated warning instead. >> > >> > Tested on x86_64-linux-gnu and aarch64-linux-gnu. OK to install? >> > >> > Richard >> > >> > Dimitar: sorry the run-around on this patch, and thanks for the >> > submission. It looks from all the controversy like it was a >> > long-festering PR for a reason. :-/ >> > >> > >> > 2019-01-07 Richard Sandiford >> > >> > gcc/ >> > PR inline-asm/52813 >> > * doc/extend.texi: Document that listing the stack pointer in the >> > clobber list of an asm is a deprecated feature. >> > * common.opt (Wdeprecated): Moved from c-family/c.opt. >> > * cfgexpand.c (asm_clobber_reg_is_valid): Issue a -Wdeprecated >> > warning instead of an error for clobbers of the stack pointer. >> > Add a note explaining why. >> > >> > gcc/c-family/ >> > PR inline-asm/52813 >> > * c.opt (Wdeprecated): Move documentation and variable to common.opt. >> > >> > gcc/d/ >> > PR inline-asm/52813 >> > * lang.opt (Wdeprecated): Reference common.opt instead of c.opt. >> > >> > gcc/testsuite/ >> > PR inline-asm/52813 >> > * gcc.target/i386/pr52813.c (test1): Turn the diagnostic into a >> > -Wdeprecated warning and expect a following note:. >> OK. >> >> FWIW the number of packages affected in Fedora was in single digits, >> some of which have already been fixed. 
>> >> But if folks want to go with a deprecated warning instead of straight to >> an error, I won't complain. >> >> jeff > > > Hi, > > I originally complained because the arm test for pr77904.c was failing. > Since Richard's change that test emits a warning rather than an error, > but still fails. This small patch adds the missing dg-warning. > > OK? > > Thanks, > > Christophe > > 2019-01-17 Christophe Lyon > > * gcc.target/arm/pr77904.c: Add dg-warning for sp clobber. OK, thanks. Richard
Re: [PATCH] avoid issuing -Warray-bounds during folding (PR 88800)
Hi Martin, On Thu, 17 Jan 2019 at 02:51, Martin Sebor wrote: > > On 1/16/19 6:14 PM, Jeff Law wrote: > > On 1/15/19 8:21 AM, Martin Sebor wrote: > >> On 1/15/19 4:07 AM, Richard Biener wrote: > >>> On Tue, Jan 15, 2019 at 1:08 AM Martin Sebor wrote: > > The gimple_fold_builtin_memory_op() function folds calls to memcpy > and similar to MEM_REF when the size of the copy is a small power > of 2, but it does so without considering whether the copy might > write (or read) past the end of one of the objects. To detect > these kinds of errors (and help distinguish them from -Westrict) > the folder calls into the wrestrict pass and lets it diagnose them. > Unfortunately, that can lead to false positives for even some fairly > straightforward code that is ultimately found to be unreachable. > PR 88800 is a report of one such problem. > > To avoid these false positives the attached patch adjusts > the function to avoid issuing -Warray-bounds for out-of-bounds > calls to memcpy et al. Instead, the patch disables the folding > of such invalid calls (and only those). Those that are not > eliminated during DCE or other subsequent passes are eventually > diagnosed by the wrestrict pass. > > Since this change required removing the dependency of the detection > on the warning options (originally done as a micro-optimization to > avoid spending compile-time cycles on something that wasn't needed) > the patch also adds tests to verify that code generation is not > affected as a result of warnings being enabled or disabled. With > the patch as is, the invalid memcpy calls end up emitted (currently > they are folded into equally invalid MEM_REFs). At some point, > I'd like us to consider whether they should be replaced with traps > (possibly under the control of as has been proposed a number of > times in the past. If/when that's done, these tests will need to > be adjusted to look for traps instead. > > Tested on x86_64-linux. 
> >>> > >>> I've said in the past that I feel delaying of folding is wrong. > >>> > >>> To understand, the PR is about emitting a warning for out-of-bound > >>> accesses in a dead code region? > >> > >> Yes. I am keeping in my mind your preference of not delaying > >> the folding of valid code. > >> > >>> > >>> If we think delaying/disablign the folding is the way to go the > >>> patch looks OK. > >> > >> I do, at least for now. I'm taking this as your approval to commit > >> the patch (please let me know if you didn't mean it that way). > > Note we are in stage4, so we're supposed to be addressing regression > > bugfixes and documentation issues. > > > > So I think Richi needs to be explicit about whether or not he wants > > this in gcc-9 or if it should defer to gcc-10. > > > > I have no technical objections to the patch and would easily ack it in > > stage1 or stage3. > > The warning is a regression introduced in GCC 8. I was just about > to commit the fix so please let me know if I should hold off until > stage 1. > After your commit (r268037), I'm seeing excess errors on some arm targets: FAIL: c-c++-common/Wrestrict.c -Wc++-compat (test for excess errors) Excess errors: /gcc/testsuite/c-c++-common/Wrestrict.c:195:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict] /gcc/testsuite/c-c++-common/Wrestrict.c:202:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict] /gcc/testsuite/c-c++-common/Wrestrict.c:207:3: warning: 'memcpy' accessing 4 bytes at offsets [2, 3] and 0 overlaps between 1 and 2 bytes at offset [2, 3] [-Wrestrict] This is not true for all arm toolchains, so for instance if you want to reproduce it, you can build for target arm-eabi and keep default cpu/fpu/mode. Or force -march=armv5t when running the test. 
To give you an idea, you can look at http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/268043/report-build-info.html where the red cells correspond to the regressions; you can deduce the configure flags from them. Christophe > Martin
Re: [PATCH] Improve stack variable reuse with inlining with exceptions (PR tree-optimization/86214)
On Fri, 18 Jan 2019, Jakub Jelinek wrote:

> On Thu, Jan 17, 2019 at 03:43:05PM +0100, Jakub Jelinek wrote:
> > > So we do not care to optimize this to only clobber the vars that
> > > are appear live over the EH edge?
> >
> > Wouldn't that be quite expensive (especially at that spot in the inliner)?
> > I could surely defer that (at the spot in copy_edges_for_bb just remember
> > which bb was it and handle it before destroying the decl_map in
> > expand_call_inline, but:
> > 1) are the extra CLOBBERs that big a deal?

No idea.  Can you see how many we add (and, for comparison, how many
were there before) when, say, building libstdc++?  (Bootstrap has
-fno-exceptions, so that's not a good test; maybe some other (smaller)
C++ application?)

> > 2) if they are, shouldn't we have a pass that does it generically after IPA
> >    and removes all CLOBBERs for vars already known dead, whether they come
> >    from inlining or whatever else (we have many in the IL already that are
> >    dead for other reasons, e.g. a variable where the destructor is inlined
> >    and has one CLOBBER in itself, but have another CLOBBER in the caller too
> >    when the var goes out of scope); tree DSE is able to remove CLOBBERs for
> >    the same var if they are adjacent; in either case, what would be the
> >    algorithm for that?  Something like add_scope_conflicts algorithm, just
> >    not build any conflicts, but just propagate the var is live bitmaps
> >    and when the propagation terminates, go through all bbs and if it sees
> >    a clobber on a var that isn't live at that point, remove the clobber?
>
> That said, I think it would be doable also in the inliner if you prefer to
> do it there for now, and for GCC10 other passes could use that for other
> purposes.
>
> I'd do it in copy_cfg_body before we destroy the ->aux fields mapping src_fn
> bbs to dst_fn blocks, pre_and_rev_post_order_compute_fn (id->src_cfun, ...)
> + have a bitmap previously computed of interesting decls to track liveness
> of (for later uses if the bitmap is NULL it could track all VAR_DECLs that
> need to live in memory) and do the liveness propagation similarly to what
> add_scope_conflicts does on the src_cfun bbs, then in the end just look at
> srcs of EDGE_EH edges that are >= last and see what vars are live at the end
> of those bbs from the bitmaps.

I wonder if, instead of tracking interesting vars, we can compute local
live-in/out during actual BB copying and then just iterate the global
problem for the part going into the interesting EH edges?

OTOH doing all this smells like a source of quadraticness, so I hope
it won't be necessary, or we can instead use some heuristics to prune
the set of vars to add clobbers for (can we somehow use BLOCK_VARs
of the throws?) if the numbers say it looks necessary.

But yes, adding them all during inlining and then having a way to
prune them later would be another option.  I don't want to be
too clever for GCC 9 since we're quite late, so another option is
to limit the number of clobbers generated (just generate them
for the top N of vars sorted by size (prioritizing variable-sized
ones)?).

Richard.
Re: [PATCH] Fix arm_neon.h #pragma GCC target syntax (PR target/88734)
Hi Jakub,

On 17/01/19 13:47, Jakub Jelinek wrote:

Hi!

arm_neon.h on both targets contained a couple of spots with invalid #pragma GCC target syntax. This doesn't result in errors, just warnings, and those warnings are suppressed in system headers, so they are visible with -Wsystem-headers only. Anyway, the end result was that these pragmas were ignored when they were meant to be there. The following patch fixes it.

Also, on aarch64 the sha3 intrinsics were wrapped with arch=armv8.2-a+crypto rather than arch=armv8.2-a+sha3, but because of the invalid syntax it wasn't covered in the testsuite.

Without the patch, besides -Wsystem-headers warnings on it, if somebody attempts to use those intrinsics in code compiled with target options that do not include the necessary ISA features, one will get ICEs rather than errors.

Bootstrapped/regtested on aarch64-linux, ok for trunk?

Note, I haven't included a testcase, as I'm not familiar enough with gcc.target/aarch64/ test style, but a test would roughly include the testcase from the PR, compile it with -march=something that doesn't include the needed ISA options, probably have a dg-skip-if in case somebody overrides it from the --target_board, and make sure it emits a dg-error message rather than an ICE.

The arm parts are ok.
Thanks,
Kyrill

2019-01-17 Jakub Jelinek

    PR target/88734
    * config/arm/arm_neon.h: Fix #pragma GCC target syntax - replace
    (("..."))) with ("...").
    * config/aarch64/arm_neon.h: Likewise.  Use arch=armv8.2-a+sha3
    instead of arch=armv8.2-a+crypto for vsha512hq_u64 etc. intrinsics.

--- gcc/config/arm/arm_neon.h.jj 2019-01-10 11:43:20.100283845 +0100
+++ gcc/config/arm/arm_neon.h 2019-01-16 17:28:32.830228005 +0100
@@ -18310,12 +18310,12 @@ vfmlsl_laneq_high_u32 (float32x2_t __r,
 /* AdvSIMD Complex numbers intrinsics. 
*/ #if __ARM_ARCH >= 8 #pragma GCC push_options -#pragma GCC target(("arch=armv8.3-a")) +#pragma GCC target ("arch=armv8.3-a") #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE) #pragma GCC push_options -#pragma GCC target(("+fp16")) +#pragma GCC target ("+fp16") __extension__ extern __inline float16x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vcadd_rot90_f16 (float16x4_t __a, float16x4_t __b) --- gcc/config/aarch64/arm_neon.h.jj2019-01-10 11:43:18.620308158 +0100 +++ gcc/config/aarch64/arm_neon.h 2019-01-16 17:27:30.170252504 +0100 @@ -33070,7 +33070,7 @@ vdotq_laneq_s32 (int32x4_t __r, int8x16_ #pragma GCC pop_options #pragma GCC push_options -#pragma GCC target(("arch=armv8.2-a+sm4")) +#pragma GCC target ("arch=armv8.2-a+sm4") __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) @@ -33137,7 +33137,7 @@ vsm4ekeyq_u32 (uint32x4_t __a, uint32x4_ #pragma GCC pop_options #pragma GCC push_options -#pragma GCC target(("arch=armv8.2-a+crypto")) +#pragma GCC target ("arch=armv8.2-a+sha3") __extension__ extern __inline uint64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) @@ -33299,10 +33299,10 @@ vbcaxq_s64 (int64x2_t __a, int64x2_t __b /* AdvSIMD Complex numbers intrinsics. 
*/ #pragma GCC push_options -#pragma GCC target(("arch=armv8.3-a")) +#pragma GCC target ("arch=armv8.3-a") #pragma GCC push_options -#pragma GCC target(("+fp16")) +#pragma GCC target ("+fp16") __extension__ extern __inline float16x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vcadd_rot90_f16 (float16x4_t __a, float16x4_t __b) @@ -33773,7 +33773,7 @@ vcmlaq_rot270_laneq_f32 (float32x4_t __r #pragma GCC pop_options #pragma GCC push_options -#pragma GCC target(("arch=armv8.2-a+fp16fml")) +#pragma GCC target ("arch=armv8.2-a+fp16fml") __extension__ extern __inline float32x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) Jakub
Re: [PATCH] Improve stack variable reuse with inlining with exceptions (PR tree-optimization/86214)
On Thu, Jan 17, 2019 at 03:43:05PM +0100, Jakub Jelinek wrote:
> > So we do not care to optimize this to only clobber the vars that
> > are appear live over the EH edge?
>
> Wouldn't that be quite expensive (especially at that spot in the inliner)?
> I could surely defer that (at the spot in copy_edges_for_bb just remember
> which bb was it and handle it before destroying the decl_map in
> expand_call_inline, but:
> 1) are the extra CLOBBERs that big a deal?
> 2) if they are, shouldn't we have a pass that does it generically after IPA
>    and removes all CLOBBERs for vars already known dead, whether they come
>    from inlining or whatever else (we have many in the IL already that are
>    dead for other reasons, e.g. a variable where the destructor is inlined
>    and has one CLOBBER in itself, but have another CLOBBER in the caller too
>    when the var goes out of scope); tree DSE is able to remove CLOBBERs for
>    the same var if they are adjacent; in either case, what would be the
>    algorithm for that?  Something like add_scope_conflicts algorithm, just
>    not build any conflicts, but just propagate the var is live bitmaps
>    and when the propagation terminates, go through all bbs and if it sees
>    a clobber on a var that isn't live at that point, remove the clobber?

That said, I think it would be doable also in the inliner if you prefer to
do it there for now, and for GCC10 other passes could use that for other
purposes.

I'd do it in copy_cfg_body before we destroy the ->aux fields mapping src_fn
bbs to dst_fn blocks, pre_and_rev_post_order_compute_fn (id->src_cfun, ...)
+ have a bitmap previously computed of interesting decls to track liveness
of (for later uses if the bitmap is NULL it could track all VAR_DECLs that
need to live in memory) and do the liveness propagation similarly to what
add_scope_conflicts does on the src_cfun bbs, then in the end just look at
srcs of EDGE_EH edges that are >= last and see what vars are live at the end
of those bbs from the bitmaps.

    Jakub
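The generic pruning idea in point 2) can be illustrated on a toy CFG. This is only a sketch of how I read that description, not code from any patch: every type and function name here (struct stmt, struct block, walk_block, live_in, prune_dead_clobbers) is hypothetical, variables are bits in an unsigned mask, and "live" is taken in the add_scope_conflicts sense, i.e. a variable becomes live when it is mentioned and dead again when it is clobbered.

```c
#define MAX_STMTS 4
#define MAX_PREDS 2

enum stmt_kind { STMT_MENTION, STMT_CLOBBER };

struct stmt
{
  enum stmt_kind kind;
  unsigned var;        /* index of the variable, used as a bit number */
  int removed;         /* set to 1 when a clobber is found redundant */
};

struct block
{
  int preds[MAX_PREDS];
  int n_preds;
  struct stmt stmts[MAX_STMTS];
  int n_stmts;
  unsigned live_out;   /* variables live at the end of the block */
};

/* Walk BB starting from the live set LIVE and return the live set at
   its end.  When PRUNE is nonzero, flag clobbers of variables that
   are not live at that point.  */
static unsigned
walk_block (struct block *bb, unsigned live, int prune)
{
  for (int i = 0; i < bb->n_stmts; i++)
    {
      struct stmt *s = &bb->stmts[i];
      unsigned bit = 1u << s->var;
      if (s->kind == STMT_MENTION)
        live |= bit;
      else
        {
          if (prune && !(live & bit))
            s->removed = 1;
          live &= ~bit;
        }
    }
  return live;
}

/* Live-in of BB is the union of the live-out sets of its predecessors.  */
static unsigned
live_in (const struct block *bbs, const struct block *bb)
{
  unsigned live = 0;
  for (int i = 0; i < bb->n_preds; i++)
    live |= bbs[bb->preds[i]].live_out;
  return live;
}

/* Propagate liveness to a fixed point, then flag redundant clobbers.
   Returns the number of clobbers flagged as removed.  */
static int
prune_dead_clobbers (struct block *bbs, int n)
{
  int changed = 1;
  for (int i = 0; i < n; i++)
    bbs[i].live_out = 0;
  while (changed)
    {
      changed = 0;
      for (int i = 0; i < n; i++)
        {
          unsigned out = walk_block (&bbs[i], live_in (bbs, &bbs[i]), 0);
          if (out != bbs[i].live_out)
            {
              bbs[i].live_out = out;
              changed = 1;
            }
        }
    }
  int removed = 0;
  for (int i = 0; i < n; i++)
    {
      walk_block (&bbs[i], live_in (bbs, &bbs[i]), 1);
      for (int j = 0; j < bbs[i].n_stmts; j++)
        removed += bbs[i].stmts[j].removed;
    }
  return removed;
}
```

The propagation only ever adds bits to live_out, so the loop terminates. A real pass would of course work on the GIMPLE IL with bitmaps and a proper block ordering; whether the cost is acceptable is exactly the open question in this thread.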
Re: Fortran vector math header
On Fri, Jan 18, 2019 at 09:18:33AM +0100, Martin Liška wrote:
> What about something as simple as this:
>
> diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
> index 3314e176881..2f2b965f97d 100644
> --- a/gcc/fortran/decl.c
> +++ b/gcc/fortran/decl.c
> @@ -11361,6 +11361,11 @@ gfc_match_gcc_builtin (void)
>    else if (gfc_match (" ( inbranch ) ") == MATCH_YES)
>      clause = SIMD_INBRANCH;
>
> +  /* Filter builtins defined only for 64-bit compilation mode.  */
> +  if (gfc_match (" ( 64bit ) ") == MATCH_YES
> +      && tree_to_uhwi (TYPE_SIZE_UNIT (long_integer_type_node)) != 64)
> +    return MATCH_YES;
> +
>    if (gfc_vectorized_builtins == NULL)
>      gfc_vectorized_builtins = new hash_map ();
>
> That would allow to write:
> !GCC$ builtin (cos) attributes simd (notinbranch) (64bit)

That feels too hacky to me.
We could have
!GCC$ builtin (cos) attributes simd (notinbranch) if('x86_64-linux-gnu')
or similar if we can agree on, and somehow get, canonical names of the
multilib targets based on options, or just if('lp64'), if('ilp32'), or
whatever other identifiers.

I'm afraid we have no way to propagate the multiarch-style strings to
f951, even on multiarch targets; if I understand it right, they are
present there just in the form of substrings in the multi os directories.

For some other strings, we'd need to come up with something that
generates the strings for us, e.g. have something for Fortran similar to
what config/*/*-d.c does for D, and then we could use just x86_64, x32
and x86 or whatever else we choose (I guess the OS isn't that important,
different OSes would have different headers).

Even x86_64 vs. x32 vs. x86 shows that it isn't possible to differentiate
multilibs just based on sizes (kinds) of C types, and even querying those
is complicated because one needs to use the
use iso_c_binding, only: c_ptr
etc. to get those into the scope, which isn't something we want in these
headers.

In any case, glibc would need to agree with gfortran on these identifiers.

    Jakub
Re: Fortran vector math header
On 1/16/19 9:35 PM, Joseph Myers wrote:
> On Wed, 16 Jan 2019, Jakub Jelinek wrote:
>
>> Perhaps easier would be to add optional if clause to the !GCC$ builtin
>> with constant expression argument which if present and evaluates to .false.
>> would tell us to ignore the attribute.  Or, add !GCC$ if/else/end if which
>> would act like preprocessing conditionals or something similar.
>> Not really sure one can query in Fortran what the multilib is some way (say
>> look at size of a pointer etc.).
>
> If something like that is done, I'd suggest doing it in a form which
> allows each multilib's information about glibc functions to go in a
> separate generated header (so having !GCC$ include or similar to include a
> per-multilib file, under appropriate conditionals).  Otherwise you need to
> bring back logic in glibc to make a compiler building glibc for one
> multilib use appropriate -D and -U options to get its C headers to define
> things appropriately for another multilib, so that the all-multilib
> Fortran header can be generated in a single glibc build.  (Like the old
> logic for generating bits/syscall.h that was removed in commits
> 2dba5ce7b8115d6a2789bf279892263621088e74 and
> ee17d4e99af9e49378217209d3708053ef148032.)
>

Hi.

What about something as simple as this:

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 3314e176881..2f2b965f97d 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -11361,6 +11361,11 @@ gfc_match_gcc_builtin (void)
   else if (gfc_match (" ( inbranch ) ") == MATCH_YES)
     clause = SIMD_INBRANCH;

+  /* Filter builtins defined only for 64-bit compilation mode.  */
+  if (gfc_match (" ( 64bit ) ") == MATCH_YES
+      && tree_to_uhwi (TYPE_SIZE_UNIT (long_integer_type_node)) != 64)
+    return MATCH_YES;
+
   if (gfc_vectorized_builtins == NULL)
     gfc_vectorized_builtins = new hash_map ();

That would allow to write:
!GCC$ builtin (cos) attributes simd (notinbranch) (64bit)

Thoughts?
Thanks,
Martin