[PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
This is an attempt to fix PR target/80968.

This bug has existed basically forever.

The stack_tie sequence seems to be how other targets deal with this
issue.

I only emit this when alloca is used.  If there are other conditions
that could necessitate such a barrier, just let me know.

sparc: Fix stack references in return delay slot.

gcc/

    * config/sparc/sparc.md (UNSPEC_TIE): New unspec.
    (stack_tie): New pattern.
    * config/sparc/sparc.c (sparc_emit_stack_tie): New function.
    (sparc_expand_epilogue): Call it if function uses alloca.

gcc/testsuite/

    * gcc.target/sparc/sparc-ret-3.c: New test.
---
 gcc/config/sparc/sparc.c                     | 15
 gcc/config/sparc/sparc.md                    | 11 ++
 gcc/testsuite/gcc.target/sparc/sparc-ret-3.c | 53
 3 files changed, 79 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/sparc/sparc-ret-3.c

diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 6dfb269..345bc7f 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -5784,6 +5784,18 @@ sparc_asm_function_prologue (FILE *file, HOST_WIDE_INT size ATTRIBUTE_UNUSED)
   sparc_output_scratch_registers (file);
 }

+/* This ties together stack memory (MEM with an alias set of frame_alias_set)
+   and the change to the stack pointer.  */
+
+static void
+sparc_emit_stack_tie (void)
+{
+  rtx mem = gen_frame_mem (BLKmode,
+                           gen_rtx_REG (Pmode, STACK_POINTER_REGNUM));
+
+  emit_insn (gen_stack_tie (mem));
+}
+
 /* Expand the function epilogue, either normal or part of a sibcall.
    We emit all the instructions except the return or the call.  */

@@ -5792,6 +5804,9 @@ sparc_expand_epilogue (bool for_eh)
 {
   HOST_WIDE_INT size = sparc_frame_size;

+  if (cfun->calls_alloca)
+    sparc_emit_stack_tie ();
+
   if (sparc_n_global_fp_regs > 0)
     emit_save_or_restore_global_fp_regs (sparc_frame_base_reg,
                                          sparc_frame_base_offset
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 737bdb3..ef00fc5 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -94,6 +94,8 @@
   UNSPEC_ADDV
   UNSPEC_SUBV
   UNSPEC_NEGV
+
+  UNSPEC_TIE
 ])

 (define_c_enum "unspecv" [
@@ -8473,6 +8475,15 @@
   [(set_attr "type" "multi")
    (set_attr "length" "4")])

+;; This is used in sparc_expand_epilogue in order to prevent insns
+;; referencing the stack from being placed after the deallocation of
+;; the stack frame.
+(define_insn "stack_tie"
+  [(set (match_operand:BLK 0 "memory_operand" "+m")
+        (unspec:BLK [(match_dup 0)] UNSPEC_TIE))]
+  ""
+  ""
+  [(set_attr "length" "0")])

 ;; Vector instructions.

diff --git a/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c
new file mode 100644
index 000..7a151f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c
@@ -0,0 +1,53 @@
+/* PR target/80968 */
+/* { dg-do compile } */
+/* { dg-skip-if "no register windows" { *-*-* } { "-mflat" } { "" } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-mcpu=ultrasparc -O" } */
+
+/* Make sure references to the stack frame do not slip into the delay slot
+   of a return instruction.  */
+
+struct crypto_shash {
+        unsigned int descsize;
+};
+struct crypto_shash *tfm;
+
+struct shash_desc {
+        struct crypto_shash *tfm;
+        unsigned int flags;
+
+        void *__ctx[] __attribute__((aligned(8)));
+};
+
+static inline unsigned int crypto_shash_descsize(struct crypto_shash *tfm)
+{
+        return tfm->descsize;
+}
+
+static inline void *shash_desc_ctx(struct shash_desc *desc)
+{
+        return desc->__ctx;
+}
+
+#define SHASH_DESC_ON_STACK(shash, ctx)                           \
+        char __##shash##_desc[sizeof(struct shash_desc) +         \
+                crypto_shash_descsize(ctx)] __attribute__((aligned(8))); \
+        struct shash_desc *shash = (struct shash_desc *)__##shash##_desc
+
+extern int crypto_shash_update(struct shash_desc *, const void *, unsigned int);
+
+unsigned int bug(unsigned int crc, const void *address, unsigned int length)
+{
+        SHASH_DESC_ON_STACK(shash, tfm);
+        unsigned int *ctx = (unsigned int *)shash_desc_ctx(shash);
+        int err;
+
+        shash->tfm = tfm;
+        shash->flags = 0;
+        *ctx = crc;
+
+        err = crypto_shash_update(shash, address, length);
+
+        return *ctx;
+}
+/* { dg-final { scan-assembler "ld\[ \t\]*\\\[%i5\\+8\\\], %i0\n\[^\n\]*return\[ \t\]*%i7\\+8" } } */
--
2.1.2.532.g19b5d50
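For readers who want the shape of the problem without the kernel-derived macros, the following is a minimal sketch of the source pattern the new test exercises: a dynamically sized stack allocation, an opaque call that receives it, and a final load from that allocation used as the return value. It is only an illustration under stated assumptions. The names fake_update and read_back_after_call are hypothetical and do not appear in the patch, __builtin_alloca is a GCC/Clang builtin rather than standard C++, and running this on a host machine does not reproduce the SPARC scheduling bug; it only shows which load must stay ahead of the frame deallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for crypto_shash_update: an opaque call that may
// read and write the caller's dynamically allocated stack buffer.
static int fake_update(unsigned char *ctx, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i)
        ctx[i] = static_cast<unsigned char>(ctx[i] + 1);
    return 0;
}

// Same shape as bug() in the test: allocate with alloca, store a value,
// make a call, then load the return value from the alloca'd area as the
// last action before returning.
unsigned int read_back_after_call(unsigned int crc, std::size_t extra)
{
    std::size_t size = sizeof(unsigned int) + extra;
    unsigned char *buf =
        static_cast<unsigned char *>(__builtin_alloca(size));

    std::memset(buf, 0, size);
    std::memcpy(buf, &crc, sizeof crc);

    // The callee only touches the bytes after the first word.
    fake_update(buf + sizeof crc, extra);

    // Final load from the frame.  This is the kind of access that must
    // not be scheduled after the stack frame has been released.
    unsigned int result;
    std::memcpy(&result, buf, sizeof result);
    return result;
}
```

Because the function calls alloca, cfun->calls_alloca would be set for it inside GCC, which is the condition the patch uses to emit the tie.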
Re: [rs6000] Fix ICE with -fstack-limit-register and large frames
On Sat, Jun 03, 2017 at 12:34:21PM +0200, Eric Botcazou wrote:
> > Because you cannot during reload, or another reason?  We always use LRA
> > on powerpc nowadays, and LRA can deal with this.
>
> Because you cannot during prologue/epilogue generation.

Ah, this code is generated only then, I see now.

> > Only the first hunk (rs6000.md) applies, the rest is ignored (there is a
> > blank line here instead of a diff header).
>
> !??  The patch contains a single hunk for config/rs6000/rs6000.c.

The second hunk is the testcase.  I now see it isn't even part of the
patch, just pasted on.

I opened PR80966.

Thanks,


Segher
Re: [i386] __builtin_ia32_stmxcsr could be pure
Hello,

I don't think Richard's "sounds good" was meant as "ok to commit".  Does an
x86 maintainer want to approve or criticize the patch?
https://gcc.gnu.org/ml/gcc-patches/2017-05/msg02009.html

On Fri, 26 May 2017, Richard Biener wrote:

> On Fri, May 26, 2017 at 10:55 AM, Marc Glisse wrote:
>> Hello,
>>
>> glibc marks fegetround as a pure function.  On x86, people tend to use
>> _MM_GET_ROUNDING_MODE instead, which could benefit from the same.  I think
>> it is safe, but a second opinion would be welcome.
>
> Sounds good.  The important part is to keep the dependency to
> SET_ROUNDING_MODE which is done via claiming both touch global memory.
>
>> I could have handled just this builtin, but it seemed better to provide
>> def_builtin_pure (like "const" already has) since there should be other
>> builtins that can be marked this way (maybe the gathers?).
>
> Should work for gathers.  They could even use stronger guarantees,
> namely a fnspec with "..R" (the pointer argument is only read from
> directly).  Similarly scatter can use ".W" (the pointer argument is only
> written to directly).
>
> Richard.
>
>> Bootstrap+testsuite on x86_64-pc-linux-gnu with default languages.
>>
>> 2017-05-29  Marc Glisse
>>
>> gcc/
>>         * config/i386/i386.c (struct builtin_isa): New field pure_p.
>>         Reorder for compactness.
>>         (def_builtin, def_builtin2, ix86_add_new_builtins): Handle pure_p.
>>         (def_builtin_pure, def_builtin_pure2): New functions.
>>         (ix86_init_mmx_sse_builtins) [__builtin_ia32_stmxcsr]: Mark as pure.
>>
>> gcc/testsuite/
>>         * gcc.target/i386/getround.c: New file.
>>
>> --
>> Marc Glisse

--
Marc Glisse
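To make the "pure" discussion concrete, here is a small sketch of what the attribute promises at the source level. It is not the i386 builtin machinery from the patch: g_rounding_mode, get_rounding_mode and set_rounding_mode are hypothetical stand-ins for the MXCSR state and the two intrinsics, and __attribute__((pure)) is the GCC/Clang function attribute. A pure function may read global memory but has no side effects, so repeated calls can be merged as long as no store intervenes; the setter is deliberately left unmarked so that the dependency between the two is preserved.

```cpp
#include <cassert>

// Hypothetical stand-in for the rounding-control state.
static unsigned int g_rounding_mode = 0;

// Reads global state and has no side effects: a candidate for "pure".
__attribute__((pure)) unsigned int get_rounding_mode()
{
    return g_rounding_mode;
}

// Writes global state, so it must not be marked pure or const.
void set_rounding_mode(unsigned int mode)
{
    g_rounding_mode = mode;
}

// No store happens between the two reads, so the compiler is allowed
// to evaluate get_rounding_mode() once and reuse the result.
unsigned int read_twice()
{
    return get_rounding_mode() + get_rounding_mode();
}

// The write in the middle clobbers global memory, so the second read
// has to be performed again after it.
unsigned int read_around_write(unsigned int new_mode)
{
    unsigned int before = get_rounding_mode();
    set_rounding_mode(new_mode);
    unsigned int after = get_rounding_mode();
    return before * 16 + after;
}
```

Marking the getter "const" instead would be wrong in this sketch, since a const function may not depend on global memory and the write could then be ignored by the optimizer.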
Re: Reorganization of profile count maintenance code, part 1
On Thu, Jun 01, 2017 at 01:35:56PM +0200, Jan Hubicka wrote:

Just some very minor nits.

> Index: final.c
> ===
> --- final.c (revision 248684)
> +++ final.c (working copy)
> @@ -1951,9 +1951,11 @@ dump_basic_block_info (FILE *file, rtx_i
>        fprintf (file, "%s BLOCK %d", ASM_COMMENT_START, bb->index);
>        if (bb->frequency)
>          fprintf (file, " freq:%d", bb->frequency);
> -      if (bb->count)
> -        fprintf (file, " count:%" PRId64,
> -                 bb->count);
> +      if (bb->count.initialized_p ())
> +        {
> +          fprintf (file, " count");

Missing colon.
s/count"/count:"/

> Index: profile-count.h
> ===
> --- profile-count.h (revision 0)
> +++ profile-count.h (working copy)

> +/* Main data type to hold profile counters in GCC.  In most cases profile
> +   counts originate from profile feedback. They are 64bit integers
> +   representing number of executions during the train run.
> +   As the profile is maintained during the compilation, many adjustments are
> +   made.  Not all transformations can be made precisely, most importantly
> +   when code is being duplicated.  It also may happen that part of CFG has
> +   profile counts known while other does not - for example when LTO optimizing

s/does not/do not/;# i think

> +   partly profiled program or when profile was lost due to COMDAT merging.
> +
> +   For this information profile_count trakcs more information than

tracks

> +class GTY(()) profile_count
> +{
> +  /* Value of counters which has not been intiailized. Either becuase

initialized
because

> +     initializatoin did not happen yet or because profile is unknown.  */

initialization

> +  static profile_count uninitialized ()
> +    {
> +      profile_count c;
> +      c.m_val = -1;
> +      return c;
> +    }
> +
> +  /* The profiling runtime uses gcov_type, which is usually 64bit integer.
> +     Conversins back and forth are used to read the coverage and get it

Conversions

> +  profile_count += (const profile_count )

I think i saw a_count = a_count + something above and assumed you didn't
have a += operator.  Could thus use the terse form in the snipped code
above on the patch, maybe?

> +  /* Return *this * num / den.  */

Parameter names in comment in caps please.

> Index: tree-tailcall.c
> ===
> --- tree-tailcall.c (revision 248684)
> +++ tree-tailcall.c (working copy)
> @@ -767,12 +767,10 @@ adjust_return_value (basic_block bb, tre
>  /* Subtract COUNT and FREQUENCY from the basic block and it's
>     outgoing edge.  */
>  static void
> -decrease_profile (basic_block bb, gcov_type count, int frequency)
> +decrease_profile (basic_block bb, profile_count count, int frequency)
>  {
>    edge e;
> -  bb->count -= count;
> -  if (bb->count < 0)
> -    bb->count = 0;
> +  bb->count = bb->count - count;

That's one of the spots where i'd have expected use of operator -=, fwiw.

> Index: value-prof.c
> ===
> --- value-prof.c (revision 248684)
> +++ value-prof.c (working copy)
> @@ -588,8 +588,10 @@ free_histograms (struct function *fn)
>
>  static bool
>  check_counter (gimple *stmt, const char * name,
> -               gcov_type *count, gcov_type *all, gcov_type bb_count)
> +               gcov_type *count, gcov_type *all, profile_count bb_count_d)
>  {
> +  gcov_type bb_count = bb_count_d.to_gcov_type ();
> +  return true;

On purpose?

thanks,
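The review's point about compound operators is easier to see against a concrete type. The sketch below is not the profile_count class from the patch; count_sketch and from_int are hypothetical names, and the behaviours chosen here (an unknown operand makes the result unknown, subtraction clamps at zero as the removed "if (bb->count < 0) bb->count = 0;" did) are assumptions made for illustration. Only the -1 encoding of the uninitialized state and the initialized_p name are taken from the quoted hunks.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative counter type with an explicit "unknown" state.
class count_sketch
{
    std::int64_t m_val;   // -1 encodes "uninitialized / unknown"

public:
    count_sketch() : m_val(-1) {}

    static count_sketch uninitialized() { return count_sketch(); }

    static count_sketch from_int(std::int64_t v)
    {
        assert(v >= 0);
        count_sketch c;
        c.m_val = v;
        return c;
    }

    bool initialized_p() const { return m_val != -1; }
    std::int64_t value() const { return m_val; }

    // Assumption: adding an unknown count yields an unknown count.
    count_sketch &operator+=(const count_sketch &other)
    {
        if (!initialized_p() || !other.initialized_p())
            m_val = -1;
        else
            m_val += other.m_val;
        return *this;
    }

    // Assumption: subtraction saturates at zero instead of going
    // negative, which would otherwise collide with the -1 marker.
    count_sketch &operator-=(const count_sketch &other)
    {
        if (!initialized_p() || !other.initialized_p())
            m_val = -1;
        else
            m_val = m_val >= other.m_val ? m_val - other.m_val : 0;
        return *this;
    }
};
```

With such operators in place, a statement like "bb->count = bb->count - count;" could be written in the terse form "bb->count -= count;" without an explicit clamp at the call site.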
Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements
On Sat, Jun 03, 2017 at 09:25:31AM -0700, Jerry DeLisle wrote:
> On 06/03/2017 06:48 AM, Nicolas Koenig wrote:
> > Hello everyone,
> >
> > here is a version of the patch that includes a workaround for PR 80960. I have
> > also included a separate test case for the failure that Dominique detected. The
> > style issues should be fixed.
> >
> > Regression-tested. OK for trunk?
> >
>
> Yes, OK.

There still are plenty of coding-style issues (see below). Can you
please rectify them before committing?

Also you change gfc-internals.texi without a ChangeLog entry. I guess
this was an accident?

thanks,

$ contrib/check_GNU_style.sh /tmp/p9.diff

Blocks of 8 spaces should be replaced with tabs.
40:+break;
55:+return false;
61:+{
64:+ curr->block->next = NULL;
65:+ gfc_free_statements(curr);
70:+}
92:+ || ref->u.ar.dimen_type[i] != DIMEN_ELEMENT)
93:+return false;
98:+{
111:+ iters[i] = stack_top->iter;
116:+case EXPR_CONSTANT:
120:+ switch (start->value.op.op)
125:+ std::swap(start->value.op.op1, start->value.op.op2);
130:+ || start->value.op.op1->ref)
131:+ return false;
132:+ if (!stack_top || !stack_top->iter
135:+ return false;
146:+}
160:+continue;
163:+{
174:+ break;
214:+{
215:+ curr->next = prev->next->next;
216:+ prev->next = curr;
219:+{
220:+ curr->next = stack_top->code->block->next->next->next;
253:+{
254:+ first.prev =
260:+}

Trailing whitespace.
18:+
20:+
22:+
25:+static bool
28:+ gfc_code *curr;
44:+
94:+
106:+ if (!stack_top || !stack_top->iter
108:+ iters[i] = NULL;
128:+ if ((start->value.op.op1->expr_type!= EXPR_VARIABLE
132:+ if (!stack_top || !stack_top->iter
133:+ || stack_top->iter->var->symtree
136:+ iters[i] = stack_top->iter;
152:+ new_e->rank = future_rank;
176:+ new_e->ref->u.ar.dimen_type[i] = DIMEN_RANGE;
218:+ else
244:+
249:+

Dot, space, space, new sentence.
17:+ optimize by replacing do loops with their analog array slices. For example:

There should be exactly one space between function name and parenthesis.
26:+traverse_io_block(gfc_code *code, bool *has_reached, gfc_code *prev)
60:+ if (traverse_io_block(curr->block->next, has_reached, prev))
65:+ gfc_free_statements(curr);
74:+ gcc_assert(curr->op == EXEC_TRANSFER);
96:+ gfc_simplify_expr(start, 0);
125:+ std::swap(start->value.op.op1, start->value.op.op2);
126:+ gcc_fallthrough();
150:+ new_e = gfc_copy_expr(curr->expr1);
154:+new_e->shape = gfc_get_shape(new_e->rank);
165:+ gfc_internal_error("bad expression");
170:+ gfc_free_expr(new_e->ref->u.ar.start[i]);
171:+ new_e->ref->u.ar.start[i] = gfc_copy_expr(iters[i]->start);
172:+ new_e->ref->u.ar.end[i] = gfc_copy_expr(iters[i]->end);
173:+ new_e->ref->u.ar.stride[i] = gfc_copy_expr(iters[i]->step);
178:+ gfc_free_expr(new_e->ref->u.ar.start[i]);
179:+ expr = gfc_copy_expr(start);
180:+ expr->value.op.op1 = gfc_copy_expr(iters[i]->start);
182:+ gfc_simplify_expr(new_e->ref->u.ar.start[i], 0);
183:+ expr = gfc_copy_expr(start);
184:+ expr->value.op.op1 = gfc_copy_expr(iters[i]->end);
186:+ gfc_simplify_expr(new_e->ref->u.ar.end[i], 0);
187:+ switch(start->value.op.op)
191:+ new_e->ref->u.ar.stride[i] = gfc_copy_expr(iters[i]->step);
194:+ expr = gfc_copy_expr(start);
195:+ expr->value.op.op1 = gfc_copy_expr(iters[i]->step);
197:+ gfc_simplify_expr(new_e->ref->u.ar.stride[i], 0);
200:+ gfc_internal_error("bad op");
204:+ gfc_internal_error("bad expression");
258:+ traverse_io_block((*curr)->block->next, , prev);

>
> Thanks for the work.
>
> Jerry
Re: [PATCH 0/5 v3] Vect peeling cost model
> No regressions on s390x, x86-64 and ppc64.  Bootstrapped.

Patch 6 breaks no-vfa-vect-57.c on powerpc.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements
On 06/03/2017 06:48 AM, Nicolas Koenig wrote:
> Hello everyone,
>
> here is a version of the patch that includes a workaround for PR 80960. I have
> also included a separate test case for the failure that Dominique detected. The
> style issues should be fixed.
>
> Regression-tested. OK for trunk?
>

Yes, OK.

Thanks for the work.

Jerry
Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements
Hello everyone,

here is a version of the patch that includes a workaround for PR 80960. I have
also included a separate test case for the failure that Dominique detected. The
style issues should be fixed.

Regression-tested. OK for trunk?

Nicolas

Changelog:
2017-06-03  Nicolas Koenig

        PR fortran/35339
        * frontend-passes.c (traverse_io_block): New function.
        (simplify_io_impl_do): New function.
        (optimize_namespace): Invoke gfc_code_walker with
        simplify_io_impl_do.

2017-06-03  Nicolas Koenig

        PR fortran/35339
        * gfortran.dg/implied_do_io_1.f90: New Test.
        * gfortran.dg/implied_do_io_2.f90: New Test.

Index: frontend-passes.c
===
--- frontend-passes.c (revision 248553)
+++ frontend-passes.c (working copy)
@@ -1064,6 +1064,263 @@ convert_elseif (gfc_code **c, int *walk_subtrees A
   return 0;
 }

+struct do_stack
+{
+  struct do_stack *prev;
+  gfc_iterator *iter;
+  gfc_code *code;
+} *stack_top;
+
+/* Recursively traverse the block of a WRITE or READ statement, and maybe
+   optimize by replacing do loops with their analog array slices. For example:
+
+     write (*,*) (a(i), i=1,4)
+
+   is replaced with
+
+     write (*,*) a(1:4:1) .  */
+
+static bool
+traverse_io_block(gfc_code *code, bool *has_reached, gfc_code *prev)
+{
+  gfc_code *curr;
+  gfc_expr *new_e, *expr, *start;
+  gfc_ref *ref;
+  struct do_stack ds_push;
+  int i, future_rank = 0;
+  gfc_iterator *iters[GFC_MAX_DIMENSIONS];
+  gfc_expr *e;
+
+  /* Find the first transfer/do statement.  */
+  for (curr = code; curr; curr = curr->next)
+    {
+      if (curr->op == EXEC_DO || curr->op == EXEC_TRANSFER)
+        break;
+    }
+
+  /* Ensure it is the only transfer/do statement because cases like
+
+     write (*,*) (a(i), b(i), i=1,4)
+
+     cannot be optimized.  */
+
+  if (!curr || curr->next)
+    return false;
+
+  if (curr->op == EXEC_DO)
+    {
+      if (curr->ext.iterator->var->ref)
+        return false;
+      ds_push.prev = stack_top;
+      ds_push.iter = curr->ext.iterator;
+      ds_push.code = curr;
+      stack_top = &ds_push;
+      if (traverse_io_block(curr->block->next, has_reached, prev))
+        {
+          if (curr != stack_top->code && !*has_reached)
+            {
+              curr->block->next = NULL;
+              gfc_free_statements(curr);
+            }
+          else
+            *has_reached = true;
+          return true;
+        }
+      return false;
+    }
+
+  gcc_assert(curr->op == EXEC_TRANSFER);
+
+  /* FIXME: Workaround for PR 80945 - array slices with deferred character
+     lengths do not work.  Remove this section when the PR is fixed.  */
+  e = curr->expr1;
+  if (e->expr_type == EXPR_VARIABLE && e->ts.type == BT_CHARACTER
+      && e->ts.deferred)
+    return false;
+  /* End of section to be removed.  */
+
+  ref = e->ref;
+  if (!ref || ref->type != REF_ARRAY || ref->u.ar.codimen != 0 || ref->next)
+    return false;
+
+  /* Find the iterators belonging to each variable and check conditions.  */
+  for (i = 0; i < ref->u.ar.dimen; i++)
+    {
+      if (!ref->u.ar.start[i] || ref->u.ar.start[i]->ref
+          || ref->u.ar.dimen_type[i] != DIMEN_ELEMENT)
+        return false;
+
+      start = ref->u.ar.start[i];
+      gfc_simplify_expr(start, 0);
+      switch (start->expr_type)
+        {
+        case EXPR_VARIABLE:
+
+          /* write (*,*) (a(i), i=a%b,1) not handled yet.  */
+          if (start->ref)
+            return false;
+
+          /* Check for (a(k), i=1,4) or ((a(j, i), i=1,4), j=1,4).  */
+          if (!stack_top || !stack_top->iter
+              || stack_top->iter->var->symtree != start->symtree)
+            iters[i] = NULL;
+          else
+            {
+              iters[i] = stack_top->iter;
+              stack_top = stack_top->prev;
+              future_rank++;
+            }
+          break;
+        case EXPR_CONSTANT:
+          iters[i] = NULL;
+          break;
+        case EXPR_OP:
+          switch (start->value.op.op)
+            {
+            case INTRINSIC_PLUS:
+            case INTRINSIC_TIMES:
+              if (start->value.op.op1->expr_type != EXPR_VARIABLE)
+                std::swap(start->value.op.op1, start->value.op.op2);
+              gcc_fallthrough();
+            case INTRINSIC_MINUS:
+              if ((start->value.op.op1->expr_type!= EXPR_VARIABLE
+                   && start->value.op.op2->expr_type != EXPR_CONSTANT)
+                  || start->value.op.op1->ref)
+                return false;
+              if (!stack_top || !stack_top->iter
+                  || stack_top->iter->var->symtree
+                  != start->value.op.op1->symtree)
+                return false;
+              iters[i] = stack_top->iter;
+              stack_top = stack_top->prev;
+              break;
+            default:
+              return false;
+            }
+          future_rank++;
+          break;
+        default:
+          return false;
+        }
+    }
+
+  /* Create new expr.  */
+  new_e = gfc_copy_expr(curr->expr1);
+  new_e->expr_type = EXPR_VARIABLE;
+  new_e->rank = future_rank;
+  if (curr->expr1->shape)
+    new_e->shape =
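The index arithmetic behind the transformation can be checked in isolation. The sketch below is not gfortran code; triplet, index_op, slice_for, enumerate and indices_via_loop are hypothetical names, and the mapping is an assumption inferred from the visible part of the patch and the style-check excerpt in the review (start and end get the operator applied with the loop's start and end, the stride is the loop step for + and - and is scaled for *). It maps an implied do loop "i = start, end, step" with index expression "i op c" onto an array-section triplet and compares the elements selected by both forms.

```cpp
#include <cassert>
#include <vector>

// Fortran-style lower:upper:stride triplet (also used for the loop bounds).
struct triplet { long start, end, step; };

enum class index_op { plus, minus, times };

// Map the loop "i = it.start, it.end, it.step" with index "i op c" onto
// an array section.  For times, c must be non-zero, otherwise the stride
// would be zero.
triplet slice_for(const triplet &it, index_op op, long c)
{
    switch (op) {
    case index_op::plus:
        return { it.start + c, it.end + c, it.step };
    case index_op::minus:
        return { it.start - c, it.end - c, it.step };
    case index_op::times:
        return { it.start * c, it.end * c, it.step * c };
    }
    return it;
}

// Enumerate the values selected by start:end:step (inclusive bounds).
std::vector<long> enumerate(const triplet &t)
{
    std::vector<long> out;
    if (t.step > 0)
        for (long i = t.start; i <= t.end; i += t.step)
            out.push_back(i);
    else if (t.step < 0)
        for (long i = t.start; i >= t.end; i += t.step)
            out.push_back(i);
    return out;
}

// Reference: run the loop and evaluate the index expression each time.
std::vector<long> indices_via_loop(const triplet &it, index_op op, long c)
{
    std::vector<long> out;
    for (long i : enumerate(it)) {
        switch (op) {
        case index_op::plus:  out.push_back(i + c); break;
        case index_op::minus: out.push_back(i - c); break;
        case index_op::times: out.push_back(i * c); break;
        }
    }
    return out;
}
```

For example, the loop 1,7,2 with index i*3 visits elements 3, 9, 15, 21, which is exactly the section 3:21:6.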
Re: [rs6000] Fix ICE with -fstack-limit-register and large frames
> Because you cannot during reload, or another reason?  We always use LRA
> on powerpc nowadays, and LRA can deal with this.

Because you cannot during prologue/epilogue generation.

> Only the first hunk (rs6000.md) applies, the rest is ignored (there is a
> blank line here instead of a diff header).

!??  The patch contains a single hunk for config/rs6000/rs6000.c.

--
Eric Botcazou
Update baseline symbols for powerpc-linux
Committed. Andreas. * config/abi/post/powerpc-linux-gnu/baseline_symbols.txt: Update. diff --git a/libstdc++-v3/config/abi/post/powerpc-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/powerpc-linux-gnu/baseline_symbols.txt index 742df2f20d..79bee650a2 100644 --- a/libstdc++-v3/config/abi/post/powerpc-linux-gnu/baseline_symbols.txt +++ b/libstdc++-v3/config/abi/post/powerpc-linux-gnu/baseline_symbols.txt @@ -444,6 +444,7 @@ FUNC:_ZNKSt13basic_fstreamIwSt11char_traitsIwEE7is_openEv@GLIBCXX_3.4 FUNC:_ZNKSt13basic_istreamIwSt11char_traitsIwEE6gcountEv@@GLIBCXX_3.4 FUNC:_ZNKSt13basic_istreamIwSt11char_traitsIwEE6sentrycvbEv@@GLIBCXX_3.4 FUNC:_ZNKSt13basic_ostreamIwSt11char_traitsIwEE6sentrycvbEv@@GLIBCXX_3.4 +FUNC:_ZNKSt13random_device13_M_getentropyEv@@GLIBCXX_3.4.24 FUNC:_ZNKSt13runtime_error4whatEv@@GLIBCXX_3.4 FUNC:_ZNKSt14basic_ifstreamIcSt11char_traitsIcEE5rdbufEv@@GLIBCXX_3.4 FUNC:_ZNKSt14basic_ifstreamIcSt11char_traitsIcEE7is_openEv@@GLIBCXX_3.4.5 @@ -1726,6 +1727,7 @@ FUNC:_ZNSsC1EPKcRKSaIcE@@GLIBCXX_3.4 FUNC:_ZNSsC1EPKcjRKSaIcE@@GLIBCXX_3.4 FUNC:_ZNSsC1ERKSaIcE@@GLIBCXX_3.4 FUNC:_ZNSsC1ERKSs@@GLIBCXX_3.4 +FUNC:_ZNSsC1ERKSsjRKSaIcE@@GLIBCXX_3.4.23 FUNC:_ZNSsC1ERKSsjj@@GLIBCXX_3.4 FUNC:_ZNSsC1ERKSsjjRKSaIcE@@GLIBCXX_3.4 FUNC:_ZNSsC1ESt16initializer_listIcERKSaIcE@@GLIBCXX_3.4.11 @@ -1739,6 +1741,7 @@ FUNC:_ZNSsC2EPKcRKSaIcE@@GLIBCXX_3.4 FUNC:_ZNSsC2EPKcjRKSaIcE@@GLIBCXX_3.4 FUNC:_ZNSsC2ERKSaIcE@@GLIBCXX_3.4 FUNC:_ZNSsC2ERKSs@@GLIBCXX_3.4 +FUNC:_ZNSsC2ERKSsjRKSaIcE@@GLIBCXX_3.4.23 FUNC:_ZNSsC2ERKSsjj@@GLIBCXX_3.4 FUNC:_ZNSsC2ERKSsjjRKSaIcE@@GLIBCXX_3.4 FUNC:_ZNSsC2ESt16initializer_listIcERKSaIcE@@GLIBCXX_3.4.11 @@ -2382,6 +2385,7 @@ FUNC:_ZNSt15_List_node_base8transferEPS_S0_@@GLIBCXX_3.4 FUNC:_ZNSt15_List_node_base9_M_unhookEv@@GLIBCXX_3.4.14 FUNC:_ZNSt15__exception_ptr13exception_ptr4swapERS0_@@CXXABI_1.3.3 FUNC:_ZNSt15__exception_ptr13exception_ptrC1EMS0_FvvE@@CXXABI_1.3.3 +FUNC:_ZNSt15__exception_ptr13exception_ptrC1EPv@@CXXABI_1.3.11 
FUNC:_ZNSt15__exception_ptr13exception_ptrC1ERKS0_@@CXXABI_1.3.3 FUNC:_ZNSt15__exception_ptr13exception_ptrC1Ev@@CXXABI_1.3.3 FUNC:_ZNSt15__exception_ptr13exception_ptrC2EMS0_FvvE@@CXXABI_1.3.3 @@ -2967,7 +2971,9 @@ FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE10_M_disposeEv@@GLIBCX FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE10_M_replaceEjjPKcj@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE10_S_compareEjj@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE11_M_capacityEj@@GLIBCXX_3.4.21 +FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC1EPcOS3_@@GLIBCXX_3.4.23 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC1EPcRKS3_@@GLIBCXX_3.4.21 +FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC2EPcOS3_@@GLIBCXX_3.4.23 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC2EPcRKS3_@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructEjc@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIN9__gnu_cxx17__normal_iteratorIPKcS4_vT_SB_St20forward_iterator_tag@@GLIBCXX_3.4.21 @@ -3066,6 +3072,7 @@ FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1EPKcjRKS3_@@GLIBCXX_ FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS3_@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_RKS3_@@GLIBCXX_3.4.21 +FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_jRKS3_@@GLIBCXX_3.4.23 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_jj@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_jjRKS3_@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ESt16initializer_listIcERKS3_@@GLIBCXX_3.4.21 @@ -3081,6 +3088,7 @@ 
FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2EPKcjRKS3_@@GLIBCXX_ FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS3_@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_RKS3_@@GLIBCXX_3.4.21 +FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_jRKS3_@@GLIBCXX_3.4.23 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_jj@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_jjRKS3_@@GLIBCXX_3.4.21 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ESt16initializer_listIcERKS3_@@GLIBCXX_3.4.21 @@ -3106,7 +3114,9 @@ FUNC:_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE10_M_disposeEv@@GLIBCX FUNC:_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE10_M_replaceEjjPKwj@@GLIBCXX_3.4.21
Re: About hang in gcov_exit with gnu arm toolchain
Hi Nathan,

Please see comments inline.

> On 02-Jun-2017, at 5:14 PM, Nathan Sidwell wrote:
>
> hi,
>
>> I have opened this bug
>> https://bugs.launchpad.net/gcc-arm-embedded/+bug/1694644 as per the findings
>> I had.
>
> This is Canonical's bug tracker and you seem to be reporting a defect with
> their build of gcc.  If that is the case, you should be talking with
> Canonical.

As you can see in the bug, they have redirected me to talk to gcc. Please
advise on how to proceed.

>
> The gcc bug tracker is https://gcc.gnu.org/bugzilla/.
>
>
> The diff you provide there seems to be detecting when you've looped 2^32
> times, because somethings scrogging the object list to become circular.
> That's the actual defect that needs fixing.

I studied the gcc/ and libgcc/ code. I cannot make out why this is
happening, though the debugging session does show this clearly. What I am
trying to do with the patch is to show what was required to make things
work; however, as you said it is not the fix for the core issue.

About 2^32 times: yes if we had 2^32 files in any given project, it would
definitely iterate that many times and fail. In what I saw with my
debugging, the circular list contains as many elements as the number of
source files in a project, and the loop iterates that many times. There
could be a better way to iterate through a circular list, though.

>
> nathan
>
> --
> Nathan Sidwell
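On the remark that there could be a better way to iterate through a circular list than counting iterations, one common approach is to stop when the walk arrives back at the head. The sketch below illustrates only that idea; node and count_nodes are hypothetical, the real gcov_info structure and the libgcov walk are not reproduced here, and this does not address why the list became circular in the first place, which is the defect Nathan points at.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical list node, standing in for a per-object-file record.
struct node {
    int id;
    node *next;
};

// Visit every node exactly once, whether the list ends in a null
// pointer or its last node points back to the head.
// Limitation: a cycle that re-enters the list somewhere other than the
// head would still loop forever; detecting that needs a general cycle
// check such as a slow/fast pointer pair.
std::size_t count_nodes(const node *head)
{
    std::size_t n = 0;
    for (const node *p = head; p != nullptr; p = p->next) {
        ++n;
        if (p->next == head)
            break;          // wrapped around: every node has been seen
    }
    return n;
}
```

With this termination rule the number of iterations equals the number of list elements in both the linear and the head-circular case, with no dependence on a counter overflowing.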
Re: [PATCH][4/4] SLP induction vectorization
On June 3, 2017 1:38:14 AM GMT+02:00, Michael Meissner wrote:
>On Fri, Jun 02, 2017 at 03:22:27PM +0200, Richard Biener wrote:
>>
>> This implements vectorization of SLP inductions (in the not outer loop
>> vectorization case for now).
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>>
>> More testing is appreciated, I'm throwing it at SPEC2k6 now.
>
>I was going to apply to the PowerPC and do a spec run but the patch for
>tree-vect-loop.c doesn't apply.  Could you regenerate the patch, and also
>tell me the subversion id it is based off of?

Hum, it should apply cleanly (maybe one rejected hunk that I already
committed separately).  I'll double check on Tuesday.

Richard.

>Thanks in advance.