On Wed, 7 Aug 2024, Jakub Jelinek wrote: > Hi! > > SRA adds fancy names like offset$D94316$_M_impl$D93629$_M_start > where the numbers in there are DECL_UIDs if there are unnamed > FIELD_DECLs etc. > Because -g0 vs. -g can cause differences between the exact DECL_UID > values (add bigger gaps in between them, corresponding decls should > still be ordered the same based on DECL_UID) we make sure such > decls have DECL_NAMELESS set and depending on exact options either don't > dump such names at all or dump_fancy_name sanitizes the D123456$ parts in > there to Dxxxx$. > Unfortunately in tons of places we then use get_name to grab either user > names or these SRA created names and use that as argument to > create_tmp_var{,_name,_raw} to base other artificial temporary names based > on that. Those are DECL_NAMELESS too, but unfortunately create_tmp_var_name > starting with > https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=725494f6e4121eace43b7db1202f8ecbf52a8276 > calls clean_symbol_name which replaces the $s in there with _s and thus > dump_fancy_name doesn't sanitize it anymore. > > I don't see any discussion of that commit (originally to TM branch, later > merged) on the mailing list, but from > DECL_NAME (new_decl) > = create_tmp_var_name (IDENTIFIER_POINTER (DECL_NAME (old_decl))); > - SET_DECL_ASSEMBLER_NAME (new_decl, NULL_TREE); > + SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl)); > snippet elsewhere in that commit it seems create_tmp_var_name was used at > that point also to determine function names of clones, so presumably the > clean_symbol_name at that point was to ensure the symbol could be emitted > into assembly, maybe in case DECL_NAME is something like C++ operators or > whatever could have there undesirable characters. > > Anyway, we don't do that for years anymore, already GCC 4.5 uses for such > purposes clone_function_name which starts of DECL_ASSEMBLER_NAME of the old > function and appends based on supportable symbol suffix separators the > separator and some suffix and/or number, so that part doesn't go through > create_tmp_var_name. > > I don't see problems with having the $ and . etc. characters in the names > intended just to make dumps more readable, after all, we already are using > those in the SRA created names. Those names shouldn't make it into the > assembly in any way, neither debug info nor assembly labels. > > There is one theoretical case, where the gimplifier promotes automatic > vars into TREE_STATIC ones and therefore those can then appear in assembly, > just in case it would be on e.g. SRA created names and regimplified later > I've added code to ignore the names and force C.NNN if it is a DECL_NAMELESS > with problematic characters in the name. > > Richi mentioned on IRC that the non-cleaned up names might make things > harder to feed stuff back to the GIMPLE FE, but if so, I think it should be > the dumping for GIMPLE FE purposes that cleans those up (but at that point > it should also verify if some such cleaned up names don't collide with > others and somehow deal with those).
My plan was to accept an additional character in identifiers as extension with -fgimple and use that. I think we already accept dollars but dots are of course problematic and cannot be used. Replacing unwanted chars with $ and pre-existing $ with $$ might work. There's not many (ASCII) characters available, @ might be. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > The -fcompare-debug failure on the testcase is gone, but the testcase > was huge and hard to reduce. > > 2024-08-06 Jakub Jelinek <ja...@redhat.com> > > PR c++/116219 > * gimple-expr.cc (remove_suffix): Formatting fixes. > (create_tmp_var_name): Don't call clean_symbol_name. > * gimplify.cc (gimplify_init_constructor): When promoting automatic > DECL_NAMELESS vars to static, only preserve their DECL_NAME if > it doesn't contain any characters clean_symbol_name replaces. > > --- gcc/gimple-expr.cc.jj 2024-01-03 11:51:28.280776310 +0100 > +++ gcc/gimple-expr.cc 2024-08-06 14:43:42.328673383 +0200 > @@ -406,14 +406,12 @@ remove_suffix (char *name, int len) > { > int i; > > - for (i = 2; i < 7 && len > i; i++) > - { > - if (name[len - i] == '.') > - { > - name[len - i] = '\0'; > - break; > - } > - } > + for (i = 2; i < 7 && len > i; i++) > + if (name[len - i] == '.') > + { > + name[len - i] = '\0'; > + break; > + } > } > > /* Create a new temporary name with PREFIX. Return an identifier. */ > @@ -430,8 +428,6 @@ create_tmp_var_name (const char *prefix) > char *preftmp = ASTRDUP (prefix); > > remove_suffix (preftmp, strlen (preftmp)); > - clean_symbol_name (preftmp); > - > prefix = preftmp; > } > > --- gcc/gimplify.cc.jj 2024-08-05 13:04:53.903116091 +0200 > +++ gcc/gimplify.cc 2024-08-06 15:27:40.404865291 +0200 > @@ -5599,6 +5599,18 @@ gimplify_init_constructor (tree *expr_p, > > DECL_INITIAL (object) = ctor; > TREE_STATIC (object) = 1; > + if (DECL_NAME (object) && DECL_NAMELESS (object)) > + { > + const char *name = get_name (object); > + char *sname = ASTRDUP (name); > + clean_symbol_name (sname); > + /* If there are any undesirable characters in DECL_NAMELESS > + name, just fall back to C.nnn name, we must ensure e.g. > + SRA created names with DECL_UIDs don't make it into > + assembly. */ > + if (strcmp (name, sname)) > + DECL_NAME (object) = NULL_TREE; > + } Did you actually run into such a case? I'd expect gimplify_init_constructor only happening on the original GENERIC IL. In any case how about the simpler if (!DECL_NAME (object) || DECL_NAMELESS (object)) DECL_NAME (object) = create_tmp_var_name ("C"); ? Otherwise looks OK to me. I guess this should get quite some soaking time before backporting (if that was intended). Thanks, Richard. > if (!DECL_NAME (object)) > DECL_NAME (object) = create_tmp_var_name ("C"); > walk_tree (&DECL_INITIAL (object), force_labels_r, NULL, NULL); > > Jakub >