On Thu, 13 Feb 2025, Jakub Jelinek wrote: > Hi! > > The following testcase ICEs, because we have multiple levels of > DECL_VALUE_EXPR VAR_DECLs: > character(kind=1) id_string[1:.id_string] [value-expr: *id_string.55]; > character(kind=1)[1:.id_string] * id_string.55 [value-expr: > FRAME.107.id_string.55]; > integer(kind=8) .id_string [value-expr: FRAME.107..id_string]; > id_string is the user variable mentioned in BLOCK_VARS, it has > DECL_VALUE_EXPR because it is a VLA, id_string.55 is a temporary created by > gimplify_vla_decl as the address that points to the start of the VLA, what > is normally used in the IL to access it. But as this artificial var is then > used inside of a nested function, tree-nested.cc adds DECL_VALUE_EXPR to it > too and moves the actual value into the FRAME.107 object's member. > Now, remove_unused_locals removes id_string.55 (and various other VAR_DECLs) > from cfun->local_decls, simply because it is not mentioned in the IL at all > (neither is id_string itself, but that is kept in BLOCK_VARS as it has > DECL_VALUE_EXPR). So, after this point, id_string.55 tree isn't referenced > from > anywhere but id_string's DECL_VALUE_EXPR. Next GC collection is triggered, > and we are unlucky enough that in the value_expr_for_decl hash table > (underlying hash map for DECL_VALUE_EXPR) the id_string.55 entry comes > before the id_string entry. id_string is ggc_marked_p because it is > referenced from BLOCK_VARS, but id_string.55 is not, as we don't mark > DECL_VALUE_EXPR anywhere but by gt_cleare_cache on value_expr_for_decl. > But gt_cleare_cache does two things, it calls clear_slots on entries > where the key is not ggc_marked_p (so the id_string.55 mapping to > FRAME.107.id_string.55 is lost and DECL_VALUE_EXPR (id_string.55) becomes > NULL) but then later we see id_string entry, which is ggc_marked_p, so mark > the whole hash table entry, which sets ggc_set_mark on id_string.55. But > at this point its DECL_VALUE_EXPR is lost. > Later during dwarf2out.cc we want to emit DW_AT_location for id_string, see > it has DECL_VALUE_EXPR, so emit it as indirection of id_string.55 for which > we again lookup DECL_VALUE_EXPR as it has DECL_HAS_VALUE_EXPR_P, but as it > is NULL, we ICE, instead of finding it is a subobject of FRAME.107 for which > we can find its stack location. > > Now, as can be seen in the PR, I've tried to tweak tree-ssa-live.cc so that > it would keep id_string.55 in cfun->local_decls; that prohibits it from > the DECL_VALUE_EXPR of it being GC until expansion, but then we shrink and > free cfun->local_decls completely and so GC at that point still can throw > it away. > > The following patch adds an extension to the GTY ((cache)) option, before > calling the gt_cleare_cache on some hash table by specifying > GTY ((cache ("somefn"))) it calls somefn on that hash table as well. > And this extra hook can do any additional ggc_set_mark needed so that > gt_cleare_cache preserves everything that is actually needed and throws > away the rest. > > In order to make it just 2 pass rather than up to n passes - (if we had > say > id1 -> something, id2 -> x(id1), id3 -> x(id2), id4 -> x(id3), id5 -> x(id4) > in the value_expr_for_decl hash table in that order (where idN are VAR_DECLs > with DECL_HAS_VALUE_EXPR_P, id5 is the only one mentioned from outside and > idN -> X stands for idN having DECL_VALUE_EXPR X, something for some > arbitrary tree and x(idN) for some arbitrary tree which mentions idN > variable) and in each pass just marked the to part of entries with > ggc_marked_p base.from we'd need to repeat until we don't mark anything) > the patch calls walk_tree on DECL_VALUE_EXPR of the marked trees and if it > finds yet unmarked tree, it marks it and walks its DECL_VALUE_EXPR as well > the same way. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
So what this basically does is ensure we mark DECL_VALUE_EXPR when VAR is marked which isn't done when marking a tree node. That you special-case the hashtable walker is a workaround for us not being able to say struct GTY((mark_extra_stuff)) tree_decl_with_vis { on 'tree' (or specifically the structs for a VAR_DECL). And that we rely on gengtype producing the 'tree' marker. So we rely on the hashtable keeping referenced trees live. OK. Thanks, Richard. > 2025-02-13 Jakub Jelinek <ja...@redhat.com> > > PR debug/118790 > * gengtype.cc (write_roots): Remove cache variable, instead break from > the loop on match and test o for NULL. If the cache option has > non-empty string argument, call the specified function with v->name > as argument before calling gt_cleare_cache on it. > * tree.cc (gt_value_expr_mark_2, gt_value_expr_mark_1, > gt_value_expr_mark): New functions. > (value_expr_for_decl): Use GTY ((cache ("gt_value_expr_mark"))) rather > than just GTY ((cache)). > * doc/gty.texi (cache): Document optional argument of cache option. > > * gfortran.dg/gomp/pr118790.f90: New test. > > --- gcc/gengtype.cc.jj 2025-01-02 11:23:02.613710956 +0100 > +++ gcc/gengtype.cc 2025-02-12 17:15:08.560424329 +0100 > @@ -4656,13 +4656,12 @@ write_roots (pair_p variables, bool emit > outf_p f = get_output_file_with_visibility (CONST_CAST (input_file*, > v->line.file)); > struct flist *fli; > - bool cache = false; > options_p o; > > for (o = v->opt; o; o = o->next) > if (strcmp (o->name, "cache") == 0) > - cache = true; > - if (!cache) > + break; > + if (!o) > continue; > > for (fli = flp; fli; fli = fli->next) > @@ -4677,6 +4676,8 @@ write_roots (pair_p variables, bool emit > oprintf (f, " ()\n{\n"); > } > > + if (o->kind == OPTION_STRING && o->info.string && o->info.string[0]) > + oprintf (f, " %s (%s);\n", o->info.string, v->name); > oprintf (f, " gt_cleare_cache (%s);\n", v->name); > } > > --- gcc/tree.cc.jj 2025-01-20 10:26:42.216048422 +0100 > +++ gcc/tree.cc 2025-02-12 18:05:44.368264018 +0100 > @@ -211,13 +211,61 @@ struct cl_option_hasher : ggc_cache_ptr_ > > static GTY ((cache)) hash_table<cl_option_hasher> *cl_option_hash_table; > > +/* Callback called through walk_tree_1 to discover DECL_HAS_VALUE_EXPR_P > + VAR_DECLs which weren't marked yet, in that case marks them and > + walks their DECL_VALUE_EXPR expressions. */ > + > +static tree > +gt_value_expr_mark_2 (tree *tp, int *, void *data) > +{ > + tree t = *tp; > + if (VAR_P (t) && DECL_HAS_VALUE_EXPR_P (t) && !ggc_set_mark (t)) > + { > + tree dve = DECL_VALUE_EXPR (t); > + walk_tree_1 (&dve, gt_value_expr_mark_2, data, > + (hash_set<tree> *) data, NULL); > + } > + return NULL_TREE; > +} > + > +/* Callback called through traverse_noresize on the > + value_expr_for_decl hash table. */ > + > +int > +gt_value_expr_mark_1 (tree_decl_map **e, hash_set<tree> *pset) > +{ > + if (ggc_marked_p ((*e)->base.from)) > + walk_tree_1 (&(*e)->to, gt_value_expr_mark_2, pset, pset, NULL); > + return 1; > +} > + > +/* The value_expr_for_decl hash table can have mappings for trees > + which are only referenced from mappings of other trees in the > + same table, see PR118790. Without this routine, gt_cleare_cache > + could clear hash table slot of a tree which isn't marked but > + will be marked when processing later hash table slot of another > + tree which is marked. This function marks with the above > + helpers marks all the not yet marked DECL_HAS_VALUE_EXPR_P > + VAR_DECLs mentioned in DECL_VALUE_EXPR expressions of marked > + trees and in that case also recurses on their DECL_VALUE_EXPR. */ > + > +void > +gt_value_expr_mark (hash_table<tree_decl_map_cache_hasher> *h) > +{ > + if (!h) > + return; > + > + hash_set<tree> pset; > + h->traverse_noresize<hash_set<tree> *, gt_value_expr_mark_1> (&pset); > +} > + > /* General tree->tree mapping structure for use in hash tables. */ > > > static GTY ((cache)) > hash_table<tree_decl_map_cache_hasher> *debug_expr_for_decl; > > -static GTY ((cache)) > +static GTY ((cache ("gt_value_expr_mark"))) > hash_table<tree_decl_map_cache_hasher> *value_expr_for_decl; > > static GTY ((cache)) > --- gcc/doc/gty.texi.jj 2025-01-02 11:47:30.411219590 +0100 > +++ gcc/doc/gty.texi 2025-02-12 18:11:53.048139879 +0100 > @@ -313,6 +313,11 @@ Note that caches should generally use @c > @code{cache} is only preferable if the value is impractical to > recompute from the key when needed. > > +The @code{cache} option can have an optional argument, name of the function > +which should be called before @samp{gt_cleare_cache}. This can be useful > +if the hash table needs to be traversed and mark some pointers before > +@samp{gt_cleare_cache} could clear slots in it. > + > @findex deletable > @item deletable > > --- gcc/testsuite/gfortran.dg/gomp/pr118790.f90.jj 2025-02-12 > 18:19:14.799998043 +0100 > +++ gcc/testsuite/gfortran.dg/gomp/pr118790.f90 2025-02-12 > 18:18:43.606431737 +0100 > @@ -0,0 +1,182 @@ > +! PR debug/118790 > +! { dg-do compile } > +! { dg-options "-O2 -g -fopenmp --param ggc-min-expand=0 --param > ggc-min-heapsize=0" } > + > +module ec_args_mod > + private > + public :: ec_argc, ec_argv, ec_args > + interface > + function ec_argc() bind(c,name="ec_argc") result(argc) > + end function > + end interface > +contains > + function ec_argv(iarg) result(argv) > + use, intrinsic :: iso_c_binding > + character(len=:), allocatable :: argv > + type(c_ptr), pointer :: argv_cptrs(:) > + argv = to_string (argv_cptrs(iarg+1), 1024) > + end function > + subroutine ec_args() > + use, intrinsic :: iso_c_binding > + integer :: argc > + type(c_ptr) :: argv(512) > + if (ec_argc() == 0) then > + call read_command_line(argc,argv) > + end if > + end subroutine > + function to_string(cptr,maxlen) result(string) > + use, intrinsic :: iso_c_binding > + character(len=:), allocatable :: string > + type(c_ptr) :: cptr > + character(kind=c_char,len=1), pointer :: s(:) > + call c_f_pointer (cptr, s, (/maxlen/)) > + do > + if (s(i) == c_null_char) exit > + i = i + 1 > + end do > + nchars = i - 1 > + allocate (character(len=(nchars)) :: string) > + do i=1,nchars > + string(i:i) = s(i) > + end do > + end function > + subroutine read_command_line(argc,argv) > + use, intrinsic :: iso_c_binding > + integer, parameter :: cmd_max_len = 1024 * 512 > + integer(c_int) :: argc > + type(c_ptr) :: argv(:) > + character(kind=c_char,len=1), save, target :: args(cmd_max_len) > + character(kind=c_char,len=cmd_max_len), save, target :: cmd > + character(kind=c_char,len=cmd_max_len) :: arg > + integer(c_int) :: iarg, arglen, pos, ich, argpos > + do ich=1,len(cmd) > + if (cmd(ich:ich) == " ") then > + cmd(ich:ich) = c_null_char > + end if > + end do > + do iarg=1,argc > + do ich=1,arglen > + args(pos) = arg(ich:ich) > + end do > + args(pos) = c_null_char; pos = pos+1 > + argv(iarg+1) = c_loc(args(argpos)) > + end do > + end subroutine > +end module > +module mpl_mpif > + integer mpi_status_size > +end module mpl_mpif > +subroutine ec_meminfo(ku,cdstring,kcomm,kbarr,kiotask,kcall) > + use mpl_mpif > + interface > + subroutine ec_pmon(energy,power) > + end subroutine ec_pmon > + end interface > + character(len=*), intent(in) :: cdstring > + integer :: ii,jj,i,j,k,myproc,nproc,len,error,nodenum,jid > + integer :: tasksmall,nodehuge,memfree,cached,nfree > + integer :: nnuma > + integer,dimension(:),allocatable,save :: smallpage,hugepage > + integer :: n18 > + integer,dimension(:,:),allocatable,save :: node, bucket > + character(len=256) :: clstr > + character(len=20) :: nodename,lastnode,clmaxnode > + character(len=160) ::line > + character(len=5+1+len(cdstring)) :: id_string > + integer :: irecv_status(mpi_status_size) > + logical :: llnocomm, llnohdr > + logical, save :: llfirst_time = .true. > + type ranknode_t > + integer :: nodenum > + integer :: pid > + integer :: rank_world > + integer :: rank > + integer :: iorank > + integer :: nodemaster > + integer, allocatable :: coreids(:) > + character(len=len(clstr)) :: str > + end type > + type (ranknode_t), allocatable, save :: rn(:) > + integer, allocatable :: coreids(:) > + character(len=64) :: clpfx > + if (llfirst_time .and. .not. llnocomm) then > + allocate(coreids(0:maxth-1)) > + coreids(:) = -1 > +!$omp parallel num_threads(maxth) shared(coreids) private(i,myth,icoreid) > + do i=1,maxth > + icoreid = ec_coreid() > + myth = omp_get_thread_num() > + coreids(myth) = icoreid > + end do > +!$omp end parallel > + if (myproc == 0) then > + call slash_proc > + allocate(rn(0:nproc-1)) > + do i=0,nproc-1 > + rn(i)%nodenum = -1 > + if (i > 0) then > + call mpi_recv(lastnode, len(lastnode), mpi_byte, i, itag, kcomm, > irecv_status, error) > + call check_error("from > mpi_recv(lastnode)","/tmp/fiat/src/fiat/util/ec_meminfo.f90",258) > + call mpi_comm_rank(mpi_comm_world,k,error) > + rn(i)%rank = 0 > + rn(i)%str = cdstring > + rn(i)%pid = ec_getpid() > + end if > + rn(i)%rank_world = k > + rn(i)%iorank = iorank > + rn(i)%nodemaster = 0 > + call check_error("from > mpi_send(iam_nodemaster)","/tmp/fiat/src/fiat/util/ec_meminfo.f90",305) > + end do > + call mpi_send(clstr,len(clstr),mpi_byte,0,itag+5,kcomm,error) > + call mpi_send(clstr,maxth,mpi_integer4,0,itag+6,kcomm,error) > + call > mpi_recv(lastnode,1,mpi_integer4,0,itag+7,kcomm,irecv_status,error) > + end if > + end if > +contains > + subroutine slash_proc > + read(line(idx+iclkeylen-1:),*,err=99,end=98) node(:,0) > +98 continue > + do k=1,maxnuma-1 > + read(line(idx+iclkeylen-1:),*,err=99) node(0:n18-1,k) > + end do > +99 continue > + close(502) > + smallpage(:) = 0 > + do k=0,nnuma-1 > + do j=0,n18-1 > + end do > + smallpage(k) = sum(bucket(0:8,k))/onemega > + hugepage(k) = sum(bucket(9:n18-1,k))/onemega > + end do > + open(file="/proc/meminfo",unit=502,status="old",action="read",err=977) > + do i=1,10 > + read(502,'(a)',err=988,end=988) line > + if(line(1:7) == "memfree") then > + read(line(9:80),*) memfree > + else if(line(1:6) == "cached") then > + read(line(8:80),*) cached > + end if > + end do > +988 continue > + close(502) > +977 continue > + memfree=memfree/1024 > + end subroutine slash_proc > + subroutine prt_data(kun,knodenum,cdlastnode,kcall) > + character(len=4096) :: clbuf > + write(clbuf(ilen+1:),'(2x,2i8,3x,2f6.1,1x,i9,1x,i6,1x,a7,1x,a)') > trim(id_string) > + end subroutine prt_data > + subroutine check_error(clwhat,srcfile,srcline) > + character(len=*), intent(in) :: clwhat, srcfile > + integer, intent(in) :: srcline > + if (error /= 0) then > + write(0,'(a,i0,1x,a,1x,"(",a,":",i0,")")') & > + & clpfx(1:ipfxlen)//"## ec_meminfo error code > =",error,clwhat,srcfile,srcline > + call mpi_abort(kcomm,-1,error) > + end if > + end subroutine check_error > + subroutine rnsort(kun) > + do i=0,nproc-1 > + end do > + end subroutine rnsort > +end subroutine ec_meminfo > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)