Hi,

On Mon, Dec 10, 2012 at 07:25:34PM +0100, Steven Bosscher wrote:
> Hello,
> 
> While trying to bootstrap with GCAC checking enabled and some
> instrumentation to measure how often objects are being marked, I
> noticed that a lot of cache misses happen because already-marked
> objects are being tested again (e.g. via ggc_marked_p). This struck me
> as rather odd, until I looked at what the GC markers for my favorite
> data structures (the CFG) look like.
> 
> Basic blocks live in GC space because they can point to other GC-space
> objects (rtx, gimple), and these objects can point back to the basic
> block they're contained in. In addition, the (function-specific)
> basic_block_info array points to all live basic blocks and of course
> the edges have basic_block pointers as well. Finally, loop have
> pointers to the loop header and latch basic_blocks.
> 
> The effect is that, on average, ggc_marked_p or one of its variants
> gets called 17 times on each basic_block, *per ggc_collect call*!
> 
> This turns out to be typical for many data structures. It's more
> difficult to get accurate numbers, but GCAC is especially slow when
> building a large PCH and I'm guessing that this is in part due to
> tree-to-tree pointers.
> 
> This all made me wonder why we can't use the known hierarchy of the
> intermediate representations. Ideally, there should be only two
> classes of objects: global state variables (not ideal, but a reality)
> and objects reachable from the symbol table. AFAICT anything that
> isn't in one of those two categories has probably become garbage
> somewhere along the way.

some IPA passes do have on-the side vectors with their information
about each cgraph node or edge and those are independent GC roots.
Not all, but many (e.g. inline_summary_vec or ipa_edge_args_vector) do
have pointers to other GC data, usually trees, and thus are mananged
by GC too.  Many of those trees (e.g. constants) might not be
reachable at least in LTO WPA phase.  Sure enough, inventing something
more clever for them might be a good idea.

Thanks,

Martin


> 
> As an experiment, I did a small modification to the GC walkers of the
> CFG. I'd have started with the symbol table, but there's no
> documentation and I don't know how memory is managed or what
> invariants should hold (e.g. "everything that will be output must be
> reachable via the symbol table"). Modifying the CFG was easier.
> 
> The effect for building my set of cc1-i files with a GCAC-checking
> enabled compiler is a ~20% speedup.
> 
> It seems to be that this hierarchy could also be useful if/when GCC
> will move away from generated walkers (i.e. gengtype). If we can
> somehow enforce a hierarchy, the markers become much simpler. Also,
> the hierarchy could be used to verify "no leaks" from alloc pools, and
> probably for other things I can't think of right now.
> 
> Honza, can you please add some documentation about the symtab? What do
> you think of the above from your symtab point of view?
> 
> Another "problem" with gengtype is that it doesn't know what types can
> end up in a PCH. The CFG data structures can *never* be in a PCH, but
> there still are PCH writer functions. This is true for many other data
> structures as well. Has anyone ever measured/investigated what PCH
> writer functions are actually used? Should we (as an intermediate step
> towards streaming PCHs) explicitly GTY-mark types that should have PCH
> writer functions, and not generate such functions for unmarked types?
> 
> Ciao!
> Steven


Reply via email to