cvsuser     03/01/10 07:35:51

  Added:       docs/dev infant.dev
  Log:
  Place to hold discussion of infant mortality: the various approaches,
  the tradeoffs, what we've considered and what we've discarded and why.
  
  Revision  Changes    Path
  1.1                  parrot/docs/dev/infant.dev
  
  Index: infant.dev
  ===================================================================
  =head1 TITLE
  
  Infant Mortality
  
  =head1 Overview
  
  We have a garbage collector that needs to detect all dead objects (the
  process is called DOD, for Dead Object Detection). Any Parrot
  operation that might allocate memory can potentially trigger a DOD run
  (if not enough memory is available, we need to free up some unused
  objects to give us enough room to allocate our new object. But in
  order to know what can be freed up, we need to know what's alive, so
  we do a DOD pass.)
  
  The DOD pass begins with a "root set" of objects that are known to be
  in use -- the PMC and String registers, the Parrot stack, the global
  symbol table, etc. Each of these objects is scanned for references to
  other objects, recursively. All objects found during this search are
  regarded as live. Everything else is considered dead and so is
  available for reclamation.
  
  The question of what should be in the root set is problematic.
  Consider the case where you've created a PMC. If you immediately stuff
  it into a register, then you're all set. But what if you're putting it
  into an aggregate? If the aggregate isn't large enough, you'll need to
  resize it, which might allocate memory, which might trigger a DOD run.
  And that DOD run may free your freshly-created PMC.
  
  In this case, the solution is simple: resize the array, if necessary,
  before creating the PMC. But many situations are not so simple, and
  the set of operations which could conceivably require more memory is
  vast. This problem is referred to as the "infant mortality" problem.
  
  Peter Gibbs, Mike Lambert, Dan Sugalski, and others have considered
  various solutions to this problem. Most of those solutions have been
  implemented in some form or other. This document is my attempt to
  compare and contrast all such proposals.
  
  =head1 Solution 1: Stack Walking
  
  One possible solution is to add the C stack into the root set. This
  way, temporary PMCs that have not yet been anchored to the root set
  will be found on the stack and treated as live. Actually, the C stack
  is insufficient -- you must also scan through all processor registers,
  and for some processors there may be a separate backing store for
  those registers (eg the Sparc's register windows).
  
   + No slowdown in the common case of no DOD
   + Convenient: the programmer does not need to code differently to
     accommodate the garbage collection system.
   - Unportable. Different processors need different code for scanning
     their registers, stacks, register windows, etc. Also, on some
     architectures you must scan every possible offset into the stack to
     find all the pointers, while on others you must NOT scan every
     possible offset or you'll get a bus error due to a misaligned
     access.
   - Slow DOD. A full stack walk takes quite a while, and this time
     grows with nested interpreter calls, external code's stack usage, etc.
   - Complex. The stack walk is necessarily conservative, in that it
     must consider every valid pointer on the stack to potentially be a
     traceable object. But some of those pointers may be stale, in which
     case the memory they point to may have been partially reused for
     some other purpose. Everything must operate within certain
     constraints that guarantee that no invalid pointers will be
     dereferenced and trigger a segmentation fault or bus error.
   - Another side effect of the conservative nature of stack walking is
     that the memory for these objects may never be returned to the
     system, because it is always possible that there will be a stale
     pointer lying around on the stack or in a register, and all such
     pointers will be dereferenced.
  
  =head1 Solution 2: Neonate flag
  
  There are several possible variants of the neonate flag, but the basic
  idea is to set a flag on newly created objects that prevents them from
  being collected. At some point, this flag is cleared -- either as the
  newborn object is anchored to the root set, or during the first DOD
  pass after it was anchored, or explicitly when the object is no longer
  needed.
  
   + Portable
   + Can return memory to the system when unneeded
   + Exact: the state of every object is always known precisely (no more
     "this object MIGHT still be reachable")
   - The coder must remember to clear the flag before discarding
     unanchored objects
   - The flag-clearing takes a small amount of time
   - For some variants of this scheme, some time is consumed to clear
     the flag for the common case of rooted objects
  
  =head2 Subspecies of neonate flags
  
  The variants of the neonate idea all hinge on exactly how and when the
  flag is cleared.
  
  =head2 Variant 1: explicit
  
  The flag is always explicitly cleared by the coder.
  
   + Very simple
   + Fast DOD
   - Slow for unanchored temporaries
   - Slow for anchored objects
   - Easy to forget to clear the flag for unanchored temporaries
   - Easy to forget to clear the flag for anchored objects
   - longjmp() can bypass the clearing
  
  =head2 Variant 2: explicit for temporaries, cleared during anchoring
  
  The flag is explicitly set for temporaries. All routines which anchor
  an object to the root set also clear the flag.
  
   + Simple
   + Fast DOD
   - Slow for unanchored temporaries
   - Slow for anchored objects
   - Easy to forget to clear the flag for unanchored temporaries
   - Forces all anchoring operations to set the flag (so this disallows
     direct assignment into a PMC register, for example)
   - longjmp() can bypass the clearing
  
  =head2 Variant 3: clear during DOD
  
  The neonate flag is cleared during DOD when an object is encountered
  during the recursive root set traversal. (Leopold Toetsch's trick of
  setting the live_FLAG during creation is equivalent to this variation,
  I think.)
  
   + Simple
   + Fast DOD (DOD already manipulates the flags)
   - If there are multiple DOD runs before the object is anchored or
     dies, it will be prematurely freed
  
  =head2 Variant 4: generation count
  
  This is the same as variant 3, except a "generation count" is
  maintained in the interpreter so that the neonate flag is only cleared
  during a later generation. The generation is incremented only at major
  control points such between opcodes, so that there is no chance of
  unanchored temporaries.
  
   + Fast DOD (DOD already manipulates the flags)
   - Generation count must be maintained
   - Disallows recursive opcode calls (necessary for eg implementing
     vtable methods in pasm)
   - Can temporarily use more memory (dead objects accumulate during the
     current generation)
  
  In order to allow recursive opcode calls, we could increment the
  generation count in more places and make sure nothing is left
  unanchored at those points, but that would gradually remove all
  advantages of this scheme and make it more difficult to call existing
  vtable methods (since you never know when they might start running
  pasm code.)
  
  =head2 Variant 5: generation stack
  
  Notice that when using a generational count, you really only need to
  test whether the current generation is _different_ from an object's
  creation generation (which eliminates wraparound problems, too.) So
  rather than testing against a single "current" generation, allow a
  stack of multiple "current" generations. An object encountered during
  DOD will have its neonate flag cleared only if it doesn't match any of
  the "current" generation ids. This check can be optimized using a
  conservative bit mask as a preliminary test.
  
   + Still faster DOD than stackwalking, though slower than the
     other neonate variants
   - Generation count must be maintained
   - Generation stack must be maintained
   - Disallows longjmp()'ing out of recursive opcode calls
   - Can temporarily use more memory (dead objects accumulate during all
     current generations)
  
  =head1 Solution 3: Explicit root set augmentation
  
  A final possible solution is to provide a mechanism to temporarily
  anchor an otherwise unanchored object to the root set. (eg, have an
  array of objects associated with the interpreter that are all
  considered to be part of the root set.) This has pretty much the same
  advantages and disadvantages of explicit neonate flag setting:
  
   + Simple
   + Fast DOD
   - Slow for unanchored temporaries
   - Sometimes slow for anchored objects (depending on whether they need
     to be temporarily anchored before the final anchoring)
   - Easy to forget to remove temporaries from the root set
   - Easy to double-anchor objects and forget to remove the temporary
     anchoring
   - longjmp() can bypass the unanchoring
  
  Many of the same or similar variations also apply: objects could be
  automatically removed from the temporary anchoring at generation
  boundaries, etc.
  
  =head1 REFERENCES
  
  =over 4
  
  =item What is neonate?
  
  http://groups.google.com/groups?th=468fc4aebca262f7
  
  Brent Dax's better description of the problem than I have here
  
  =item Mike Lambert proposing Variant 1
  
  http://groups.google.com/groups?th=b2c1aebf64d6ed9a
  
  This also has some macro-heavy proposals that I ignored.
  
  =item Leopold Toetsch proposing Variant 3
  
  http://groups.google.com/groups?th=dc51f11f441bc7d0
  
  Also includes Steve Fink proposing Variant 1
  
  =item Dan Sugalski proposing Variant 3
  
  http://groups.google.com/groups?th=da3012ceb99bab3c
  
  =item Peter Gibbs implementing Variant 4 and getting shot down
  
  http://groups.google.com/groups?th=d2cd475367fc81aa
  
  =item General discussion kicked off by this document
  
  http://groups.google.com/groups?th=66fe6f12e11a5f8d
  
  =item Dan thinks the stackwalk is unavoidable
  
  http://groups.google.com/groups?th=f7e270609ef93161
  
  =item Infant mortality pain
  
  http://groups.google.com/groups?th=ad045a1baeba0c9a
  
  This is a good thread for illustrating the pain that infant mortality
  causes -- in the context of Parrot, I mean.
  
  =item Numbers!
  
  http://groups.google.com/groups?th=d7cd4ca31dcb4414
  
  Gives some benchmark numbers for different approaches.
  
  =item Generational stuff
  
  http://groups.google.com/groups?th=808f38c656a49806
  
  Early discussion that has some stuff I didn't go over here. Mostly
  involves generational schemes.
  
  =back
  
  =head1 CHANGES
  
  2002-Dec-30: Initial Version by Steve Fink
  
  
  


Reply via email to