Author: Antonio Cuni <anto.c...@gmail.com>
Branch: gc-hooks
Changeset: r94358:3fe7e9dc4d49
Date: 2018-04-17 11:22 +0200
http://bitbucket.org/pypy/pypy/changeset/3fe7e9dc4d49/

Log:    add docs about GC hooks

diff --git a/pypy/doc/gc_info.rst b/pypy/doc/gc_info.rst
--- a/pypy/doc/gc_info.rst
+++ b/pypy/doc/gc_info.rst
@@ -121,6 +121,160 @@
   alive by GC objects, but not accounted in the GC
 
 
+GC Hooks
+--------
+
+GC hooks are user-defined functions which are called whenever a specific GC
+event occurs, and can be used to monitor GC activity and pauses.  You can
+install the hooks by setting the following attributes:
+
+``gc.hooks.on_gc_minor``
+    Called whenever a minor collection occurs. It corresponds to
+    ``gc-minor`` sections inside ``PYPYLOG``.
+
+``gc.hooks.on_gc_collect_step``
+    Called whenever an incremental step of a major collection occurs. It
+    corresponds to ``gc-collect-step`` sections inside ``PYPYLOG``.
+
+``gc.hooks.on_gc_collect``
+    Called after the last incremental step, when a major collection is fully
+    done. It corresponds to ``gc-collect-done`` sections inside ``PYPYLOG``.
+
+To uninstall a hook, simply set the corresponding attribute to ``None``.  To
+install all hooks at once, you can call ``gc.hooks.set(obj)``, which will look
+for methods ``on_gc_*`` on ``obj``.  To uninstall all the hooks at once, you
+can call ``gc.hooks.reset()``.
+
+The functions called by the hooks receive a single ``stats`` argument, which
+contains various statistics about the event.
+
+Note that PyPy cannot call the hooks immediately after a GC event, but it has
+to wait until it reaches a point in which the interpreter is in a known state
+and calling user-defined code is harmless.  It might happen that multiple
+events occur before the hook is invoked: in this case, you can inspect the
+value ``stats.count`` to know how many times the event occurred since the last
+time the hook was called.  Similarly, ``stats.duration`` contains the
+**total** time spent by the GC for this specific event since the last time the
+hook was called.
+
+On the other hand, all the other fields of the ``stats`` object are relative
+only to the **last** event of the series.
+
+The attributes for ``GcMinorStats`` are:
+
+``count``
+    The number of minor collections that occurred since the last hook call.
+
+``duration``
+    The total time spent inside minor collections since the last hook
+    call. See below for more information on the unit.
+
+``total_memory_used``
+    The amount of memory used at the end of the minor collection, in
+    bytes. This includes the memory used in arenas (for GC-managed memory) and
+    raw-malloced memory (e.g., the content of numpy arrays).
+
+``pinned_objects``
+    The number of pinned objects.
+
+
+The attributes for ``GcCollectStepStats`` are:
+
+``count``, ``duration``
+    See above.
+
+``oldstate``, ``newstate``
+    Integers which indicate the state of the GC before and after the step.
+
+The value of ``oldstate`` and ``newstate`` is one of these constants, defined
+inside ``gc.GcCollectStepStats``: ``STATE_SCANNING``, ``STATE_MARKING``,
+``STATE_SWEEPING``, ``STATE_FINALIZING``.  It is possible to get a string
+representation of a state by indexing the ``GC_STATES`` tuple.
+
+
+The attributes for ``GcCollectStats`` are:
+
+``count``
+    See above.
+
+``num_major_collects``
+    The total number of major collections which have been done since the
+    start. Unlike ``count``, this is an always-growing counter and it is
+    not reset between invocations.
+
+``arenas_count_before``, ``arenas_count_after``
+    Number of arenas used before and after the major collection.
+
+``arenas_bytes``
+    Total number of bytes used by GC-managed objects.
+
+``rawmalloc_bytes_before``, ``rawmalloc_bytes_after``
+    Total number of bytes used by raw-malloced objects, before and after the
+    major collection.
+
+Note that ``GcCollectStats`` does **not** have a ``duration`` field. This is
+because all the GC work is done inside ``gc-collect-step``:
+``gc-collect-done`` is used only to give additional stats, but doesn't do any
+actual work.
+
+A note about the ``duration`` field: depending on the architecture and
+operating system, PyPy uses different ways to read timestamps, so ``duration``
+is expressed in varying units. It is possible to know which by calling
+``__pypy__.debug_get_timestamp_unit()``, which returns one of the following
+values:
+
+``tsc``
+    The default on ``x86`` machines: timestamps are expressed in CPU ticks, as
+    read by the `Time Stamp Counter`_.
+
+``ns``
+    Timestamps are expressed in nanoseconds.
+
+``QueryPerformanceCounter``
+    On Windows, when ``tsc`` is not available for some reason: timestamps
+    are read using the Windows API ``QueryPerformanceCounter()``.
+
+
+Unfortunately, there does not seem to be a reliable standard way for
+converting ``tsc`` ticks into nanoseconds, although in practice on modern CPUs
+it is enough to divide the ticks by the maximum nominal frequency of the CPU.
+For this reason, PyPy gives the raw value, and leaves the job of doing the
+conversion to external libraries.
+
+Here is an example of GC hooks in use::
+
+    import sys
+    import gc
+
+    class MyHooks(object):
+        done = False
+
+        def on_gc_minor(self, stats):
+            print 'gc-minor:        count = %02d, duration = %d' % (stats.count,
+                                                                    stats.duration)
+
+        def on_gc_collect_step(self, stats):
+            old = gc.GcCollectStepStats.GC_STATES[stats.oldstate]
+            new = gc.GcCollectStepStats.GC_STATES[stats.newstate]
+            print 'gc-collect-step: %s --> %s' % (old, new)
+            print '                 count = %02d, duration = %d' % (stats.count,
+                                                                    stats.duration)
+
+        def on_gc_collect(self, stats):
+            print 'gc-collect-done: count = %02d' % stats.count
+            self.done = True
+
+    hooks = MyHooks()
+    gc.hooks.set(hooks)
+
+    # simulate some GC activity
+    lst = []
+    while not hooks.done:
+        lst = [lst, 1, 2, 3]
+
+
+.. _`Time Stamp Counter`: https://en.wikipedia.org/wiki/Time_Stamp_Counter    
+    
 .. _minimark-environment-variables:
 
 Environment variables
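
A note on the hooks documented in the diff above: because ``stats.count`` and
``stats.duration`` are totals since the previous hook call, a monitoring object
should add them up rather than overwrite them. The following is a minimal
sketch of that idea, not code from this changeset. ``DurationTracker``,
``ticks_to_seconds``, ``FakeStats`` and the ``cpu_hz`` parameter are
hypothetical names introduced for illustration; ``FakeStats`` only stands in
for the real stats objects so that the sketch runs outside PyPy. The ``tsc``
conversion follows the rule of thumb from the docs (divide by the nominal CPU
frequency) and is an approximation.

```python
class DurationTracker(object):
    """Hypothetical helper (not part of PyPy): accumulates GC hook stats."""

    def __init__(self):
        self.minor_count = 0
        self.minor_duration = 0
        self.step_duration = 0
        self.major_count = 0

    def on_gc_minor(self, stats):
        # count and duration cover every event since the previous hook
        # call, so they are summed instead of stored as-is.
        self.minor_count += stats.count
        self.minor_duration += stats.duration

    def on_gc_collect_step(self, stats):
        self.step_duration += stats.duration

    def on_gc_collect(self, stats):
        # GcCollectStats has no duration field, only count is accumulated.
        self.major_count += stats.count


def ticks_to_seconds(value, unit, cpu_hz=None):
    """Convert a raw duration to seconds for the 'ns' and 'tsc' units.

    cpu_hz is an assumption supplied by the caller: the nominal CPU
    frequency in Hz, needed only when unit is 'tsc'.
    """
    if unit == 'ns':
        return value / 1e9
    if unit == 'tsc':
        if cpu_hz is None:
            raise ValueError("cpu_hz is required to convert tsc ticks")
        return value / float(cpu_hz)
    # Other units (e.g. QueryPerformanceCounter) are not handled here.
    raise ValueError("unsupported timestamp unit: %r" % (unit,))


class FakeStats(object):
    """Stand-in for a stats object, so the sketch runs without PyPy."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


tracker = DurationTracker()
tracker.on_gc_minor(FakeStats(count=3, duration=1500))
tracker.on_gc_minor(FakeStats(count=1, duration=500))
tracker.on_gc_collect_step(FakeStats(count=2, duration=4000))
tracker.on_gc_collect(FakeStats(count=1))
```

On PyPy, the tracker would be installed with ``gc.hooks.set(tracker)``, and the
unit passed to ``ticks_to_seconds`` would come from
``__pypy__.debug_get_timestamp_unit()``.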
