New submission from Łukasz Langa:

When you're forking many worker processes off of a parent process, the 
resulting children are initially very cheap in memory.  They share memory pages 
with the base process until a write happens [1]_.

Sadly, the garbage collector in Python touches every object's PyGC_Head during 
a collection, even if that object stays alive, undoing all the copy-on-write 
wins.  Instagram disabled the GC completely for this reason [2]_.  This fixed 
the COW issue but made the processes more vulnerable to memory growth from new 
reference cycles silently introduced as developers changed the application 
code.  While we could fix the most glaring cases, it was hard to keep memory 
usage at bay.  We came up with a different solution that fixes both issues.  
It requires a new API to be added to CPython's garbage collector.


gc.freeze()
-----------

As soon as possible in the lifecycle of the parent process we disable the 
garbage collector.  Then we call the proposed `gc.freeze()` API to move all 
currently tracked objects to a permanent generation.  They won't be considered 
in further collections.  This is okay since we are assuming that (almost?) all 
of the objects created until that point are module-level and thus useful for 
the entire lifecycle of the child process.

After calling `gc.freeze()` we call fork. Then, the child process is free to 
re-enable the garbage collector.
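
For illustration, here's a minimal sketch of how a prefork server could put 
the proposed API together (`serve_requests()` is a hypothetical worker loop, 
not part of this patch)::

    import gc
    import os

    def prefork_setup():
        # Disable the collector as early as possible so it doesn't free
        # slots in pages that will later be shared with the children.
        gc.disable()

        # ... import application modules, warm caches, build global state ...

        # Move everything tracked so far into the permanent generation;
        # subsequent collections skip these objects entirely.
        gc.freeze()

    def spawn_worker():
        pid = os.fork()
        if pid == 0:
            # Child: the module-level objects are frozen, so re-enabling
            # the collector no longer touches their PyGC_Head and the
            # shared pages stay copy-on-write friendly.
            gc.enable()
            serve_requests()  # hypothetical worker loop
            os._exit(0)
        return pid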

Why do we need to disable the collector in the parent process as soon as 
possible?  When the GC cleans up memory in the meantime, it leaves free slots 
in pages for new objects.  Those pages become shared after fork, and as soon 
as the child process starts creating its own objects, those objects will 
likely be allocated into the shared pages, triggering a lot of copy-on-write 
activity.

In other words, we're wasting a bit of memory in the shared pages to save a lot 
of memory later (that would otherwise be wasted on copying entire pages after 
forking).


Other attempts
--------------

We also tried moving the GC head to a separate location in memory.  This 
creates some indirection, but cache locality in that segment is great, so 
performance isn't really hurt.  However, the change introduces two new 
pointers (16 bytes) per object.  That doesn't sound like much, but given 
millions of objects and tens of processes per box, this alone can cost 
hundreds of megabytes per host, which is exactly the memory we wanted to save 
in the first place.  So that idea was scrapped.
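
As a rough back-of-the-envelope illustration (the object and process counts 
below are assumed for the sake of the example, not measured figures)::

    # Assumed figures, for illustration only.
    bytes_per_object = 16            # two extra pointers on a 64-bit build
    objects_per_process = 1_000_000
    processes_per_host = 20

    overhead = bytes_per_object * objects_per_process * processes_per_host
    print(overhead // 2**20)         # ~305 MiB of extra memory per host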


Attribution
-----------

The original patch is by Zekun Li, with help from Jiahao Li, Matt Page, David 
Callahan, Carl S. Shapiro, and Chenyang Wu.


.. [1] https://en.wikipedia.org/wiki/Copy-on-write
.. [2] https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

----------
keywords: needs review
messages: 302780
nosy: haypo, lukasz.langa, nascheme, yselivanov
priority: normal
severity: normal
stage: patch review
status: open
title: gc.freeze() - an API to mark objects as uncollectable
type: resource usage
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31558>
_______________________________________