Antoine Pitrou added the comment:

On 25/09/2017 at 20:55, Neil Schemenauer wrote:
> 
> I think the basic idea makes a lot of sense, i.e. have a generation that is 
> never collected.  An alternative way to implement it would be to have an 
> extra generation, e.g. rather than just 0, 1, 2 also have generation 3.  The 
> collection would by default never collect generation 3.  Generation 3 would 
> be equivalent to the frozen generation.  You could still force collection by 
> calling gc.collect(3).

API-wise it would sound better to have a separate gc.collect_frozen()...

Though I think a gc.unfreeze() that moves the frozen generation into the
oldest non-frozen generation would be useful too, at least for testing
and experimentation.
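
For experimentation, a freeze/unfreeze round trip could look like the
sketch below.  It assumes a Python whose gc module provides freeze(),
unfreeze() and get_freeze_count() (these appeared in Python 3.7); the
variable names are only illustrative.

```python
import gc

# Clean up existing garbage first so that it is not frozen along with
# the live objects.
gc.collect()

# Move every currently tracked object into the frozen (permanent)
# generation, which later collections ignore.
gc.freeze()
frozen = gc.get_freeze_count()

# Move the frozen objects back into the oldest regular generation.
gc.unfreeze()
remaining = gc.get_freeze_count()

print(f"frozen: {frozen}, after unfreeze: {remaining}")
```

The count after unfreeze() should be zero, since nothing stays in the
frozen generation.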

> I think issue 31105 (https://bugs.python.org/issue31105) is also related.  The 
> current GC thresholds are not very good.  I've looked at what Go does and the 
> GC collection is based on a relative increase in memory usage.  Python could 
> do perhaps something similar.  The accounting of actual bytes allocated and 
> deallocated is tricky because the *_Del/Free functions don't actually know 
> how much memory is being freed, at least not in a simple way.

Yeah... It's worse than that.  Take for example a bytearray object.  The
basic object (the PyByteArrayObject structure) is quite small.  But it
also has a separately-allocated payload that is deleted whenever
tp_dealloc is called.  The GC isn't aware of that payload.  Worse, the
payload can (and will) change size during the object's lifetime, without
the GC's knowledge about it ever being updated. (*)
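
A small sketch of the bytearray case, using sys.getsizeof(), which for
a bytearray reports the base object plus the currently allocated
buffer.  The names and the payload size are only illustrative.

```python
import sys

buf = bytearray()
small = sys.getsizeof(buf)      # base object, no payload allocated yet

# Growing the bytearray reallocates its separately-allocated payload.
payload = 1_000_000
buf.extend(b"x" * payload)
large = sys.getsizeof(buf)      # base object plus the grown payload

print(f"empty: {small} bytes, after extend: {large} bytes")
```

The object is the same one throughout; only its payload has grown, and
nothing in this sequence notifies the GC of the change.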

IMHO, the only reliable way to use memory footprint to drive the GC
heuristic would be to force all allocations into our own allocator, and
reconcile the GC with that allocator (instead of having the GC be its
own separate thing as is the case nowadays).
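
As a toy illustration of a "relative increase" heuristic, here is a
sketch of the decision rule alone.  Everything in it is hypothetical:
the class, its method names and the growth factor are made up for
illustration, and it assumes the byte counts are supplied by an
allocator that can report them reliably, which is exactly the hard
part.

```python
class RelativeGrowthTrigger:
    """Hypothetical policy: collect once the footprint has grown by
    `growth` relative to what was live after the previous collection."""

    def __init__(self, baseline_bytes, growth=1.0):
        self.baseline = baseline_bytes
        self.growth = growth

    def should_collect(self, current_bytes):
        # growth=1.0 means "collect when the footprint has doubled".
        return current_bytes >= self.baseline * (1.0 + self.growth)

    def collected(self, live_bytes):
        # After a collection, the surviving footprint is the new baseline.
        self.baseline = live_bytes


trigger = RelativeGrowthTrigger(baseline_bytes=100_000_000)
decisions = [trigger.should_collect(n) for n in (150_000_000, 200_000_000)]

trigger.collected(live_bytes=120_000_000)
after_reset = trigger.should_collect(200_000_000)
```

With a 100 MB baseline the trigger fires at 200 MB but not at 150 MB;
once 120 MB survive a collection, 200 MB no longer triggers one.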

(*) And let's not talk about hairier cases, such as having multiple
memoryviews over the same very large object...

PS: every heuristic has its flaws.  As I noted on python-(dev|ideas),
full GC runtimes such as most Java implementations are well-known for
requiring careful tuning of GC parameters for "non-usual" workloads.  At
least reference counting makes CPython more robust in many cases.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31558>
_______________________________________