On Thu, Dec 23, 2010 at 20:30, Dima Tisnek <[email protected]> wrote:
> Basically collecting this is hard:
>
> dict(a=range(9**9))
>
> large list is referenced, the object that holds the only reference is
> small no matter how you look at it.

First, usually (in most GC-ed languages) you can collect the list
before the dict. In PyPy, if finalizers are involved (is this the case
here? That'd be surprising), this is no longer true.
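To make the quoted example concrete, here is a minimal sketch of the
"small holder, large referent" situation. It assumes CPython 3, where
sys.getsizeof is available (PyPy may not support it), and it uses
list(range(10**5)) as a stand-in for the Python 2 range(9**9) so that it
runs quickly. The helper name shallow_and_held_size is hypothetical, and
nothing here reflects how any collector actually decides what to free.

```python
import sys


def shallow_and_held_size(mapping):
    """Return (shallow size of the dict, shallow size plus its direct values)."""
    shallow = sys.getsizeof(mapping)
    # Only direct values are counted; objects nested deeper are ignored.
    held = sum(sys.getsizeof(value) for value in mapping.values())
    return shallow, shallow + held


# Assumption: a smaller list than in the quoted example, built eagerly
# because range() is lazy in Python 3.
d = dict(a=list(range(10**5)))
shallow, total = shallow_and_held_size(d)
print("dict alone:", shallow, "bytes; dict plus list:", total, "bytes")
```

The dict's own size stays tiny while the list it keeps alive dominates,
which is why the holder's size says little about how much memory
becomes free once it is unreachable.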

However, object size is not the point. For standard algorithms, the
size of an object does not matter at all in deciding when it is
collected. I already discussed this in my other email in this thread,
where I noted what could actually happen in the examples described by
Armin, and your examples show that it is a good property. A large
object in the same heap can fill it up and trigger an earlier garbage
collection.

In general, if GC ran in the background (but it usually doesn't, and
not in PyPy), it could make sense to free objects sooner or later,
depending not on object size, but on "how much memory would be
'indirectly freed' by freeing this object". However, because of
sharing, answering this question is too complex (it requires
collecting data from the whole heap). Moreover, the whole thing makes
no sense at all with the usual stop-the-world collectors: the app is
stopped, then the whole young generation, or the whole heap, is
collected, then the app is resumed.

When separate heaps are involved (such as with ctypes, or with Large
Object Spaces, which avoid using a copying collector for large
objects), it is more complicated to ensure that the same property
holds: you need to consider the stats of all heaps to decide whether
to trigger GC.

> I guess it gets harder still if there are many small live objects, as
> getting to this dict takes a while
> (easier in this simple case with generational collector, O(n) in general
> case)

Not sure what you mean; I can make sense of it (and not fully) only
with an incremental collector, and those are still seldom used (in
particular, not in PyPy).

Best regards

> On 23 December 2010 06:38, Armin Rigo <[email protected]> wrote:
>> Hi René,
>>
>> On Thu, Dec 23, 2010 at 2:33 PM, René Dudfield <[email protected]> wrote:
>>> I think this is a case where the object returned by
>>> ctypes.create_string_buffer() could use a correct __sizeof__ method
>>> return value. If pypy supported that, then the GCs could support
>>> extensions, and 'opaque' data structures in C too a little more
>>> nicely.
>>
>> I think you are confusing levels. There is no way the GC can call
>> some app-level Python method to get information about the objects it
>> frees (and when would it even call it?). Remember that our GC is
>> written at a level where it works for any interpreter for any
>> language, not just Python.
>>
>>
>> A bientôt,
>>
>> Armin.
>> _______________________________________________
>> [email protected]
>> http://codespeak.net/mailman/listinfo/pypy-dev

-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/
