> -----Original Message-----
> From: [email protected] [mailto:pypy-dev-
> [email protected]] On Behalf Of Paolo Giarrusso
> Sent: 24 December 2010 11:39
> To: Dima Tisnek
> Cc: PyPy Dev; Armin Rigo
> Subject: Re: [pypy-dev] pypy GC on large objects Re:
> funding/popularity?
>
> On Thu, Dec 23, 2010 at 20:30, Dima Tisnek <[email protected]> wrote:
> > Basically collecting this is hard:
> >
> > dict(a=range(9**9))
> >
> > large list is referenced, the object that holds the only reference is
> > small no matter how you look at it.
> First, usually (in most GC-ed languages) you can collect the list
> before the dict. In PyPy, if finalizers are involved (is this the case
> here? That'd be surprising), this is no longer true.
>
> However, object size is not the point. For standard algorithms, the
> size of an object does not matter at all in deciding when it's
> collected - I already discussed this in my other email in this thread,
> and I noted what actually could happen in the examples described by
> Armin, and your examples show that it is a good property. A large
> object in the same heap can fill it up and trigger an earlier garbage
> collection.
>
> In general, if GC ran in the background (but it usually doesn't, and
> not in PyPy) it could make sense to free objects sooner or later,
> depending not on object size, but on "how much memory would be
> 'indirectly freed' by freeing this object". However, because of
> sharing, answering this question is too complex (it requires
> collecting data from the whole heap). Moreover, the whole thing makes
> no sense at all with usual, stop-the-world collectors: the app is
> stopped, then the whole young generation, or the whole heap, is
> collected, then the app is resumed.
>
> When separate heaps are involved (such as with ctypes, or with Large
> Object Spaces, which avoid using a copy collector for large objects),
> it is more complicated to ensure that the same property holds: you
> need to consider stats of all heaps to decide whether to trigger GC.
>
> > I guess it gets harder still if there are many small live objects, as
> > getting to this dict takes a while
> > (easier in this simple case with a generational collector, O(n) in
> > the general case)
>
> Not sure what you mean; I can make sense of it (not fully) only with
> an incremental collector, and they are still seldom used (especially,
> not in PyPy).
>
> Best regards
>
> > On 23 December 2010 06:38, Armin Rigo <[email protected]> wrote:
> >> Hi René,
> >>
> >> On Thu, Dec 23, 2010 at 2:33 PM, René Dudfield <[email protected]> wrote:
> >>> I think this is a case where the object returned by
> >>> ctypes.create_string_buffer() could use a correct __sizeof__ method
> >>> return value. If pypy supported that, then the GC's could support
> >>> extensions, and 'opaque' data structures in C too a little more
> >>> nicely.
> >>
> >> I think you are confusing levels. There is no way the GC can call
> >> some app-level Python method to get information about the objects it
> >> frees (and when would it even call it?). Remember that our GC is
> >> written at a level where it works for any interpreter for any
> >> language, not just Python.
> >>
.NET supports calls to GC.AddMemoryPressure and GC.RemoveMemoryPressure to
inform the GC you are allocating things outside of its knowledge. Maybe
something similar would help?

Cheers,
Ben

> >>
> >> A bientôt,
> >>
> >> Armin.
> >> _______________________________________________
> >> [email protected]
> >> http://codespeak.net/mailman/listinfo/pypy-dev
> >>
>
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
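
To make the memory-pressure suggestion above a bit more concrete, here is a rough Python sketch of the bookkeeping such an interface might do. It is a toy model only: MemoryPressureTracker, its method names, and the byte threshold are hypothetical names invented for this illustration, not PyPy or .NET APIs. A real collector would presumably fold these numbers into its own heuristics instead of calling gc.collect() directly.

```python
import gc


class MemoryPressureTracker:
    """Toy model (hypothetical, not a PyPy API): counts bytes that were
    allocated outside the GC's knowledge, e.g. by a C extension."""

    def __init__(self, threshold_bytes, collect=gc.collect):
        self.threshold_bytes = threshold_bytes
        self.external_bytes = 0        # external bytes currently reported as live
        self.collections = 0           # collections triggered by this tracker
        self._since_last_collect = 0   # bytes reported since the last collection
        self._collect = collect

    def add_memory_pressure(self, nbytes):
        """Report nbytes of memory the GC cannot see."""
        if nbytes < 0:
            raise ValueError("nbytes must be non-negative")
        self.external_bytes += nbytes
        self._since_last_collect += nbytes
        # Enough unseen memory has accumulated: ask for a collection.
        if self._since_last_collect >= self.threshold_bytes:
            self._collect()
            self.collections += 1
            self._since_last_collect = 0

    def remove_memory_pressure(self, nbytes):
        """Report that nbytes of external memory have been released."""
        if nbytes < 0 or nbytes > self.external_bytes:
            raise ValueError("cannot remove more pressure than was added")
        self.external_bytes -= nbytes
```

In this sketch, a wrapper around something like ctypes.create_string_buffer() would call add_memory_pressure(size) when it allocates and remove_memory_pressure(size) when the buffer is released, so the decision to collect depends on the reported byte counts and never requires the GC to call back into app-level code.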
