If you can get me a version of the interpreter with this change made (I wouldn't know what to change myself), I can run a very allocation- and deallocation-heavy application I have lying around and get you some good benchmarks.
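A micro-benchmark of the kind described here could be sketched as follows. This is not from the original thread; it is a minimal illustration that allocates many GC-tracked container objects (none of which become garbage, matching the access pattern Martin targets) and compares wall time with the cyclic collector enabled vs. disabled. All names here are illustrative.

```python
# Illustrative micro-benchmark: allocate many container objects that all
# stay alive, and time the allocation loop with and without the cyclic GC.
import gc
import time

def allocate(n):
    """Allocate n GC-tracked objects, none of which become garbage."""
    keep = []
    for i in range(n):
        keep.append([i])  # each list is tracked by the cyclic collector
    return keep

def timed(n, use_gc):
    """Time allocate(n) with the cyclic collector enabled or disabled."""
    gc.collect()  # start from a clean heap
    (gc.enable if use_gc else gc.disable)()
    try:
        start = time.perf_counter()
        data = allocate(n)
        elapsed = time.perf_counter() - start
    finally:
        gc.enable()  # always restore the collector
    del data
    return elapsed

with_gc = timed(200_000, use_gc=True)
without_gc = timed(200_000, use_gc=False)
```

With the old heuristic, the gap between the two timings grows worse than linearly in `n`, since every full collection re-inspects all surviving objects.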
On Sat, Jun 21, 2008 at 1:23 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:

> Here is my proposal for making the GC run less often.
> The objective is to avoid the quadratic-time behavior
> if many objects are created and none of them is garbage.
>
> The youngest generation remains collected after 700
> allocations, the middle generation after 10 young collections.
> Therefore, full GC remains considered every 7000 allocations.
> The two youngest generations don't cause quadratic behavior,
> as the number of objects in each generation is bounded.
>
> Currently, full GC is invoked every 10 middle collections.
> Under my proposal, 10 middle collections must have passed,
> PLUS the number of survivor objects from the middle generation
> must exceed 10% of the number of objects in the oldest
> generation.
>
> As a consequence, garbage collection becomes less frequent
> as the number of objects on the heap grows, resulting in
> an amortized O(N) overhead for garbage collection (under
> the access pattern stated above).
>
> To determine the number of survivor objects, counts are
> taken during the collection. Objects deallocated through
> reference counting still remain accounted for as survivors
> (likewise for determining the size of the oldest generation).
>
> I made a simulation assuming an initially-empty heap,
> and invocations of the collector every M=7000 objects.
> The table below is in units of M and shows the number
> of objects allocated, and the number of objects inspected
> since the start of the simulation, for the old and the
> new scheme (the asterisk indicates that a collection
> took place; lines with no collection are dropped):
>
>  total  old_inspected  new_inspected
>     10       10*            10*
>     20       30*            30*
>     30       60*            60*
>     40      100*           100*
>     50      150*           150*
>     60      210*           210*
>     70      280*           280*
>     80      360*           360*
>     90      450*           450*
>    100      550*           450
>    102      550            552*
>    110      660*           552
>    115      660            667*
>    120      780*           667
>    129      780            796*
>    130      910*           796
>    140     1050*           796
>    ...
>    940    44650*          7724
>    942    44650           8666*
>    950    45600*          8666
>    960    46560*          8666
>    970    47530*          8666
>    980    48510*          8666
>    990    49500*          8666
>   1000    50500*          8666
>    ...
>   9990  4995000*         95887
>
> As you can see, the old and the new scheme start
> off equally until 100M objects have been allocated. The
> table shows how the old scheme grows quadratically, while
> the new scheme eventually approaches roughly a factor
> of ten between the number of objects allocated and the number of
> times that GC considers an object.
>
> Applications with a small number of objects will see no
> change in behavior; likewise, applications that create
> lots of small cycles will continue to see them collected
> in the youngest or middle generations.
>
> Please let me know what you think.
>
> Regards,
> Martin
>
> P.S. I don't plan to implement this myself, so if you like
> the idea, go ahead.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/adlaiff6%40gmail.com

--
Cheers,
Leif
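Martin's simulation can be reconstructed from the table. The sketch below makes the modeling assumptions explicit: one tick per middle collection (i.e. per M = 7000 allocations), no object ever dies, so every middle-generation survivor is promoted to the oldest generation, and the full-GC check runs at each tick before that tick's survivors are promoted. The function and variable names are mine, not from the thread, but the resulting collection points (102, 115, 129, …, 942) and cumulative inspection counts match the table exactly.

```python
# Sketch of the simulation in the message above.  All sizes are in units
# of M = 7000 objects; one loop iteration = one middle collection.

def simulate(ticks, new_scheme):
    """Return {tick: cumulative_inspected} for ticks where a full GC ran."""
    inspected = 0   # cumulative number of objects examined by full GCs
    last_full = 0   # tick at which the last full collection ran
    history = {}
    for t in range(1, ticks + 1):
        middle_since_full = t - last_full
        if new_scheme:
            # New heuristic: at least 10 middle collections must have
            # passed, AND the survivors promoted since the last full GC
            # must exceed 10% of the oldest generation (with survivors
            # themselves counted in its size, as the message states).
            pending = t - 1 - last_full   # promoted since last full GC
            oldest = t - 1                # everything promoted so far
            run_full = middle_since_full >= 10 and pending > oldest // 10
        else:
            # Old heuristic: full GC every 10 middle collections.
            run_full = middle_since_full >= 10
        if run_full:
            inspected += t                # a full GC touches every object
            last_full = t
            history[t] = inspected
    return history

old = simulate(1000, new_scheme=False)
new = simulate(1000, new_scheme=True)
```

Under the old heuristic the cumulative work is the sum 10 + 20 + 30 + …, hence quadratic in the number of allocations; under the new one, each full GC must wait until the heap has grown by about 10%, so collection points grow geometrically and the total work stays linear.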