Tim Peters wrote:

> With current trunk that printed
>
> [2.9363677646013846, 2.9489729031005703, 2.9689538729183949]
>
> After changing
>
>     #define MAXSAVEDTUPLES 2000
>
> to
>
>     #define MAXSAVEDTUPLES 0
>
> the times zoomed to
>
> [4.5894824930441587, 4.6023111649343242, 4.629560027293957]
>
> That's pretty dramatic.
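The benchmark itself isn't shown in the quoted text; a hypothetical stand-in in the same spirit (the three-float lists above look like output from timeit.repeat) might be timing tuple repetition, which allocates a fresh tuple on every iteration and so exercises the tuple free list governed by MAXSAVEDTUPLES:

```python
import timeit

# Hypothetical microbenchmark, NOT Tim's actual code: "t * 2" forces a
# new tuple allocation each time (note that "t * 1" would not, since
# CPython returns the original tuple for that case).
times = timeit.repeat(stmt="t * 2", setup="t = (1, 2, 3)",
                      repeat=3, number=200000)
print(times)  # three elapsed times, one per repeat
```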
Interesting. I ran this through gprof, and found the following changes
to the number of function calls:

                      with-cache    without-cache
    PyObject_Malloc        59058         24055245
    tupletraverse          33574         67863194
    visit_decref          131333        197199417
    visit_reachable       131333        197199417
    collect                   17            33006
    (for reference:)
    tuplerepeat         30000000         30000000

According to gprof, these functions (excluding tuplerepeat) together
account for 40% of the execution time in the without-cache
(i.e. MAXSAVEDTUPLES 0) case. So it appears that much of the slowdown
from disabling the fast tuple allocator is due to the higher frequency
of garbage collection in your example. Can you please re-run the
example with gc disabled?

Of course, it's really no surprise that GC is called more often: if
the tuples are allocated from the cache, that doesn't count as an
allocation wrt. GC. It so happens that your example just triggers gc a
few times in its inner loop; I wouldn't attribute that overhead to
obmalloc per se.

Regards,
Martin

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
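A re-run along the lines Martin requests could be sketched as below. One detail worth noting: timeit already turns the cyclic collector off around the timed statement by default, so to compare GC-on against GC-off you enable it explicitly in the setup string. The statement and iteration count here are placeholders, not the original benchmark:

```python
import timeit

SETUP = "t = (1, 2, 3)"

# timeit disables the cyclic GC during timing by default, so this run
# measures pure allocation cost with no collections triggered.
without_gc = timeit.repeat(stmt="t * 2", setup=SETUP,
                           repeat=3, number=200000)

# Prepending gc.enable() to the setup restores automatic collection,
# so each tracked allocation counts toward the generation-0 threshold.
with_gc = timeit.repeat(stmt="t * 2",
                        setup="import gc; gc.enable(); " + SETUP,
                        repeat=3, number=200000)

print("gc disabled:", without_gc)
print("gc enabled: ", with_gc)
```

If Martin's diagnosis is right, the gap between the cached and uncached allocators should shrink noticeably in the gc-disabled runs.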