Vitja Makarov, 28.01.2012 21:41:
> 2012/1/29 Stefan Behnel:
>> Vitja Makarov, 28.01.2012 20:58:
>>> 2012/1/28 mark florisson:
>>>> On 28 January 2012 19:41, Vitja Makarov wrote:
>>>>> 2012/1/28 Stefan Behnel:
>>>>>> Here's a general take on a code object cache for exception propagation.
>>>>>>
>>>>>> https://github.com/scoder/cython/commit/ad18e0208
>>>>>>
>>>>>> When I raise an exception in test code that propagates through a Python
>>>>>> call hierarchy of four functions before being caught, the cache gives me
>>>>>> something like a 2x speedup in total. Not bad. When I do the same for
>>>>>> cdef functions, it's more like 4-5x.
>>>>>>
>>>>>> The main idea is to cache the objects in a reallocable C array and bisect
>>>>>> into it based on the C code "__LINE__" of the exception, which should be
>>>>>> unique enough for a given module.
>>>>>>
>>>>>> It's a global cache that doesn't limit the lifetime of code objects (well,
>>>>>> up to the lifetime of the module, obviously). I don't know if that's a
>>>>>> problem because the number of code objects is only bounded by the number
>>>>>> of exception origination points in the C source code, which is usually
>>>>>> quite large. However, only a tiny fraction of those will ever raise or
>>>>>> propagate an exception in practice, so the real number of cached code
>>>>>> objects will be substantially smaller.
>>>>>>
>>>>>> Maybe thorough test suites with lots of failure testing would notice a
>>>>>> difference in memory consumption, even though a single code object isn't
>>>>>> all that large either...
>>>>>
>>>>> We already have the --no-c-in-traceback flag that disables C line numbers
>>>>> in tracebacks. What about enabling it by default?
>>>>>
>>>> I'm quite attached to that feature actually :), it would be pretty
>>>> annoying to disable that flag every time. And what would disabling
>>>> that option gain, as the current code still formats the filename and
>>>> function name?
>>>
>>> It's rather useful for developers or for debugging. Most people
>>> don't need it.
>>
>> Not untrue. However, at least a majority of developers should be able to
>> make use of it when it's there, and code is several times more often built
>> for testing and debugging than for production. So I consider it a virtue
>> that it's on by default.
>>
>>
>>> Here is a simple benchmark:
>>>
>>> # upstream/master: 6.38ms
>>> # upstream/master (no-c-in-traceback): 3.07ms
>>> # scoder/master: 1.31ms
>>> def foo():
>>>     raise ValueError
>>>
>>> def testit():
>>>     cdef int i
>>>     for i in range(10000):
>>>         try:
>>>             foo()
>>>         except:
>>>             pass
>>>
>>> Stefan's branch wins but:
>>>  - there is only one item in the cache and it's always hit
>>
>> Even if there were substantially more, binary search is so fast you'd
>> hardly notice the difference.
>
> Yes, I'm a little bit worried about insertions.

I know, insertion is O(n), but it only strikes when an exception is raised
or propagated from a code line that has never raised an exception before.
That makes it *very* unlikely that it hits a performance-critical spot.


> With --no-c-in-traceback python lineno should be used as a key.

Good call, I added that.

https://github.com/scoder/cython/commit/8b50da874#diff-0

That means that using this option now additionally improves the caching
performance, because you get fewer code objects overall: at most one per
Cython source code line (as opposed to one per C source code line).

Stefan
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
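
For anyone who wants to play with the scheme discussed in this thread, here is
a minimal sketch in plain Python of a sorted-array cache with binary-search
lookup. It is not the C code from the commits linked above. The names
CodeObjectCache, find, insert and get_code_object are made up for illustration,
and compile() merely stands in for building a real traceback code object.

```python
import bisect


class CodeObjectCache:
    """Sorted-array cache keyed by an integer line number (hypothetical API)."""

    def __init__(self):
        self._lines = []    # keys, kept sorted so we can bisect into them
        self._entries = []  # cached objects, parallel to self._lines

    def __len__(self):
        return len(self._lines)

    def find(self, line):
        # O(log n) binary search; returns None on a cache miss.
        i = bisect.bisect_left(self._lines, line)
        if i < len(self._lines) and self._lines[i] == line:
            return self._entries[i]
        return None

    def insert(self, line, code_object):
        # O(n) in the worst case because later entries are shifted,
        # but this only runs the first time a given line raises.
        i = bisect.bisect_left(self._lines, line)
        if i < len(self._lines) and self._lines[i] == line:
            self._entries[i] = code_object
            return
        self._lines.insert(i, line)
        self._entries.insert(i, code_object)


def get_code_object(cache, c_line, py_line, c_in_traceback=True):
    # Assumption: with C lines hidden from tracebacks, the Python-level
    # line number serves as the key, so fewer distinct entries are created.
    key = c_line if c_in_traceback else py_line
    cached = cache.find(key)
    if cached is None:
        cached = compile("pass", "<sketch>", "exec")  # stand-in code object
        cache.insert(key, cached)
    return cached
```

Calling get_code_object() twice with the same line returns the same cached
object, and two different C lines that map to the same Python line share one
entry when c_in_traceback is False.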