I have a core loop that is critical to performance. The code is at http://www.acooke.org/lepl/api/lepl.parser-pysrc.html#trampoline
If I write a separate "optimised" version of that function for when "monitor" is empty, with all the "if monitor" tests removed, the profiler (cProfile) indicates a 10% reduction in time spent in the loop. But if I run the same code 100 times under timeit, without profiling, I see no difference in total time.

The process is CPU bound. I am pretty sure the improvement seen in the profiler is repeatable and larger than the noise in the profiling, so why don't the timeit times change? The only reason I can think of is that the profiler is blocking some kind of optimisation (which seems odd, since that would make profiling somewhat pointless).

Full disclosure - the code linked above is not quite identical to my current workspace (I have replaced the append/pop with an explicit index into an array whose size is managed separately).

Andrew
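P.S. To make the comparison concrete, here is a minimal sketch of the kind of measurement I mean. The two loop functions are hypothetical stand-ins, not the actual LEPL trampoline, and the helper names (profiled_seconds, timed_seconds) are made up for illustration. It times the same pair of functions once under cProfile and once under timeit, so the two sets of numbers can be compared side by side.

```python
import cProfile
import pstats
import timeit


def loop_with_checks(n, monitor=()):
    """Stand-in for the general loop: tests 'monitor' on every iteration."""
    total = 0
    for i in range(n):
        if monitor:            # falsy when monitor is empty
            monitor.append(i)
        total += i
        if monitor:
            monitor.append(total)
    return total


def loop_without_checks(n):
    """Stand-in for the 'optimised' loop with the monitor tests removed."""
    total = 0
    for i in range(n):
        total += i
    return total


def profiled_seconds(func, *args):
    """Total time cProfile attributes to one call of func(*args)."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    return pstats.Stats(profiler).total_tt


def timed_seconds(func, *args, repeat=5, number=10):
    """Best per-call wall-clock time from timeit, with no profiler attached."""
    runs = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(runs) / number


if __name__ == "__main__":
    n = 1_000_000
    for func in (loop_with_checks, loop_without_checks):
        print(func.__name__,
              "cProfile:", profiled_seconds(func, n),
              "timeit:", timed_seconds(func, n))
```

Taking the minimum over several timeit repeats is a deliberate choice here: it reduces the influence of other activity on the machine, which matters when the difference being looked for is only around 10%.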