CF,

Thanks for helping me figure this out. I definitely thought the bytecode compilation would be amortized since I set repeat to 1. I guess there's more to getting useful benchmarks out of Python than I originally thought.

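For anyone who finds this thread later, here is a minimal sketch of the kind of hand-written timing helper CF describes in the quoted message. It is not CF's attached script. The names time_calls and iterate are hypothetical ones I made up for illustration, and iterate is only a stand-in for the real benchmark function. The idea is to call the function object directly in an ordinary loop, so no new string is exec'ed for each measurement and the JIT can reuse its compiled code.

```python
import time


def iterate(n):
    # Hypothetical stand-in for the benchmarked function: builds a short
    # string by repeated concatenation (the binary digits of n, reversed).
    s = ""
    while n > 0:
        s += str(n % 2)
        n //= 2
    return s


def time_calls(func, args, clock=time.perf_counter):
    """Return the total seconds spent calling func(arg) for every arg."""
    start = clock()
    for arg in args:
        func(arg)
    return clock() - start


if __name__ == "__main__":
    inputs = range(1, 4000)
    # Untimed first pass so a JIT (if there is one) can warm up.
    time_calls(iterate, inputs)
    print("iter", time_calls(iterate, inputs))
```

The warm-up pass is my assumption about what a fair comparison needs. Whether a single pass is enough warm-up will depend on the workload.
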
You can use my code for a timeit PSA; hopefully it helps someone else avoid my pitfall.

-Jeremy

On Fri, Sep 20, 2024 at 5:26 AM CF Bolz-Tereick <cfb...@gmx.de> wrote:
>
> Hello Jeremy,
>
> ouch. It turns out the problem is neither your code, nor PyPy, but
> timeit. The timeit module is just not a safe benchmarking tool unless
> you use it in *exactly* the right way. The way that timeit runs code is
> using an exec'ed string, eg f"recurse({i})" in your case. This
> bytecode-compiles some code for every time you call timeit.repeat. This
> means that the JIT has to compile that bytecode to machine code for
> every call to repeat. That is the overhead that you were measuring.
>
> So your code:
>
> for i in range(1, 4000):
>     times["iter"][i] = repeat(
>         stmt = f"iterate({i})",
>         ...
>
> simply never warms up.
>
> To fix this, you could write your own timing helper. I did that (I'm
> attaching my version of the script).
>
> My results (timing all the 4000 numbers together, per implementation)
> are like this:
>
> CPython 3.11.6:
> iter 1.976924208
> rcsv 2.300890293
>
> PyPy3 7.3.17:
> iter 0.265697421 (7.4x faster)
> rcsv 0.643804392 (3.6x faster)
>
> I also tried to use a list and "".join, a bytearray, and a pypy-specific
> stringbuilder, all were slower. Yes, your string additions are
> quadratic. But the longest string that is being built is 15, and doing
> 1+2+3+...+15=120 character copies is still very cheap.
>
> My main takeaway would be that micro-benchmarking is very hard :-(.
>
> Would it be ok if I use your code in a small PSA blog post about timeit?
>
> Cheers,
>
> CF
>
>
>
> On 2024-09-20 04:16, Jeremy Brown wrote:
> > Manuel,
> >
> >> Your code is more similar to the code example that shows a case where
> >> the optimization doesn’t work.
> >>
> >> Does that explain the behavior you’re seeing? Feel free to make
> >> suggestions on how to clarify the wording.
> >
> > I think I just interpreted that statement incorrectly, I thought the
> > linear code section explained how the JIT optimized the code, not the
> > preconditions needed for the JIT to function correctly.
> >
> > CF,
> >
> >> Hi Jeremy,
> >>
> >> Can you share how you are running this function? I tried a few variants,
> >> and pypy is often faster than CPython on my attempts, so the rest of your
> >> code is necessary to find out what's wrong.
> >
> > Sure, here's a link to a copy of the benchmark:
> > https://pastebin.com/fR0C6qcB.
> >
> > I also know the JIT definitely can't optimize the iterative version,
> > but even if this is a problem specific to PyPy I wouldn't expect runs
> > to take twice as long compared to CPython.
> >
> > -Jeremy
> > _______________________________________________
> > pypy-dev mailing list -- pypy-dev@python.org
> > To unsubscribe send an email to pypy-dev-le...@python.org
> > https://mail.python.org/mailman3/lists/pypy-dev.python.org/
> > Member address: cfb...@gmx.de