STINNER Victor added the comment:

> Another point: timeit is often used to compare performance between Python 
> versions. By changing the behaviour of timeit in a given Python version, 
> you'll make it more difficult to compare results.

Hum, that's a good argument against my change :-)

So, to be able to compare Python 3.5 vs 3.6 or Python 2.7 vs Python 3.6, we would 
need to somehow backport the average feature to the timeit module of older Python 
versions. One option would be to publish the timeit module on the Python Cheeseshop 
(PyPI). Hum, but such a module already exists: my perf module.
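
For example, something like the following should give comparable numbers on both 
versions (a sketch only; the exact perf CLI options may differ between perf 
releases):

    python2 -m pip install perf
    python3 -m pip install perf
    python2 -m perf timeit "sum(range(1000))"
    python3 -m perf timeit "sum(range(1000))"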

A solution would be to redirect users to the perf module in the timeit 
documentation, and maybe also document that timeit results are not reliable?

A different solution would be to add a --python parameter to timeit to run the 
benchmark on a specific Python version (ex: "python3 -m timeit --python=python2 
..."). But this solution is more complex to develop, since we have to make the 
new timeit code compatible with Python 2.7 and find a reliable way to load it in 
the other tested Python process.
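
A minimal sketch of the subprocess part, assuming we simply spawn the target 
interpreter and reuse its own timeit module (the hard part, loading the new 
timeit code into that child process, is not shown; run_timeit and its arguments 
are purely illustrative):

    import subprocess
    import sys

    def run_timeit(python, stmt, setup="pass"):
        # Spawn the requested interpreter and run *its* timeit module on the
        # statement; the output is whatever that interpreter's timeit prints.
        cmd = [python, "-m", "timeit", "-s", setup, stmt]
        return subprocess.check_output(cmd, universal_newlines=True).strip()

    print(run_timeit("python2", "sum(range(1000))"))
    print(run_timeit(sys.executable, "sum(range(1000))"))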

Note: I plan to add a --python parameter to my perf module, but I haven't 
implemented it yet. Since my perf module already spawns child processes and is a 
third party module, it is simpler to implement this option there.


A more general remark: timeit is commonly used to compare the performance of 
two Python versions. Users run timeit twice and then compare the results manually. 
But only two numbers are compared. It would be more reliable to compare all 
timings and make sure that the difference is significant. Again, the perf 
module implements such a comparison.

I didn't implement a full CLI for perf timeit to directly compare two Python 
versions. You have to run timeit twice, store all timings in JSON files, and 
then use the "perf compare" command to reload the timings and compare them.
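
Something along these lines (a sketch; the option names for writing JSON may 
vary between perf releases):

    # Run the same microbenchmark on two interpreters, saving raw timings:
    python2 -m perf timeit -o py27.json "sum(range(1000))"
    python3 -m perf timeit -o py36.json "sum(range(1000))"

    # Reload the JSON files and check whether the difference is significant:
    python3 -m perf compare py27.json py36.json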

