STINNER Victor added the comment:
> Another point: timeit is often used to compare performance between Python
> versions. By changing the behaviour of timeit in a given Python version,
> you'll make it more difficult to compare results.
Hum, that's a good argument against my change :-)
So to be able to compare Python 3.5 vs 3.6 or Python 2.7 vs Python 3.6, we would
need to somehow backport the average feature to the timeit module of older
Python versions. One option would be to put the timeit module on the Python
Cheeseshop (PyPI). Hum, but there is already such a module: my perf module.
A solution would be to redirect users to the perf module in the timeit
documentation, and maybe also document that timeit results are not reliable?
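To illustrate why the current timeit results can be misleading: `python -m timeit` reports only the best of its repetitions, which hides how noisy the timings are. A minimal sketch (the statement and repetition counts are arbitrary examples):

```python
import timeit

# Run the same micro-benchmark several times; "python -m timeit" reports
# only the fastest run, hiding the spread between repetitions.
timings = timeit.repeat("sum(range(100))", repeat=5, number=10000)

print(min(timings))                  # what timeit's CLI reports
print(sum(timings) / len(timings))   # an average also exposes the noise
```

Looking at all five timings (or their mean and standard deviation) gives a much better idea of the measurement's reliability than the single minimum.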
A different solution would be to add a --python parameter to timeit to run the
benchmark on a specific Python version (ex: "python3 -m timeit --python=python2
..."). But this solution is more complex to develop, since we would have to
make timeit.py compatible with Python 2.7 and find a reliable way to load it in
the other tested Python interpreter.
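A rough sketch of what such a --python option could do, assuming the simplest approach of spawning the target interpreter and running its own timeit module there (run_timeit is a hypothetical helper, not an existing API; sys.executable stands in for "python2" or "python3"):

```python
import subprocess
import sys

def run_timeit(python, stmt):
    # Hypothetical helper: spawn the target interpreter and run the
    # timeit module that ships with *that* interpreter.
    proc = subprocess.run(
        [python, "-m", "timeit", stmt],
        capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip()

# In a real comparison, "python2" and "python3" would be passed here.
result = run_timeit(sys.executable, "sum(range(100))")
print(result)
```

This avoids the compatibility problem (each interpreter runs its own timeit), but the output then has to be parsed, which is part of what makes the feature non-trivial.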
Note: I plan to add a --python parameter to my perf module, but I haven't
implemented it yet. Since the perf module already spawns child processes and is
a third-party module, it is simpler to implement this option there.
A more general remark: timeit is commonly used to compare the performance of
two Python versions. Users run timeit twice and then compare the results
manually. But only two numbers are compared. It would be more reliable to
compare all timings and check that the difference is statistically significant.
Again, the perf module implements such a function.
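To make the point concrete, here is a simplified stand-in for that kind of significance check: a rough two-sample t-score over two sets of timings (the function, threshold, and timing values are all illustrative, not perf's actual implementation):

```python
import statistics

def is_significant(a, b, threshold=2.0):
    # Rough two-sample t-score: compare the gap between the means to the
    # combined standard error of the two timing samples.
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = (var_a / len(a) + var_b / len(b)) ** 0.5
    return abs(mean_a - mean_b) / se > threshold

old = [1.02, 1.00, 1.03, 0.99, 1.01]   # timings from interpreter A (seconds)
new = [0.80, 0.82, 0.79, 0.81, 0.80]   # timings from interpreter B (seconds)
print(is_significant(old, new))        # True: the gap dwarfs the noise
```

Comparing full timing distributions like this is exactly what a single best-of-N number cannot do: two minima can differ by less than the run-to-run noise.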
I didn't implement a full CLI for perf timeit to directly compare two Python
versions. You have to run timeit twice, store all timings in JSON files, and
then use the "perf compare" command to reload the timings and compare them.
Python tracker <rep...@bugs.python.org>