[issue23693] timeit accuracy could be better

2016-11-02 Thread STINNER Victor

STINNER Victor added the comment:

I wrote a whole new project, "perf", to fix the root causes of this issue. It 
includes a timeit command. I suggest using "perf timeit" rather than 
"timeit" because perf is more reliable:
http://perf.readthedocs.io/en/latest/cli.html#timeit
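A minimal example invocation (my own illustration, not from the original 
message; it assumes perf is installed, e.g. with "pip install perf", and that 
the flags mirror timeit's CLI, which may vary between perf versions):

    python3 -m perf timeit -s "x = list(range(1000))" "sorted(x)"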

--
resolution:  -> third party
status: open -> closed


[issue23693] timeit accuracy could be better

2016-06-10 Thread Guido van Rossum

Changes by Guido van Rossum:


--
nosy:  -gvanrossum


[issue23693] timeit accuracy could be better

2016-06-10 Thread Raymond Hettinger

Changes by Raymond Hettinger:


--
nosy: +gvanrossum, tim.peters


[issue23693] timeit accuracy could be better

2016-06-09 Thread STINNER Victor

STINNER Victor added the comment:

Hi,

I am developing a new implementation of timeit which should be more reliable:
http://perf.readthedocs.io/en/latest/

* Run 25 processes instead of just 1
* Compute the average and standard deviation rather than the minimum
* Don't disable the garbage collector
* Skip the first timing to "warm up" the benchmark
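As a rough sketch of the first two ideas (my own illustration, not perf's 
actual code), spawning fresh worker processes and aggregating their timings:

    # Sketch: time a statement in several fresh processes and report
    # mean +/- standard deviation. Not perf's real implementation.
    import statistics
    import subprocess
    import sys

    CODE = ("import timeit; "
            "print(timeit.timeit('sorted(list(range(1000)))', number=1000))")

    timings = []
    for _ in range(25):  # one fresh process per run
        out = subprocess.check_output([sys.executable, "-c", CODE])
        timings.append(float(out.decode()))

    print("mean = %.6f s, stdev = %.6f s"
          % (statistics.mean(timings), statistics.stdev(timings)))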

Using the minimum and disabling the garbage collector are bad practices; they 
are not reliable:

* multiple processes are needed to test different random hash functions, since 
Python's hash function is randomized by default in Python 3
* Linux also randomizes the address space by default (ASLR), so the exact 
timing of memory accesses is different in each process
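To illustrate the first point (my own example, not from the original message): 
each Python 3 process picks a fresh hash seed unless PYTHONHASHSEED is set, so 
string hashes -- and hence dict/set layouts -- differ between processes:

    # Each child process prints a different hash for the same string.
    import subprocess
    import sys

    for _ in range(3):
        out = subprocess.check_output(
            [sys.executable, "-c", "print(hash('timeit'))"])
        print(out.decode().strip())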

My blog post "My journey to stable benchmark, part 3 (average)" 
explains in depth the multiple issues with using the minimum:
https://haypo.github.io/journey-to-stable-benchmark-average.html

My perf module is very young; it's still a work in progress. It should already 
be better than timeit right now. It works on Python 2.7 and 3 (I tested 3.4).

We could pick the best of its ideas and port them into the timeit module.

See also my article explaining how to tune Linux to reduce the "noise" of the 
operating system on microbenchmarks:
https://haypo.github.io/journey-to-stable-benchmark-system.html


[issue23693] timeit accuracy could be better

2015-03-17 Thread Robert Collins

New submission from Robert Collins:

In #6422 Haypo suggested making the timeit reports much better. This is a new 
ticket just for that. See 
https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py and 
http://bugs.python.org/issue6422#msg164216

--
components: Library (Lib)
messages: 238353
nosy: haypo, rbcollins
priority: normal
severity: normal
stage: needs patch
status: open
title: timeit accuracy could be better
type: enhancement
versions: Python 3.6


[issue23693] timeit accuracy could be better

2015-03-17 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

See also issue21988.

--
nosy: +serhiy.storchaka


[issue23693] timeit accuracy could be better

2015-03-17 Thread STINNER Victor

STINNER Victor added the comment:

Not only am I too lazy to compute the number of loops and repeats manually, I 
also don't trust myself. It's even worse when someone publishes the results of 
a micro-benchmark: I don't trust how the benchmark was calibrated. In my 
experience, micro-benchmarks are polluted by noise in the timings, so the 
results are not reliable.

benchmark.py's calibration is based on time, whereas timeit uses hardcoded 
constants (loops=100, repeat=3) which can be modified on the command line.

benchmark.py has 3 main parameters:

- minimum duration of a single run (--min-time): 100 ms by default
- maximum total duration of the benchmark (1 second by default): benchmark.py 
does its best to respect this duration, but it can be longer
- minimum number of repeats: 5 by default

The minimum duration is increased if the clock resolution is bad (1 ms or 
more). That is the case on Windows for time.clock() on Python 2, for example. 
Extract of benchmark.py:

min_time = max(self.config.min_time, timer_precision * 100)
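
A sketch of how such a precision measurement might look (my own illustration; 
benchmark.py's actual code may differ):

    import time

    def timer_precision(timer=time.perf_counter, samples=10):
        # Smallest observable difference between two consecutive clock reads.
        precision = float("inf")
        for _ in range(samples):
            t1 = timer()
            t2 = timer()
            while t2 == t1:  # spin until the clock ticks
                t2 = timer()
            precision = min(precision, t2 - t1)
        return precision

    min_time = max(0.1, timer_precision() * 100)  # 100 ms floor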

The estimation of the number of loops is not reliable, but it's written to be 
fast. Since I run a micro-benchmark many times, I don't want to wait too 
long. The result is not a power of 10, but an arbitrary integer. Usually, when 
running benchmark.py multiple times, the number of loops is different each 
time. It's not really a big issue, but it probably makes results more difficult 
to compare.

My constraint is max_time. The tested function may not have a linear duration 
(time = time_one_iteration * loops).

https://bitbucket.org/haypo/misc/src/348bfd6108e9985b3c2298d2745eb5ddfe7042e6/python/benchmark.py?at=default#cl-416
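
As an illustration of time-based calibration (a simplified sketch under my own 
assumptions, not benchmark.py's exact algorithm; see the linked source for the 
real thing):

    import time

    def calibrate_loops(func, min_time=0.1, timer=time.perf_counter):
        # Start with one iteration and grow the loop count until a
        # single run lasts at least min_time.
        loops = 1
        while True:
            t0 = timer()
            for _ in range(loops):
                func()
            elapsed = timer() - t0
            if elapsed >= min_time:
                return loops
            if elapsed <= 0:
                loops *= 10  # clock too coarse to measure; just grow
            else:
                # Scale proportionally: the result is an arbitrary
                # integer, not a power of 10.
                loops = max(loops + 1, int(loops * min_time / elapsed))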

Repeating a test at least 5 times is a compromise between the stability of the 
results and the total duration of the benchmark.

Feel free to reuse my code to enhance timeit.py.
