[issue15369] pybench and test.pystone poorly documented

2017-03-31 Thread Donald Stufft

Changes by Donald Stufft :


--
pull_requests: +938


[issue15369] pybench and test.pystone poorly documented

2016-10-18 Thread STINNER Victor

STINNER Victor added the comment:

I'm closing the issue again.

Again, pybench has moved to http://github.com/python/performance: please
continue the discussion there if you think something still needs to be done
about pybench.

FYI, I recently reworked pybench in depth using the new perf 0.8 API. perf 0.8
now supports running multiple benchmarks per script, so pybench was rewritten
as just a benchmark runner. Benchmarks can be compared using performance, or
directly with perf (python3 -m perf compare a.json b.json).
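
For readers who haven't seen the new API, here is a minimal sketch of such a
multi-benchmark runner script. It assumes the Runner API of recent perf/pyperf
releases (perf.Runner, Runner.bench_func); exact names varied across early
perf versions, and the two benchmark bodies are placeholders.

import perf

def concat_strings():
    # First placeholder micro-benchmark: repeated string concatenation.
    s = ""
    for _ in range(1000):
        s += "x"

def build_dict():
    # Second placeholder micro-benchmark: building a small dict.
    d = {}
    for i in range(1000):
        d[i] = i

runner = perf.Runner()
runner.bench_func('concat_strings', concat_strings)
runner.bench_func('build_dict', build_dict)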

--
resolution:  -> fixed
status: open -> closed


[issue15369] pybench and test.pystone poorly documented

2016-09-15 Thread STINNER Victor

STINNER Victor added the comment:

2016-09-15 11:21 GMT+02:00 Marc-Andre Lemburg :
> I think we are talking about different things here: calibration in
> pybench means that you try to determine the overhead of the
> outer loop and possible setup code that is needed to run the
> test.
> (...)
> It then takes the minimum timing from overhead runs and uses
> this as a baseline for the actual test runs (it subtracts the
> overhead timing from the test run results).

Calibration in perf means automatically computing the number of outer
loops needed to get a sample of at least 100 ms (the default minimum time).
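
Here is a minimal, self-contained sketch of that idea (not the actual perf
code; the doubling strategy and the 100 ms threshold are the only
assumptions):

import time

def calibrate(func, min_time=0.1):
    # Keep doubling the number of outer loops until one sample takes at
    # least min_time seconds, so timer resolution and constant overhead
    # become negligible relative to the measured work.
    loops = 1
    while True:
        t0 = time.perf_counter()
        for _ in range(loops):
            func()
        if time.perf_counter() - t0 >= min_time:
            return loops
        loops *= 2

print("calibrated loops:", calibrate(lambda: sum(range(100))))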

I simply removed the code to estimate the overhead of the outer loop
in pybench. The reason is this line:

# Get calibration
min_overhead = min(self.overhead_times)

There is no such "minimum timing"; it doesn't exist :-) In benchmarks,
you have to work with statistics: use the average, standard deviation, etc.

If you estimate the minimum overhead badly, you can get negative
timings, which are not allowed in perf (even zero is a hard error in
perf).
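
A toy illustration, with made-up numbers, of how an over-estimate produces
such a negative timing:

test_time = 0.105      # measured duration of one test run (seconds)
min_overhead = 0.110   # over-estimated loop/setup overhead (seconds)
print(test_time - min_overhead)   # -0.005: negative, rejected by perf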

It's not possible to compute *exactly* the "minimum overhead".

Moreover, removing the code to estimate the overhead simplified the code.

> Benchmarking these days appears to have gotten harder, not simpler, compared
> to the days of pybench some 19 years ago.

Benchmarking has always been a hard problem. Modern hardware (out-of-order
CPUs, variable CPU frequency, power saving, etc.) probably didn't help
:-)

--


Re: [issue15369] pybench and test.pystone poorly documented

2016-09-15 Thread M.-A. Lemburg
On 15.09.2016 11:11, STINNER Victor wrote:
> 
> STINNER Victor added the comment:
> 
> Hum, since the discussion has restarted, I'm reopening the issue ...
> 
> "Well, pybench is not just one benchmark, it's a whole collection of 
> benchmarks for various different aspects of the CPython VM and per concept it 
> tries to calibrate itself per benchmark, since each benchmark has different 
> overhead."
> 
> In the performance module, you now get individual timings for each pybench
> benchmark, instead of an overall total, which was less useful.

pybench had the same goal. It was a design mistake to add an
overall timing to each suite run; the original intention was to
compare each benchmark individually.

Perhaps it would make sense to try to port the individual benchmark
tests in pybench to performance.

> "The number of iterations per benchmark will not change between runs, since 
> this number is fixed in each benchmark."
> 
> Please take a look at the new performance module; it has a different design.
> Calibration is based on a minimum time per sample, no longer on hardcoded
> values. I modified all benchmarks, not only pybench.

I think we are talking about different things here: calibration in
pybench means that you try to determine the overhead of the
outer loop and possible setup code that is needed to run the
test.

pybench runs a calibration method which has the same
code as the main test, but without the actual operations that you
want to test, in order to determine the timing of the overhead.

It then takes the minimum timing from overhead runs and uses
this as a baseline for the actual test runs (it subtracts the
overhead timing from the test run results).
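
A minimal sketch of that overhead-subtraction scheme (not pybench's actual
code; the loop counts and the tested operation are placeholders):

import time

ROUNDS = 10

def run_test():
    # Loop containing the operations under test.
    for _ in range(1000):
        s = "a" + "b"   # placeholder operation

def calibrate():
    # Same loop structure, but without the tested operations, so only
    # the loop and setup overhead is timed.
    for _ in range(1000):
        pass

overhead_times = []
for _ in range(ROUNDS):
    t0 = time.perf_counter()
    calibrate()
    overhead_times.append(time.perf_counter() - t0)
min_overhead = min(overhead_times)   # the baseline described above

for _ in range(ROUNDS):
    t0 = time.perf_counter()
    run_test()
    print(time.perf_counter() - t0 - min_overhead)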

This may not be ideal in all cases, but it's the closest
I could get to timing the test operations at the time.

I'll have a look at what performance does.

> "BTW: Why would you want to run benchmarks in child processes and in parallel 
> ?"
> 
> Child processes are run sequentially.

Ah, ok.

> Running benchmarks in multiple processes helps to get more reliable
> benchmarks. Read my article if you want to learn more about the design of my
> perf module:
> http://haypo-notes.readthedocs.io/microbenchmark.html#my-articles

Will do, thanks.

> "Ideally, the pybench process should be the only CPU intense work load on the 
> entire CPU to get reasonable results."
> 
> The perf module automatically uses isolated CPUs. It strongly suggests using
> this amazing Linux feature to run benchmarks!
> https://haypo.github.io/journey-to-stable-benchmark-system.html
> 
> I started writing advice on how to get stable benchmarks:
> https://github.com/python/performance#how-to-get-stable-benchmarks
> 
> Note: See also the https://mail.python.org/mailman/listinfo/speed mailing 
> list ;-)

I've read some of your blog posts and articles on the subject
and your journey. Interesting stuff, definitely. Benchmarking
these days appears to have gotten harder, not simpler, compared to
the days of pybench some 19 years ago.

-- 
Marc-Andre Lemburg
eGenix.com


[issue15369] pybench and test.pystone poorly documented

2016-09-15 Thread STINNER Victor

STINNER Victor added the comment:

Hum, since the discussion has restarted, I'm reopening the issue ...

"Well, pybench is not just one benchmark, it's a whole collection of benchmarks 
for various different aspects of the CPython VM and per concept it tries to 
calibrate itself per benchmark, since each benchmark has different overhead."

In the performance module, you now get individual timings for each pybench
benchmark, instead of an overall total, which was less useful.


"The number of iterations per benchmark will not change between runs, since 
this number is fixed in each benchmark."

Please take a look at the new performance module; it has a different design.
Calibration is based on a minimum time per sample, no longer on hardcoded
values. I modified all benchmarks, not only pybench.


"BTW: Why would you want to run benchmarks in child processes and in parallel ?"

Child processes are run sequentially.

Running benchmarks in multiple processes helps to get more reliable benchmarks.
Read my article if you want to learn more about the design of my perf module:
http://haypo-notes.readthedocs.io/microbenchmark.html#my-articles


"Ideally, the pybench process should be the only CPU intense work load on the 
entire CPU to get reasonable results."

The perf module automatically uses isolated CPUs. It strongly suggests using
this amazing Linux feature to run benchmarks!
https://haypo.github.io/journey-to-stable-benchmark-system.html

I started writing advice on how to get stable benchmarks:
https://github.com/python/performance#how-to-get-stable-benchmarks

Note: See also the https://mail.python.org/mailman/listinfo/speed mailing list 
;-)

--
resolution: fixed -> 
status: closed -> open


[issue15369] pybench and test.pystone poorly documented

2016-09-15 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 14.09.2016 15:20, STINNER Victor wrote:
> 
> STINNER Victor added the comment:
> 
>> I'd also like to request that you reword this dismissive line in the 
>> performance package's readme: (...)
> 
> Please report issues of the performance module on its own bug tracker:
> https://github.com/python/performance
> 
> Can you please propose a new description? You might even create a pull
> request ;-)

I'll send a PR.

> Note: I'm not sure that we should keep pybench; this benchmark really
> looks unreliable. But I should at least try to use the same
> number of iterations for all worker child processes. Currently,
> calibration is done in each child process.

Well, pybench is not just one benchmark; it's a whole collection of
benchmarks for various aspects of the CPython VM, and by design
it tries to calibrate itself per benchmark, since each
benchmark has different overhead.

The number of iterations per benchmark will not change between
runs, since this number is fixed in each benchmark. These numbers
do need an update, though, since at the time pybench was written,
CPUs were a lot less powerful compared to today.

Here's the comment with the guideline for the number of rounds
to use per benchmark:

# Number of rounds to execute per test run. This should be
# adjusted to a figure that results in a test run-time of between
# 1-2 seconds.
rounds = 10
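
A hypothetical way to refresh such a figure on current hardware (a sketch,
not pybench code): time one round and scale the count to land in that
1-2 second window.

import time

def one_round():
    for _ in range(100000):
        x = 1 + 1   # placeholder for the benchmark's operations

t0 = time.perf_counter()
one_round()
per_round = time.perf_counter() - t0
rounds = max(1, int(1.5 / per_round))   # target ~1.5 s total run-time
print("suggested rounds =", rounds)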

BTW: Why would you want to run benchmarks in child processes
and in parallel? This will usually dramatically affect the
results of the benchmark runs. Ideally, the pybench process
should be the only CPU-intensive workload on the entire machine
to get reasonable results.

--


[issue15369] pybench and test.pystone poorly documented

2016-09-14 Thread STINNER Victor

STINNER Victor added the comment:

> I'd also like to request that you reword this dismissive line in the 
> performance package's readme: (...)

Please report issues with the performance module on its own bug tracker:
https://github.com/python/performance

Can you please propose a new description? You might even create a pull
request ;-)

Note: I'm not sure that we should keep pybench; this benchmark really
looks unreliable. But I should at least try to use the same
number of iterations for all worker child processes. Currently,
calibration is done in each child process.

--


[issue15369] pybench and test.pystone poorly documented

2016-09-13 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Please add notes to the Tools/README pointing users to the performance suite.

I'd also like to request that you reword this dismissive line in the 
performance package's readme:

"""
pybench - run the standard Python PyBench benchmark suite. This is considered 
an unreliable, unrepresentative benchmark; do not base decisions off it. It is 
included only for completeness.
"""

I suppose this was taken from the Unladen Swallow list of benchmarks and
completely misses the point of what pybench is all about: it's a benchmark for
running performance tests on individual parts of CPython's VM implementation.
It was never intended to be representative. The main purpose is to be able to
tell whether an optimization in CPython has an impact on individual areas of
the interpreter or not.

Thanks.

--


[issue15369] pybench and test.pystone poorly documented

2016-09-13 Thread STINNER Victor

STINNER Victor added the comment:

We now have a good and stable benchmark suite: 
https://github.com/python/performance

I removed pystone and pybench from Python 3.7. Please use performance instead
of old, unreliable microbenchmarks like pybench and pystone.

--
nosy: +haypo
resolution:  -> fixed
status: open -> closed
versions: +Python 3.7 -Python 3.3


[issue15369] pybench and test.pystone poorly documented

2016-09-13 Thread Roundup Robot

Roundup Robot added the comment:

New changeset e03c1b6830fd by Victor Stinner in branch 'default':
Remove pystone microbenchmark
https://hg.python.org/cpython/rev/e03c1b6830fd

--


[issue15369] pybench and test.pystone poorly documented

2016-09-13 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 08a0b75904c6 by Victor Stinner in branch 'default':
Remove pybench microbenchmark
https://hg.python.org/cpython/rev/08a0b75904c6

--
nosy: +python-dev


[issue15369] pybench and test.pystone poorly documented

2013-02-02 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I don't really think they deserve documenting.

pystones can arguably be a cheap and easy way of comparing the performance of
different systems *using the exact same Python interpreter*. That's the only
point of running pystones.

As for pybench, it probably had a point when there wasn't anything better, but
I don't think it does anymore. We have a much better benchmark suite now, and
we also have a couple of specialized benchmarks in the Tools directory.

--
nosy: +pitrou


[issue15369] pybench and test.pystone poorly documented

2013-02-01 Thread Brett Cannon

Changes by Brett Cannon :


--
nosy:  -brett.cannon


[issue15369] pybench and test.pystone poorly documented

2012-07-17 Thread Brett Cannon

Brett Cannon  added the comment:

The Unladen Swallow benchmarks are in no way specific to JITs; they are a
thorough set of benchmarks for measuring the overall performance of a Python
VM.

As for speed.python.org, we know that it is currently not being updated as we 
are waiting for people to have the time to move it forward and replace 
speed.pypy.org for all Python VMs.

--
nosy:  -jnoller


[issue15369] pybench and test.pystone poorly documented

2012-07-17 Thread Florent Xicluna

Florent Xicluna  added the comment:

Actually, I discovered "python -m test.pystone" during Mike Müller's talk at
EuroPython. http://is.gd/fasterpy

Even if they are suboptimal for true benchmarks, they should probably be 
mentioned somewhere.
In the same paragraph, there should be a link to the "Grand Unified Python 
Benchmark Suite" as best practice:

http://hg.python.org/benchmarks
http://hg.python.org/benchmarks/file/tip
http://hg.python.org/benchmarks/file/tip/README.txt

The last paragraph of this wiki page might be reworded and included in the 
Python documentation:
http://code.google.com/p/unladen-swallow/wiki/Benchmarks
http://code.google.com/p/unladen-swallow/wiki/Benchmarks#Benchmarks_we_don't_use



BTW, there's also this website, which no longer seems to be updated…
http://speed.python.org/

--
nosy: +jnoller


[issue15369] pybench and test.pystone poorly documented

2012-07-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Brett Cannon wrote:
> 
> Brett Cannon  added the comment:
> 
> I disagree. They are outdated benchmarks and probably should either be 
> removed or left undocumented. Proper testing of performance is with the 
> Unladen Swallow benchmarks.

I disagree with your statement. Just like every benchmark, they serve
their purpose in their particular field of use, e.g. pybench may not
be useful for the JIT approach originally taken by the Unladen Swallow
project, but it's still useful for testing/checking changes in the non-JIT
CPython interpreter, and it's extensible to take new developments
into account. pystone is useful to get a quick feel for the performance
of Python on a machine.

--
nosy: +lemburg


[issue15369] pybench and test.pystone poorly documented

2012-07-17 Thread Brett Cannon

Brett Cannon  added the comment:

I disagree. They are outdated benchmarks and probably should either be removed
or left undocumented. Proper performance testing is done with the Unladen
Swallow benchmarks.

--
nosy: +brett.cannon


[issue15369] pybench and test.pystone poorly documented

2012-07-16 Thread Florent Xicluna

New submission from Florent Xicluna :

The benchmarking tools "pystone" and "pybench", which are shipped with the
standard Python distribution, are not documented.

The only information is in the what's-new for Python 2.5:
http://docs.python.org/dev/whatsnew/2.5.html?highlight=pybench#new-improved-and-removed-modules

IMHO, they should be mentioned somewhere in the HOWTOs, the FAQ, or the
standard library documentation ("Development Tools" or "Debugging and
Profiling").

--
assignee: docs@python
components: Benchmarks, Documentation
messages: 165603
nosy: docs@python, flox
priority: normal
severity: normal
status: open
title: pybench and test.pystone poorly documented
type: behavior
versions: Python 3.3
