Re: [Python-Dev] Stop using timeit, use perf.timeit!

2016-06-10 Thread Kevin Modzelewski via Python-Dev
Hi all, I wrote a blog post about this.
http://blog.kevmod.com/2016/06/benchmarking-minimum-vs-average/

We can rule out any argument that one (minimum or average) is strictly
better than the other, since there are cases that make either one better.
It comes down to our expectation of the underlying distribution.

Victor, if you could calculate the sample skewness of your results, I
think that would be very interesting!
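
For reference, here is a minimal sketch of that calculation. It uses the
bucket counts from the histogram quoted below as a stand-in for the raw
samples. That is an assumption on my part: the buckets are rounded to
0.1 ms, so the result only approximates the skewness of the real data.
It computes the moment-based skewness m3 / m2**1.5, and the variable
names are mine, not something taken from perf:

```python
# Bucket value in ms -> number of samples, copied from the quoted histogram.
buckets = {
    26.1: 1, 26.2: 12, 26.3: 34, 26.4: 44, 26.5: 109, 26.6: 117, 26.7: 86,
    26.8: 50, 26.9: 32, 27.0: 10, 27.1: 3, 27.2: 1, 27.3: 1,
}

n = sum(buckets.values())
mean = sum(value * count for value, count in buckets.items()) / n

# Second and third central moments of the bucketed data.
m2 = sum(count * (value - mean) ** 2 for value, count in buckets.items()) / n
m3 = sum(count * (value - mean) ** 3 for value, count in buckets.items()) / n

# Moment-based skewness: > 0 means a heavier right-hand tail.
skewness = m3 / m2 ** 1.5

print(f"n={n} mean={mean:.2f} ms skewness={skewness:.3f}")
```

A positive result means the right-hand tail is heavier than the left; a
value near zero means the distribution is roughly symmetric.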

kmod

On Fri, Jun 10, 2016 at 10:04 AM, Steven D'Aprano wrote:

> On Fri, Jun 10, 2016 at 05:07:18PM +0200, Victor Stinner wrote:
> > I started to work on visualisation. IMHO it helps to understand the
> > problem.
> >
> > Let's create a large dataset: 500 samples (100 processes x 5 samples):
> > ---
> > $ python3 telco.py --json-file=telco.json -p 100 -n 5
> > ---
> >
> > The attached plot.py script creates a histogram:
> > ---
> > avg: 26.7 ms +- 0.2 ms; min = 26.2 ms
> >
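> > 26.1 ms:   1 #
> > 26.2 ms:  12 #####
> > 26.3 ms:  34 ############
> > 26.4 ms:  44 ################
> > 26.5 ms: 109 ######################################
> > 26.6 ms: 117 ########################################
> > 26.7 ms:  86 ##############################
> > 26.8 ms:  50 ##################
> > 26.9 ms:  32 ###########
> > 27.0 ms:  10 ####
> > 27.1 ms:   3 ##
> > 27.2 ms:   1 #
> > 27.3 ms:   1 #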
> >
> > minimum 26.1 ms: 0.2% (1) of 500 samples
> > ---
> [...]
> > The distribution looks like a Gaussian curve:
> > https://en.wikipedia.org/wiki/Gaussian_function
>
> Lots of distributions look a bit Gaussian, but they can be skewed, or
> truncated, or both. E.g. the average life-span of a lightbulb is
> approximately Gaussian with a central peak at some value (let's say 5000
> hours), but while it is conceivable that you might be really lucky and
> find a bulb that lasts 15000 hours, it isn't possible to find one that
> lasts -1 hours. The distribution is truncated on the left.
>
> To me, your graph looks like the distribution is skewed: the right-hand
> tail (shown at the bottom) is longer than the left-hand tail, seven
> buckets compared to five buckets. There are actual statistical tests for
> detecting deviation from Gaussian curves, but I'd have to look them up.
> But as a really quick and dirty test, we can count the number of samples
> on either side of the central peak (the mode):
>
> left: 109+44+34+12+1 = 200
> centre: 117
> right: 500 - 200 - 117 = 183
>
> It certainly looks *close* to Gaussian, but with the crude tests we are
> using, we can't be sure. If you took more and more samples, I would
> expect that the right-hand tail would get longer and longer, but the
> left-hand tail would not.
>
>
> > The interesting thing is that only 1 sample out of 500 is in the minimum
> > bucket (26.1 ms). If you say that the performance is 26.1 ms, only
> > 0.2% of your users will be able to reproduce this timing.
>
> Hmmm. Okay, that is a good point. In this case, you're not so much
> reporting your estimate of what the "true speed" of the code snippet
> would be in the absence of all noise, but your estimate of what your
> users should expect to experience "most of the time".
>
> Assuming they have exactly the same hardware, operating system, and load
> on their system as you have.
>
>
> > The average and std dev are 26.7 ms +- 0.2 ms, i.e. the range 26.5 ms ..
> > 26.9 ms: we got 109+117+86+50+32 samples in this range, which gives us
> > 394/500 = 79%.
> >
> > IMHO saying "26.7 ms +- 0.2 ms" (79% of samples) is less of a lie than
> > 26.1 ms (0.2%).
>
> I think I understand the point you are making. I'll have to think about
> it some more to decide if I agree with you.
>
> But either way, I think the work you have done on perf is fantastic and
> I think this will be a great tool. I really love the histogram. Can you
> draw a histogram of two functions side-by-side, for comparisons?
>
>
> --
> Steve
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/kmod%40dropbox.com
>


Re: [Python-Dev] [Python-checkins] Daily reference leaks (b78574cb00ab): sum=1120

2016-11-19 Thread Kevin Modzelewski via Python-Dev
Hi Yury, you may be interested in some leak-finding code that I wrote for
Pyston.  It uses the GC infrastructure to show you objects that were
directly leaked, ignoring indirect leaks -- i.e. objects that were only
leaked because they were referenced by a leaked object.  It can often give
you a very small list of objects to look into (depending on how many non-gc
objects were leaked).  If you're interested I can try porting it to CPython.
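
To illustrate the direct/indirect distinction, here is a rough Python
sketch. It is not the Pyston code: direct_leaks is a hypothetical helper,
and it assumes you already have a list of leaked objects from some other
tool. It keeps only the objects that no other leaked object refers to,
using gc.get_referents:

```python
import gc


def direct_leaks(leaked):
    """Return the objects in `leaked` that no other object in `leaked` refers to.

    Hypothetical helper for illustration; it only looks one reference deep.
    """
    leaked_ids = {id(obj) for obj in leaked}
    referenced = set()
    for obj in leaked:
        # gc.get_referents lists the objects that `obj` directly refers to.
        for ref in gc.get_referents(obj):
            if ref is not obj and id(ref) in leaked_ids:
                referenced.add(id(ref))
    return [obj for obj in leaked if id(obj) not in referenced]


# Plain containers stand in for leaked objects: `root` keeps `leaf` alive,
# so only `root` counts as a direct leak.
leaf = ["leaf"]
root = [leaf]
roots = direct_leaks([root, leaf])
print(len(roots), roots[0] is root)
```

Objects in a reference cycle all refer to each other, so this naive
version would report none of them; a real tool needs to handle that case
separately.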

https://github.com/dropbox/pyston/blob/master/from_cpython/Modules/gcmodule.c#L894

kmod

On Wed, Nov 9, 2016 at 7:16 AM, Yury Selivanov wrote:

> I'm trying to fix refleaks in 3.6.  So far:
>
> On 2016-11-09 4:02 AM, solip...@pitrou.net wrote:
>
>> results for b78574cb00ab on branch "default"
>>
>> test_ast leaked [98, 98, 98] references, sum=294
>> test_ast leaked [98, 98, 98] memory blocks, sum=294
>> test_asyncio leaked [3, 0, 0] memory blocks, sum=3
>> test_code leaked [2, 2, 2] references, sum=6
>> test_code leaked [2, 2, 2] memory blocks, sum=6
>> test_functools leaked [0, 3, 1] memory blocks, sum=4
>> test_pydoc leaked [106, 106, 106] references, sum=318
>> test_pydoc leaked [42, 42, 42] memory blocks, sum=126
>> test_trace leaked [12, 12, 12] references, sum=36
>> test_trace leaked [11, 11, 11] memory blocks, sum=33
>
> test_ast, test_code and test_trace were fixed by
> https://hg.python.org/cpython/rev/2c6825c9ecfd
>
> test_pydoc leaks in test_typing_pydoc. I tried git bisect and it looks
> like the first commit that introduced the refleak was the one that
> added test_typing_pydoc!
>
> 62127e60e7b0 doesn't modify any CPython internals, so it looks like
> test_typing_pydoc exposed a bug that already existed before it. Any help
> tracking that down is welcome :)
>
> Yury