[issue43040] random.py randrange() is very slow if the range is a power of 2.

2021-01-27 Thread Donovan Baarda


Donovan Baarda  added the comment:

Some points to note;

I first noticed this as an apparently 5x performance regression for randrange() 
between pypy and pypy3. This seemed pretty significant, but admittedly the 
difference is largely masked by other python overheads when comparing python2 
and python3.

When I mentioned random() is faster, I meant that python2's implementation of 
randrange() that uses random() under the hood was noticeably faster than 
python3's randrange() that uses getrandbits() under the hood.

I know getrandbits() is significantly faster than randrange(), but randrange() 
doesn't have to be that bad.

However, I agree that changing repeatability is potentially a big issue.

--

___
Python tracker 
<https://bugs.python.org/issue43040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43040] random.py randrange() is very slow if the range is a power of 2.

2021-01-27 Thread Donovan Baarda


Donovan Baarda  added the comment:

I had a look at the C implementation of getrandbits() and spotted this;

https://github.com/python/cpython/blob/64fc105b2d2faaeadd1026d2417b83915af6622f/Modules/_randommodule.c#L487

In my case I was also being hit by randrange(2**32) calling getrandbits(33) 
just pushing it past using the optimized fast-path for ranges needing <=32 
bits. So not only was it calling getrandbits() 2x as many times as necessary, 
under the hood it was generating 64 random bits each time using a slow path.

--

___
Python tracker 
<https://bugs.python.org/issue43040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43040] random.py randrange() is very slow if the range is a power of 2.

2021-01-27 Thread Donovan Baarda


Donovan Baarda  added the comment:

FTR, more articles about the slowness of generating random integers in Python;

https://lemire.me/blog/2016/03/21/ranged-random-number-generation-is-slow-in-python/

https://www.reddit.com/r/Python/comments/jn0bb/randomrandint_vs_randomrandom_why_is_one_15x/

--

___
Python tracker 
<https://bugs.python.org/issue43040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43040] random.py randrange() is very slow if the range is a power of 2.

2021-01-27 Thread Donovan Baarda

Donovan Baarda  added the comment:

I first noticed the problem when migrating a program doing lots of 
randrange(2**32) calls from python2 (using pypy -O) to python3 (using pypy3 -O) 
on Debian. The time results were;


pypy -O
real    3m58.621s
user    3m58.501s
sys     0m0.085s

pypy3 -O
real    19m57.337s
user    19m57.011s
sys     0m0.131s

So 5x slower. The execution times for python2 and python3 were too long for me 
to wait for (I'm guessing 30x those numbers). I realize pypy and pypy3 are not 
the same as cpython, but they use the same python random.py module, and after 
fiddling around with profiling it running under all of pypy, pypy3, python2, 
and python3 it became apparent that pypy and pypy3 was effectively reducing my 
program to mostly random() (python2) or getrandombits() (python3) calls from 
under randrange(), and python3 (and pypy3) was calling it 2x as many times. The 
general overheads of python2 and python3 made the speed differences harder to 
notice, as the main bottleneck shifted from getrandombits() to 
general-python-interpreting, but they were still there. So I could get a decent 
comparison between python2 and python3 without waiting till my retirement, I 
changed my program to do a fraction of the total work and timed them again, 
getting;

$ time pypy -O ./chunker.py chunker
real    0m2.315s
user    0m2.246s
sys     0m0.074s

$ time pypy3 -O ./chunker.py chunker
real    0m12.922s
user    0m12.850s
sys     0m0.073s

$ time python2 -O ./chunker.py chunker
real    0m59.631s
user    0m59.620s
sys     0m0.018s

$ time python3 -O ./chunker.py chunker
real    1m2.588s
user    1m2.536s
sys     0m0.037s

Some of the speed difference seems to be a bit of python2's random() is a 
little faster than python3's getrandombits(), and maybe python2 int vs python3 
longint speed differences, but the 2x calls seemed to be the main killer.

I also stumbled onto several blogs talking about Python's random number 
generation being slow, including the following where I first spotted the 
problem;

https://eli.thegreenplace.net/2018/slow-and-fast-methods-for-generating-random-integers-in-python/

So it seems other people have noticed this is slow too. After reading this blog 
I switched to just calling getrandbits(32) directly and the timings went to;

$ time pypy -O ./chunker.py chunker
real    0m4.164s
user    0m4.121s
sys     0m0.049s

$ time pypy3 -O ./chunker.py chunker
real    0m4.786s
user    0m4.714s
sys     0m0.076s

$ time python2 -O ./chunker.py chunker
real    0m44.869s
user    0m44.826s
sys     0m0.044s

$ time python3 -O ./chunker.py chunker
real    0m44.018s
user    0m43.998s
sys     0m0.019s

So changing from randrange(2**32) to getrandbits(32) made pypy 0.55x as fast 
(random() vs getrandbits() under the hood), pypy3 2.7x faster, python2 1.3x 
faster and python3 1.4x faster.

Some of that is bypassing the call-layers between getrandbits() and 
randrange(), and profiling tells me the bit_length() call it skips was also 
pretty expensive, but the 2x getrandbits() bit was definitely most of it.

It is interesting random() used by python2 is a bit cheaper than getrandbits() 
too. Perhaps the default should be _randbelow_without_getrandbits()? I guess it 
has more limitations on range etc.

--

___
Python tracker 
<https://bugs.python.org/issue43040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43040] random.py randrange() is very slow if the range is a power of 2.

2021-01-27 Thread Donovan Baarda


Donovan Baarda  added the comment:

Thanks @mark.dickinson for the heads up on that issue. Comments in the code 
hinted that people had tried to tackle this but IMHO they missed the real fix.

I've submitted #24354 on github. It's missing an addition to the NEWS.d because 
I wasn't sure if something this small required it.

I've signed the CLA and it's just waiting to go through.

--

___
Python tracker 
<https://bugs.python.org/issue43040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43040] random.py randrange() is very slow if the range is a power of 2.

2021-01-27 Thread Donovan Baarda


Change by Donovan Baarda :


--
keywords: +patch
pull_requests: +23174
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/24354

___
Python tracker 
<https://bugs.python.org/issue43040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43040] random.py randrange() is very slow if the range is a power of 2.

2021-01-27 Thread Donovan Baarda


New submission from Donovan Baarda :

I encountered a very significant slowdown migrating some code from python2.7 to 
python3.9 that I tracked down to randrange() being slow. After digging I 
noticed that _randbelow_with_getrandbits() calls getrandbits() 2x as many 
times, and asks for one more bit each time, than is actually required if the 
range requested is a power of 2.

I have a GitHub PR on the way to fix this...

--
components: Library (Lib)
messages: 385769
nosy: abo
priority: normal
severity: normal
status: open
title: random.py randrange() is very slow if the range is a power of 2.
type: performance
versions: Python 3.10, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue43040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1767370] Make xmlrpc use HTTP/1.1 and keepalive

2008-07-24 Thread Donovan Baarda

Donovan Baarda [EMAIL PROTECTED] added the comment:

On Tue, July 22, 2008 05:21, Martin v. Löwis wrote:

 Martin v. Löwis [EMAIL PROTECTED] added the comment:

 We would need the copyright holder of the patch to submit a contributor
 form. Would that be possible?

he works for Google ([EMAIL PROTECTED]), and so do I ([EMAIL PROTECTED])... I
think Google has some sort of company wide contributor agreement, and I
faintly recall signing something before I started working at Google...

krabbe also has a more recent version of this patch that we both keep
failing to submit..

Please let me know if he still needs to sign something, and I'll try to
get him to submit his most recent version of this...

 --
 nosy: +loewis

 ___
 Python tracker [EMAIL PROTECTED]
 http://bugs.python.org/issue1767370
 ___


___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1767370
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1767370] Make xmlrpc use HTTP/1.1 and keepalive

2008-05-15 Thread Donovan Baarda

Donovan Baarda [EMAIL PROTECTED] added the comment:

One more time... this time after adding correct email addresses to my 
existing account...

Martín Conte Mac Donell wrote:
 Martín Conte Mac Donell [EMAIL PROTECTED] added the comment:
 
 I made this patch works against trunk, also i'v fixed some typos.

Sorry for not responding to this earlier...

I didn't write this patch, but submitted it on behalf of the original
author (Cc'ed on this email). I spoke to him about getting this patch
updated, and he said that he has made some more improvements and fixes
since I submitted it.

I will try to get him to produce a new patch, but he (and I) are both
very busy on other things, so I suspect you guys will be able to polish
  up what is already there before we manage to pull something together...

I will give him another poke tomorrow...

 --
 nosy: +Reflejo
 versions: +Python 2.5
 Added file: http://bugs.python.org/file10313/xmlrpc-keepalive.diff
 
 _
 Tracker [EMAIL PROTECTED]
 http://bugs.python.org/issue1767370
 _

_
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1767370
_
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com