[issue29782] Use __builtin_clzl for bits_in_digit if available

2020-06-15 Thread STINNER Victor


STINNER Victor  added the comment:

Thanks Niklas Fiekas for your tenacity :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2020-06-15 Thread STINNER Victor


Change by STINNER Victor :


--
resolution:  -> fixed
status: open -> closed
versions: +Python 3.10 -Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2020-06-15 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 794e7d1ab2d7afe70fe0dd87ca8174ac860413e4 by Niklas Fiekas in 
branch 'master':
bpo-29782: Consolidate _Py_Bit_Length() (GH-20739)
https://github.com/python/cpython/commit/794e7d1ab2d7afe70fe0dd87ca8174ac860413e4


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2020-06-08 Thread STINNER Victor


STINNER Victor  added the comment:

I reopen the issue since PR 20739 was created.

--
resolution: rejected -> 
status: closed -> open

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2020-06-08 Thread Niklas Fiekas


Change by Niklas Fiekas :


--
pull_requests: +19946
pull_request: https://github.com/python/cpython/pull/20739

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-04-22 Thread STINNER Victor

STINNER Victor added the comment:

I'm in favor of documenting in comments decisions to not micro-optimize
such code. I did something similar in ceval.c for 1+1.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-04-22 Thread Louie Lu

Louie Lu added the comment:

If this issue is closed by "not a big performance improvement", maybe the XXX 
in mathmoudle.c should be take off?

"""
/* XXX: This routine does more or less the same thing as
 * bits_in_digit() in Objects/longobject.c.  Someday it would be nice to
 * consolidate them.  On BSD, there's a library function called fls()
 * that we could use, and GCC provides __builtin_clz().
 */
"""

--
nosy: +louielu

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-04-22 Thread Mark Dickinson

Mark Dickinson added the comment:

Closing.

--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-04-21 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

But even 15% speed up in particular microbenchmarks looks too small to me for 
such complex change.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-04-21 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

In some microbenchmarks this can give up to 15%.

$ ./python.patched -m perf timeit -q --compare-to=./python.default -s "a = 
list(map(float, range(1)))" "12345 in a"
Mean +- std dev: [python.default] 1.28 ms +- 0.11 ms -> [python.patched] 1.12 
ms +- 0.07 ms: 1.15x faster (-13%)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-04-21 Thread STINNER Victor

STINNER Victor added the comment:

I concur with Antoine and suggest to reject this issue.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-04-01 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I don't think a 3% improvement on a micro-benchmark is worth it (this will 
translate into 0% improvement on real-world workloads).  Are there other uses 
of _Py_bit_length() that show bigger benefits?

--
nosy: +pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-27 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

There are not good reasons to optimize this case at compile time. This is very 
obscure way of writing True.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-27 Thread STINNER Victor

STINNER Victor added the comment:

> $ ./python -m timeit "12345678 == 12345678.0"
> 500 loops, best of 5: 40 nsec per loop

By the way, I check the bytecode to make sure that the compiler doesn't 
optimize that. I'm suprised that it's not replaced with True!

Is there a reason to perform the test at runtime?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-27 Thread STINNER Victor

STINNER Victor added the comment:

The change is an optimization, so it requires a benchmark, think you provided. 
Ok. But the speedup is a few nanoseconds on a function which takes currently 40 
nanoseconds. It's not really what I would call significant:

$ ./python -m perf compare_to ref.json patch.json  --table
+-+-+-+
| Benchmark   | ref | patch   |
+=+=+=+
| int==float 8 digits | 39.2 ns | 37.9 ns: 1.03x faster (-3%) |
+-+-+-+
| int==float 1 digit  | 38.0 ns | 37.9 ns: 1.00x faster (-0%) |
+-+-+-+
| int.bit_length()| 42.0 ns | 41.2 ns: 1.02x faster (-2%) |
+-+-+-+

(See attached bench_bit_length.py script.)

So I'm not really convinced that the change is useful. Is bit_length() used in 
hot loops?

--
nosy: +haypo
Added file: http://bugs.python.org/file46759/bench_bit_length.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-10 Thread Niklas Fiekas

Niklas Fiekas added the comment:

Oops. Actually clz should commonly be enough. And platforms where clz and clzl 
are different (<=> unsigned int and unsigned long are different) should be rare.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-10 Thread Niklas Fiekas

Niklas Fiekas added the comment:

Thanks for the review.

I pushed a change to check if clz can be used (`sizeof(digit) <= 
sizeof(unsigned int)`). Otherwise use clzl. I believe the latter should be the 
most common, since unsigned long has 32bits. As you say unsigned long long 
should never be needed.

Btw. mathmodule.c currently duplicates the function: 
https://github.com/python/cpython/blob/master/Modules/mathmodule.c#L1317. It 
might be worth factoring it out, but I don't know where to put it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-10 Thread Mark Dickinson

Mark Dickinson added the comment:

True enough. It looks as though someone (*cough*) already documented that 
restriction, too: 
https://github.com/mdickinson/cpython/blob/v3.6.0/Include/longintrepr.h#L28-L30

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-10 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I think we can assume that digit is no larger than unsigned long, otherwise 
PyLong_AsLong() and like wouldn't work.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-10 Thread Mark Dickinson

Mark Dickinson added the comment:

Thanks for the idea, and the PR!

To be useful, this would need a bit of tweaking: we can't assume that a digit 
is always `unsigned long` (in fact, I'd expect that to be rare on 64-bit 
non-Windows systems, where `unsigned long` typically has 64 bits, and `digit` 
should be using a 32-bit type), so we'd need to identify and use the most 
appropriate variant of clz/clzl/clzll somehow.

I think it would also make sense to move the detection of the existence of 
`__builtin_clz` and friends into the configuration machinery: these builtins 
aren't just restricted to GCC (clang supports them as well), and that would 
allow other platforms to provide their own substitutes by modifying `pyport.h`. 
Ideally, the configuration machinery #defines a HAVE_CLZ variable, and then in 
longobject.c we do an #ifdef HAVE_CLZ ...

We also need to trade-off the additional complication in the implementation 
against the benefits: though we do certainly care about the speed, speed at all 
costs isn't the target here; readability, portability and maintainability of 
the source counts, too. It'll probably be easier to weigh those factors once 
there's an implementation that addresses the above points.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-10 Thread Xiang Zhang

Changes by Xiang Zhang :


--
nosy: +mark.dickinson

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-10 Thread Niklas Fiekas

Changes by Niklas Fiekas :


--
pull_requests: +490

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29782] Use __builtin_clzl for bits_in_digit if available

2017-03-10 Thread Niklas Fiekas

New submission from Niklas Fiekas:

Baseline performance (9e6ac83acae):

$ ./python -m timeit "12345678 == 12345678.0"
500 loops, best of 5: 40 nsec per loop
$ ./python -m timeit "1 == 1.0"
1000 loops, best of 5: 38.8 nsec per loop
$ ./python -m timeit "(1234578987654321).bit_length()"
1000 loops, best of 5: 39.4 nsec per loop

Upcoming PR:

$ ./python -m timeit "12345678 == 12345678.0"
1000 loops, best of 5: 34.3 nsec per loop
$ ./python -m timeit "1 == 1.0"
1000 loops, best of 5: 34.4 nsec per loop
$ ./python -m timeit "(1234578987654321).bit_length()"
1000 loops, best of 5: 36.4 nsec per loop

--
components: Interpreter Core
messages: 289353
nosy: niklasf
priority: normal
severity: normal
status: open
title: Use __builtin_clzl for bits_in_digit if available
type: performance
versions: Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com