[issue44931] Add "bidimap" to collections library: a simple bidirectional map

2021-08-17 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

It is part of the Apache Commons Collections library.

--

___
Python tracker 
<https://bugs.python.org/issue44931>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue44931] Add "bidimap" to collections library: a simple bidirectional map

2021-08-17 Thread Jurjen N.E. Bos


Change by Jurjen N.E. Bos :


--
title: Add "bidimap" to collections library -> Add "bidimap" to collections 
library: a simple bidirectional map

___
Python tracker 
<https://bugs.python.org/issue44931>
___



[issue44931] Add "bidimap" to collections library

2021-08-17 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

Give me a shout if you like this: I am happy to write a test suite, make a 
patch, etc.

--

___
Python tracker 
<https://bugs.python.org/issue44931>
___



[issue44931] Add "bidimap" to collections library

2021-08-17 Thread Jurjen N.E. Bos


Change by Jurjen N.E. Bos :


--
type:  -> performance
versions: +Python 3.10

___
Python tracker 
<https://bugs.python.org/issue44931>
___



[issue44931] Add "bidimap" to collections library

2021-08-17 Thread Jurjen N.E. Bos


New submission from Jurjen N.E. Bos :

The Java class "BiDiMap" is very useful and doesn't seem to have an equivalent 
in the Python standard library.
I wrote a proposed class that provides one; the attached file is a simple 
implementation that could be used as a starting point.
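For reference, the core idea can be sketched as two dicts kept in sync, giving O(1) lookup in both directions. This is an illustrative sketch with made-up names, not the attached bidimap.py:

```python
class BidiMap:
    """Two dicts kept in sync: O(1) lookup in both directions.
    Inserting a pair evicts any stale pairings, keeping the map one-to-one."""

    def __init__(self):
        self._fwd = {}   # key -> value
        self._inv = {}   # value -> key

    def __setitem__(self, key, value):
        # drop any existing pairings involving key or value,
        # so both directions stay strictly one-to-one
        if key in self._fwd:
            del self._inv[self._fwd[key]]
        if value in self._inv:
            del self._fwd[self._inv[value]]
        self._fwd[key] = value
        self._inv[value] = key

    def __getitem__(self, key):
        return self._fwd[key]

    def inverse(self, value):
        return self._inv[value]

    def __len__(self):
        return len(self._fwd)
```

The eviction rule in __setitem__ is the main design decision: without it, two keys could map to the same value and the inverse direction would silently lose information.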

--
files: bidimap.py
hgrepos: 408
messages: 399710
nosy: jneb
priority: normal
severity: normal
status: open
title: Add "bidimap" to collections library
Added file: https://bugs.python.org/file50225/bidimap.py

___
Python tracker 
<https://bugs.python.org/issue44931>
___



[issue42911] Addition chains for pow saves 5-20% time for pow(int, int)

2021-01-21 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

...not to mention the new gcd and lcm functions, the fact that the number 
conversion is linear for power-of-two bases, and negative powers modulo a 
prime number!

--

___
Python tracker 
<https://bugs.python.org/issue42911>
___



[issue42911] Addition chains for pow saves 5-20% time for pow(int, int)

2021-01-21 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

Well, I would argue that there is already quite a bit of work going into 
crypto-sized computations in the integer code, as well as the crypto-oriented 
.bit_count() function that was recently added.

For starters, the arguably crypto-oriented three-argument pow() has been there 
since Python 0.1 already, where I used it :-).
There's Karatsuba multiplication, five-ary powering, and quite a few 
optimizations on the speed of the number conversion.

And then of course the incredible implementation of Decimal, which does include 
a subquadratic division. I would say this would fit there.

And maybe I'll make a subquadratic division for ints someday...

Tim, your vote please...

--

___
Python tracker 
<https://bugs.python.org/issue42911>
___



[issue42911] Addition chains for pow saves 5-20% time for pow(int, int)

2021-01-15 Thread Jurjen N.E. Bos


Change by Jurjen N.E. Bos :


--
title: Addition chains for pow saves 10 % time! -> Addition chains for pow 
saves 5-20% time for pow(int,int)

___
Python tracker 
<https://bugs.python.org/issue42911>
___



[issue42911] Addition chains for pow saves 10 % time!

2021-01-15 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

Well, measurements show that it saves more time than I thought (sometimes over 
20%!), because there are other ways in which it saves time; I am quite happy 
with that.

In the code I needed functions _Py_bit_length64 and _Py_bit_count64.
I thought these would be better moved to the bit utilities, but I am not sure 
about a good name to use, since there are probably other places where these are 
used, too (I know of at least one in hashtable.c).
The 32-bit versions in the bit utilities are called _Py_bit_length and 
_Py_popcount32 (not the most logical names).
So it would be more logical to give all four of these consistent names, 
everywhere. But that's probably better done at a later time, right?

--

___
Python tracker 
<https://bugs.python.org/issue42911>
___



[issue42911] Addition chains for pow saves 10 % time!

2021-01-13 Thread Jurjen N.E. Bos


Change by Jurjen N.E. Bos :


--
keywords: +patch
pull_requests: +23032
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/24206

___
Python tracker 
<https://bugs.python.org/issue42911>
___



[issue42911] Addition chains for pow saves 10 % time!

2021-01-12 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

Some more information for the interested:
The algorithm I made tries to smoothly change the "chunk size" with the growing 
length of the exponent. So the exponents that win the most (up to 14% fewer 
multiplications) are the long exponents that are just shorter than the 
FIVEARY_CUTOFF.
But, I worked hard to make an algorithm that also saves multiplications for 
shorter numbers. Numbers of up to about 20 bits will be using the optimal chunk 
size.
And, of course, the decision must be made quickly, because for some frequently 
occurring parameters (e.g., 3**25) the routine doesn't take long anyway.
This is obtained by checking two properties of the exponent that strongly 
influence the addition chain: the highest four bits, and (surprise!) the number 
of pairs of bits at distance 2, in other words (n>>2).bit_count().
After days of trying out all kinds of heuristics, and days of crunching, I 
measured the optimal parameters. I added the code I used to do that.
Guido may remember that I wrote a chapter in my Ph.D. on the subject of 
addition chains. The interesting thing is that I then used Python for that too: 
that was almost 30 years ago!
When implementing, I discovered that lots of the code around it had been in 
flux, so I didn't manage to "cherry-pick" it into 3.10 yet. (One example: the 
bit_length and bit_count routines were renamed and moved around.) And I don't 
have Windows 10 :-(
But anyway, have a look and let me hear what you think of it. I'll also want to 
test and measure it a bit more, but I am sure it is quite stable.

--
Added file: https://bugs.python.org/file49738/longobject.py

___
Python tracker 
<https://bugs.python.org/issue42911>
___



[issue42911] Addition chains for pow saves 10 % time!

2021-01-12 Thread Jurjen N.E. Bos


New submission from Jurjen N.E. Bos :

When looking at the code of pow() with integer exponent, I noticed there is a 
hard boundary between the binary and "fiveary" (actually 32-ary) computations. 
Also, the fiveary wasn't really optimal.

So I wrote a proof-of-concept version of long_pow that dynamically uses 
addition chains!
It saves over 10% of multiplications for exponents from 20 to a few hundred 
bits, and then the savings go down to a few percent for very long numbers. It 
does not take much more memory or time for any argument combination I checked.
I tested it on 3.8rc1, but I am planning to port it to 3.10. This is a bit 
difficult, since *lots* of code around it changed, and I only have Windows 7. 
However, I'll keep you posted.
See https://github.com/jneb/cpython/tree/38_fast_pow
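For context, the existing "fiveary" path processes the exponent in fixed 5-bit chunks (hence 32-ary). A rough pure-Python sketch of that fixed-window method follows; the names are made up for illustration, and this is not the proposed addition-chain code:

```python
def window_pow(base, exp, mod, k=5):
    """Fixed 2**k-ary modular exponentiation (a sketch of the idea behind
    CPython's FIVEARY path, not its actual C implementation)."""
    assert exp >= 0 and mod > 1
    # precompute base**0 .. base**(2**k - 1), all reduced mod mod
    table = [1] * (1 << k)
    for i in range(1, 1 << k):
        table[i] = table[i - 1] * base % mod
    result = 1
    nchunks = -(-exp.bit_length() // k)   # ceiling division
    for chunk in reversed(range(nchunks)):
        for _ in range(k):                # shift the accumulator k bits left
            result = result * result % mod
        digit = (exp >> (chunk * k)) & ((1 << k) - 1)
        if digit:
            result = result * table[digit] % mod
    return result
```

The "hard boundary" the report mentions is that the real code switches from plain binary powering to this windowed scheme only past a fixed cutoff; the addition-chain approach instead adapts the chunking to each exponent.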

--
components: Interpreter Core
files: longobject.c
messages: 384949
nosy: jneb
priority: normal
severity: normal
status: open
title: Addition chains for pow saves 10 % time!
type: performance
versions: Python 3.10
Added file: https://bugs.python.org/file49737/longobject.c

___
Python tracker 
<https://bugs.python.org/issue42911>
___



[issue42673] Optimize round_size for rehashing

2020-12-29 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

Harsh, but fair.
I'll do a better job next time!

Op di 29 dec. 2020 13:42 schreef Serhiy Storchaka :

>
> Serhiy Storchaka  added the comment:
>
> Since no benchmarking data was provided, I suggest to close this issue. We
> do not accept optimization changes without evidences of performance boost.
>
> --
> status: open -> pending
>
> ___
> Python tracker 
> <https://bugs.python.org/issue42673>
> ___
>

--
status: pending -> open

___
Python tracker 
<https://bugs.python.org/issue42673>
___



[issue42673] Optimize round_size for rehashing

2020-12-18 Thread Jurjen N.E. Bos


Change by Jurjen N.E. Bos :


--
keywords: +patch
pull_requests: +22692
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/23833

___
Python tracker 
<https://bugs.python.org/issue42673>
___



[issue42673] Optimize round_size for rehashing

2020-12-18 Thread Jurjen N.E. Bos


New submission from Jurjen N.E. Bos :

There's a trivial optimization for round_size in hashtable.c:
a loop is used to compute the lowest power of two >= s,
while this can be done in one step with bit_length.
I am making a pull request for this.
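The two computations are equivalent; here is a pure-Python sketch (round_size itself is C code in hashtable.c, so these names are illustrative):

```python
def round_size_loop(s):
    """Current approach (sketch): keep doubling until >= s."""
    size = 1
    while size < s:
        size <<= 1
    return size

def round_size_bit_length(s):
    """Proposed approach (sketch): one step via int.bit_length().
    (s - 1).bit_length() is the number of bits needed for s - 1,
    so shifting 1 by it gives the smallest power of two >= s."""
    return 1 << (s - 1).bit_length() if s > 1 else 1
```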

--
components: Interpreter Core
messages: 383291
nosy: jneb
priority: normal
severity: normal
status: open
title: Optimize round_size for rehashing
type: performance
versions: Python 3.10

___
Python tracker 
<https://bugs.python.org/issue42673>
___



[issue24925] Allow doctest to find line number of __test__ strings if formatted as a triple quoted string.

2019-12-31 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

All is fixed. The PR apparently is correct.
My first dent in the Python universe, however small :-)

--

___
Python tracker 
<https://bugs.python.org/issue24925>
___



[issue24925] Allow doctest to find line number of __test__ strings if formatted as a triple quoted string.

2019-12-10 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

I tried to make a pull request, but it fails on the format of the news file 
name. At least the tests all pass.

--

___
Python tracker 
<https://bugs.python.org/issue24925>
___



[issue35880] math.sin has no backward error; this isn't documented

2019-03-08 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

I stand corrected; more on that later.

"backward error" is the mathematical term used for the accuracy of a
function. (Forward error is in the result proper; backward error means that you 
calculate the correct result for a number that is very close to the input.)
Since pi is not a machine-representable number, it is pretty hard to
implement the trig functions with zero backward error, since you need to 
divide by 2*pi in any reasonable implementation. For some reason, I was under 
the impression that the backward error of the sine was zero.

I wrote a program to demonstrate the matter, only to find out that I was wrong 
:P
Maybe in the 32-bit version, but not in the 64-bit one? Anyway, it is more 
implementation-dependent than I thought.

Although the backward error of the builtin sine function isn't zero, it is 
still a cool 21 digits, as the program shows.
- Jurjen

--
resolution:  -> rejected
status: pending -> open
Added file: https://bugs.python.org/file48199/sindemo.py

___
Python tracker 
<https://bugs.python.org/issue35880>
___



[issue35880] math.sin has no backward error; this isn't documented

2019-02-01 Thread Jurjen N.E. Bos


New submission from Jurjen N.E. Bos :

The documentation of math.sin (and related trig functions) doesn't mention 
backward error.
In CPython, as far as I can see, there is no backward error at all, which is 
quite uncommon.
This may vary between implementations; many math libraries of other languages 
have a backward error, resulting in large errors for large arguments.
E.g. sin(1<<500) is correctly computed as 0.42925739234242827, whereas a 
backward error as small as 1e-150 can give a completely wrong result.

Some text could be added (which I am happy to produce) that explains what 
backward error means, and under which circumstances you can expect an accurate 
result.
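A quick way to try the example (the printed digits depend on the platform's 
libm argument reduction, so treat the cited value as what the reporter's 
CPython produced):

```python
import math

# 1 << 500 is a 501-bit integer; the nearest double is about 3.27e150,
# well inside float range, so math.sin accepts it after conversion to float.
x = 1 << 500
print(math.sin(x))   # the report cites 0.42925739234242827
```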

--
assignee: docs@python
components: Documentation
messages: 334672
nosy: docs@python, jneb
priority: normal
severity: normal
status: open
title: math.sin has no backward error; this isn't documented
versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue35880>
___



[issue24925] Allow doctest to find line number of __test__ strings if formatted as a triple quoted string.

2019-01-22 Thread Jurjen N.E. Bos


Jurjen N.E. Bos  added the comment:

Yes. That would make me happy. In the meantime I learned how to use git, so 
maybe I'll do the pull request myself next time. Thanks for the work.

--

___
Python tracker 
<https://bugs.python.org/issue24925>
___



[issue24925] Allow doctest to find line number of __test__ strings if formatted as a triple quoted string.

2018-03-13 Thread Jurjen N.E. Bos

Jurjen N.E. Bos <j...@users.sourceforge.net> added the comment:

Oh wait, I already posted that so long ago I forgot about it.
Oops.
Sorry about the repetition, folks.
- Jurjen

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue24925>
___



[issue24925] Allow doctest to find line number of __test__ strings if formatted as a triple quoted string.

2018-03-13 Thread Jurjen N.E. Bos

Jurjen N.E. Bos <j...@users.sourceforge.net> added the comment:

I always use the following piece of code, which can be trivially used to patch 
the source.
It basically looks for a unique line in the test string and finds it in the 
source file. If there is a match, we know where we are. Otherwise it falls back 
to the "normal" way.

--
Added file: https://bugs.python.org/file47482/doctestfix.py

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue24925>
___



[issue26256] Fast decimalisation and conversion to other bases

2016-02-03 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

That reference you gave says that the binary version is faster than the Python 
version, but here the _complexity_ actually changed by a lot.
Only people who know the library by name will notice that it is this fast.
But you are right, for 99% of the people it doesn't make much of a difference.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26256>
___



[issue26256] Fast decimalisation and conversion to other bases

2016-02-03 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

OMG, is decimal that fast?
Maybe I should change the issue then to "documentation missing": the 
documentation nowhere says that decimal has optimized multiprecision 
computations. It only says that precision "can be as large as needed for a 
given problem", but I never realized that that included millions of digits.
To my complete surprise, it computed 2**2**29 to full precision (161 million 
decimal digits) in about a minute. That smells suspiciously like FFT-style 
multiplication, which implies that it is way more sophisticated than the 
integer multiplication!
I suggest that the documentation of the decimal module recommend using decimal 
for multiprecision computations, as long as you use the builtin version.
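A tiny illustration of exact big-integer arithmetic in decimal (the precision here is minuscule compared to the millions of digits above; the fast large-number arithmetic comes from libmpdec, the C implementation behind the builtin decimal):

```python
from decimal import Decimal, getcontext

# prec is the number of significant decimal digits the context carries;
# 2**100 has 31 digits, so with prec = 50 the power is computed exactly.
getcontext().prec = 50
x = Decimal(2) ** 100
print(x)   # 1267650600228229401496703205376
```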

--
assignee:  -> docs@python
components: +Documentation -Library (Lib)
nosy: +docs@python
versions: +Python 3.4 -Python 2.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26256>
___



[issue26256] Fast decimalisation and conversion to other bases

2016-02-01 Thread Jurjen N.E. Bos

New submission from Jurjen N.E. Bos:

Inspired by the recently discovered 49th Mersenne prime number, I wrote a 
module to do high-speed long/int-to-decimal conversion. I can now output the 
new Mersenne number in 18.5 minutes (instead of several hours) on my machine.
For numbers longer than about 10 bits, this routine is faster than 
str(number), thanks to the Karatsuba multiplication in CPython.
The module supports all number bases 2 through 36, and is written in pure 
Python (both 2 and 3).
There is a simple way to save more time by reusing the conversion object 
(saving about half the time for later calls).
My suggestion is to incorporate this into some library, since Python still 
lacks a routine to convert to any number base. Ideally, it could be 
incorporated in the builtin str function, but this would need more work. 
When converting to C, it is recommended to optimize bases 4 and 32 the same way 
as oct, hex and bin do (which isn't easily accessible from Python).

Hope you like it. At least, it was a lot of fun to write...
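The core trick — recursive splitting so that the big multiplications and divisions happen on balanced halves, where Karatsuba pays off — can be sketched like this (illustrative names, not the attached fastStr.py; negative numbers and base validation are omitted):

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base(n, base=10):
    """Convert a non-negative int to a digit string in any base 2..36
    by divide and conquer."""
    if n < base:
        return DIGITS[n]
    # find base**k with roughly half the digits of n (k a power of two)
    k = 1
    while base ** (2 * k) <= n:
        k *= 2
    hi, lo = divmod(n, base ** k)
    # the low half must be zero-padded to exactly k digits
    return to_base(hi, base) + to_base(lo, base).rjust(k, DIGITS[0])
```

Reusing the powers base**k across calls (as the module's conversion object does) is where the additional factor-of-two saving comes from.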


--
components: Library (Lib)
files: fastStr.py
messages: 259317
nosy: jneb
priority: normal
severity: normal
status: open
title: Fast decimalisation and conversion to other bases
type: enhancement
versions: Python 2.7
Added file: http://bugs.python.org/file41770/fastStr.py

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26256>
___



[issue26256] Fast decimalisation and conversion to other bases

2016-02-01 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

Thanks for the tip. I'll consider making it a recipe.
- Jurjen

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26256>
___



[issue24925] Allow doctest to find line number of __test__ strings if formatted as a triple quoted string.

2015-10-22 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

Oops. Lousy editing, I should have used return instead of break.

These 6 lines can be inserted in Lib/doctest.py at line 1100:
+if isinstance(obj, str) and source_lines is not None:
+    # This will find __test__ string doctests if and only if the string
+    # contains any unique line.
+    for offset, line in enumerate(obj.splitlines(keepends=True)):
+        if source_lines.count(line) == 1:
+            return source_lines.index(line) - offset

And it works fine for me; the code is quite general and has predictable 
behaviour: any test string with a unique line in it will work.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24925>
___



[issue24925] Allow doctest to find line number of __test__ strings if formatted as a triple quoted string.

2015-10-22 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

I am not as good in making nice patches, but that code can be improved upon a 
bit as follows:


+if isinstance(obj, str) and source_lines is not None:
+    # This will find __test__ string doctests if and only if the string
+    # contains any unique line.
+    for offset, line in enumerate(obj.splitlines(keepends=True)):
+        if source_lines.count(line) == 1:
+            lineno = source_lines.index(line) - offset
+            break

     # We couldn't find the line number.
     return None


I think this will improve legibility and probably also speed under most 
circumstances.

--
nosy: +jneb

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24925>
___



[issue25215] Simple extension to iter(): iter() returns empty generator

2015-09-22 Thread Jurjen N.E. Bos

New submission from Jurjen N.E. Bos:

When looking for a "neat" way to create an empty generator, I saw on 
Stack Overflow that the crowd wasn't sure what the "least ugly" way to do it 
was. Proposals were:
def emptyIter(): return; yield
or
def emptyIter(): return iter([])

Then it struck me that a trivial extension to the iter() built-in would be to 
allow calling it without arguments, thus giving a simple-to-understand empty 
iterator, and allowing:
def emptyIter(): return iter()
(And, of course, this function would not need to exist in any reasonable 
program in that case.)

The implementation would be trivial, I assume.
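For comparison, the spellings that exist today can be demonstrated directly (the proposal would make a bare iter() a third spelling):

```python
# An empty iterator from an empty container:
it = iter(())

def empty_iter():
    return               # bare return ends the generator immediately
    yield                # unreachable; only makes this a generator function

assert next(it, 'exhausted') == 'exhausted'
assert list(empty_iter()) == []
```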

--
components: Library (Lib)
messages: 251324
nosy: jneb
priority: normal
severity: normal
status: open
title: Simple extension to iter(): iter() returns empty generator
type: enhancement
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25215>
___



[issue25049] Strange behavior under doctest: staticmethods have different __globals__

2015-09-10 Thread Jurjen N.E. Bos

New submission from Jurjen N.E. Bos:

I found a pretty obscure bug/documentation issue.
It only happens if you use global variables in static methods under doctest.
The thing is that this code fails under doctest, while anyone would expect it 
to work at first sight:

class foo:
    @staticmethod
    def bar():
        """
        >>> GLOBAL = 5; foo.bar()
        5
        """
        print(GLOBAL)

The cause of the problem is that the static method has a __globals__ that, for 
reasons beyond my understanding, is not equal to the globals when run under 
doctest.
This might be a doctest bug, or there might be a reason for it that I don't 
get: then it must be documented, before it stupefies others.
The behaviour is the same under Python 2 and 3.
The attached file shows it all (written for Python 3, but can be trivially 
adapted for Python 2), including a workaround.

--
components: Library (Lib)
files: t.py
messages: 250355
nosy: jneb
priority: normal
severity: normal
status: open
title: Strange behavior under doctest: staticmethods have different __globals__
type: enhancement
Added file: http://bugs.python.org/file40422/t.py

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25049>
___



[issue21234] __contains__ and friends should check is for all elements first

2014-04-16 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

Oops. That was a hard lesson: 1) don't be hasty when posting, 2) always run 
what you post.

The point was the trick of defining a custom __ne__ and not an __eq__ for an 
object (not for the container it is in!) so you can use `in` at full speed. Then
not all(map(ne, repeat(obj), container))
or
not all(map(obj.__ne__, container))
can be used if you really want to check for equality. This does make a 
difference in my case, where I only sometimes check for a non-identical object 
in the container, and I know when I do that.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21234
___



[issue21234] __contains__ and friends should check is for all elements first

2014-04-15 Thread Jurjen N.E. Bos

New submission from Jurjen N.E. Bos:

It all started when adding an __eq__ method to a self-made class.
The effect was that my program slowed down enormously, which came as a 
surprise.

This is because, when doing an operation that does a linear search through a 
container, the interpreter checks all items one by one, first checking identity 
and then equality.
If the __eq__ method is slow, this is slower than needed.

In Python, you effectively get:

class myContainer:
    def __contains__(self, obj):
        for i in range(len(self)):
            if self[i] is obj: return True
            if self[i] == obj: return True
        return False

My proposal is to change this to:

class myContainer:
    def __contains__(self, obj):
        for i in range(len(self)):
            if self[i] is obj: return True
        for i in range(len(self)):
            if self[i] == obj: return True
        return False

The net effect would be approximately:
- if the object is exactly in the container, the speedup is significant.
- if the object is not in the container, there is no difference.
- if an object is in the container that is equal to the object, it will be 
slower, but not very much.

In the normal use case, this will probably feel to the user like a speed 
improvement, especially when __eq__ is slow.

I tried to find cases in which this would change the behaviour of programs, 
but I couldn't find any. If this _does_ change behaviour in some noticeable 
and unwanted way, let me know! (Documenting it would also be a good idea, then.)

The accompanying file gives some timings to show what I mean, e.g.
Time in us to find an exact match (begin, middle, end):
0.042335559340708886 31.610660936887758 62.69573781716389
Time in us to find an equal match (begin, middle, end):
0.3730294263293299 31.421928646805195 63.177373531221896
Time if not found:
63.44531546956001
And now for an object that has no __eq__:
Time in us to find a thing (begin, middle, end):
0.03555453901338268 9.878883646121661 19.656711762284473
Time if not found:
19.676395048315776

(Python 2.7 does something completely different with objects without __eq__, 
so the test gives quite different results.)

--
components: Interpreter Core
files: containsDemo.py
messages: 216286
nosy: jneb
priority: normal
severity: normal
status: open
title: __contains__ and friends should check is for all elements first
type: performance
versions: Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5
Added file: http://bugs.python.org/file34867/containsDemo.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21234
___



[issue21234] __contains__ and friends should check is for all elements first

2014-04-15 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

Well, I partially agree. I see the following points:
Against my proposal:
- For *very* big containers, it can be slower in the case the object is early 
in the container, as you pointed out.
- Current behaviour is easier to understand.

OTOH, for the proposal:
- Such giant containers are not efficient anyway; that's where set/Counter can 
help.
- The documentation doesn't promise anywhere that the objects are scanned in 
order.

Anyway, if this is supposed to be the behaviour, I suggest documenting it, and 
adding the following recipe for people dealing with the same problem as I had:

from operator import ne
from itertools import repeat

class MyContainer:
    """Container allowing equality search with containsEqual,
    while allowing fast identity search with __contains__:
    use `obj in c` to test if obj exactly sits in c,
    use `c.containsEqual(obj)` to test if an object in c compares equal to obj.
    """
    def containsEqual(self, object):
        return not all(map(ne, repeat(object), self))

    def __ne__(self, object):
        """Your not-equal test"""

If you see a more elegant equivalent recipe, feel free to add it.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21234
___



[issue20853] pdb args crashes when an arg is not printable

2014-03-17 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

I did figure it out.
It almost works, except when an argument has lost its value and the same name 
exists in the global context.
To be more specific: I simplified do_args to the following code (it's 
obviously ugly, explicitly evaluating repr in context, but that is not the 
point):

def do_args(self, arg):
    """a(rgs)
    Print the argument list of the current function.
    Modified by Jurjen
    """
    co = self.curframe.f_code
    n = co.co_argcount
    if co.co_flags & 4: n = n + 1
    if co.co_flags & 8: n = n + 1
    for i in range(n):
        name = co.co_varnames[i]
        expr = 'repr(%s)' % (name,)
        self.message('%s = %s' % (name, self._getval_except(expr)))

And it works perfectly, except for this little surprise:

>>> bar = 'BAR'
>>> def f(bar):
...     del bar
...     return 5
>>> pdb.runcall(f, 10)
> <stdin>(2)f()
-> del bar
(Pdb) a
bar = 5
(Pdb) n
> <stdin>(3)f()
-> return 5
(Pdb) a
bar = 'BAR'  # Huh? Expected undefined

I'll leave it to the experts to patch this in proper Python coding style.
So, the conclusion is we need a way to safely evaluate the call to repr() in 
context, with self.curframe_locals[co.co_varnames[i]] as argument.

I did not find a good supporting routine for that elsewhere in pdb.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20853
___



[issue20959] print gives wrong error when printing *generator

2014-03-17 Thread Jurjen N.E. Bos

New submission from Jurjen N.E. Bos:

One of the more interesting ways to use print is printing the output of a 
generator, as print(*generator()).
But if the generator raises a TypeError, you get a very unhelpful error 
message:
>>> # the way it works OK
>>> def f(): yield 'a'+'b'
...
>>> print(*f())
ab
>>> # Now with a type error
>>> def f(): yield 'a'+5
...
>>> print(*f())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be a sequence, not generator

The problem is twofold:
- the message is plainly wrong, since it does work with a generator
- the actual error is hidden from view
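As a workaround sketch (not a fix to print itself), forcing the generator before unpacking, e.g. with list(), lets the real exception propagate instead of the misleading unpacking message:

```python
def f():
    yield 'a' + 5  # blows up only when the generator is advanced

# Consuming the generator directly surfaces the real failure:
try:
    list(f())
except TypeError as exc:
    print('real error:', exc)
```

This makes it clear the TypeError comes from the generator body, not from print's argument handling.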

--
components: IO
messages: 213869
nosy: jneb
priority: normal
severity: normal
status: open
title: print gives wrong error when printing *generator
type: behavior
versions: Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20959
___



[issue20853] pdb args crashes when an arg is not printable

2014-03-14 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

Maybe we could use Pdb._getval_except(arg, frame=None) in the routine do_args.
If I understand the code, do_args does quite some work to get the value of name 
in the context of the current frame, maybe just calling
self._getval_except(name, frame=self.curframe)
plus or minus some code would do the job?
I guess the code would actually become shorter...
I'll try to figure it out.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20853
___



[issue20853] pdb args crashes when an arg is not printable

2014-03-12 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

Thanks for your reaction.
The object is not printable, since I was debugging the __init__ of an object, 
and some fields were being initialized:
class foo:
    def __init__(self):
        foo.bar = "hello"
    def __repr__(self): return foo.bar
I tried to make a usable patch file (with neater layout and using Exception), 
see the attachment.

--
keywords: +patch
Added file: http://bugs.python.org/file34369/pdb.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20853
___



[issue20853] pdb args crashes when an arg is not printable

2014-03-12 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

Oops. Here is the correct example:
>>> class foo:
...   def __init__(self):
...     foo.bar = "hello"
...   def __repr__(self): return foo.bar
...
>>> pdb.runcall(foo)
> <stdin>(3)__init__()
(Pdb) a
Traceback (most recent call last):
  File ".\pdb.py", line 1132, in do_args
    self.message('%s = %r' % (name, dict[name]))
  File "<stdin>", line 4, in __repr__
AttributeError: type object 'foo' has no attribute 'bar'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".\pdb.py", line 1580, in runcall
    return Pdb().runcall(*args, **kwds)
  File "C:\Python33\lib\bdb.py", line 439, in runcall
    res = func(*args, **kwds)
  File "<stdin>", line 3, in __init__
  File "<stdin>", line 3, in __init__
  File "C:\Python33\lib\bdb.py", line 47, in trace_dispatch
    return self.dispatch_line(frame)
  File "C:\Python33\lib\bdb.py", line 65, in dispatch_line
    self.user_line(frame)
  File ".\pdb.py", line 266, in user_line
    self.interaction(frame, None)
  File ".\pdb.py", line 345, in interaction
    self._cmdloop()
  File ".\pdb.py", line 318, in _cmdloop
    self.cmdloop()
  File "C:\Python33\lib\cmd.py", line 138, in cmdloop
    stop = self.onecmd(line)
  File ".\pdb.py", line 411, in onecmd
    return cmd.Cmd.onecmd(self, line)
  File "C:\Python33\lib\cmd.py", line 217, in onecmd
    return func(arg)
  File ".\pdb.py", line 1134, in do_args
    self.message('%s = *** repr failed: %s ***' % (name,))
TypeError: not enough arguments for format string

At the very least I expect pdb not to crash, but a clearer error (as in the 
patch) would be nice to have.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20853
___



[issue20853] pdb args crashes when an arg is not printable

2014-03-12 Thread Jurjen N.E. Bos

Jurjen N.E. Bos added the comment:

I am not good at this. Sorry for the mess.
Here is a good example, and a good patch:
>>> class foo:
...   def __init__(self):
...     foo.bar = "hello"
...   def __repr__(self): return foo.bar
...
>>> pdb.runcall(foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'pdb' is not defined
>>> import pdb
>>> pdb.runcall(foo)
> <stdin>(3)__init__()
(Pdb) a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33\lib\pdb.py", line 1577, in runcall
    return Pdb().runcall(*args, **kwds)
  File "C:\Python33\lib\bdb.py", line 439, in runcall
    res = func(*args, **kwds)
  File "<stdin>", line 3, in __init__
  File "<stdin>", line 3, in __init__
  File "C:\Python33\lib\bdb.py", line 47, in trace_dispatch
    return self.dispatch_line(frame)
  File "C:\Python33\lib\bdb.py", line 65, in dispatch_line
    self.user_line(frame)
  File "C:\Python33\lib\pdb.py", line 266, in user_line
    self.interaction(frame, None)
  File "C:\Python33\lib\pdb.py", line 345, in interaction
    self._cmdloop()
  File "C:\Python33\lib\pdb.py", line 318, in _cmdloop
    self.cmdloop()
  File "C:\Python33\lib\cmd.py", line 138, in cmdloop
    stop = self.onecmd(line)
  File "C:\Python33\lib\pdb.py", line 411, in onecmd
    return cmd.Cmd.onecmd(self, line)
  File "C:\Python33\lib\cmd.py", line 217, in onecmd
    return func(arg)
  File "C:\Python33\lib\pdb.py", line 1131, in do_args
    self.message('%s = %r' % (name, dict[name]))
  File "<stdin>", line 4, in __repr__
AttributeError: type object 'foo' has no attribute 'bar'

--
Added file: http://bugs.python.org/file34370/pdb.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20853
___



[issue20853] pdb args crashes when an arg is not printable

2014-03-05 Thread Jurjen N.E. Bos

New submission from Jurjen N.E. Bos:

The args command in pdb crashes when an argument cannot be printed.
Fortunately, this is easy to fix.

For version 3.3.3:
In function Pdb.do_args (lib/pdb.py, line 1120)
Change line 1131
  self.message('%s = %r' % (name, dict[name]))
to
  try: r = repr(dict[name])
  except: r = "(Cannot print object)"
  self.message('%s = %s' % (name, r))
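The guard can be exercised on its own; here is a self-contained sketch (the class and function names are mine, for illustration only):

```python
class Unprintable:
    def __repr__(self):
        raise AttributeError('no repr available yet')

def describe(name, value):
    # Same idea as the patch: never let a failing __repr__ crash the caller.
    try:
        r = repr(value)
    except Exception:
        r = "(Cannot print object)"
    return '%s = %s' % (name, r)

print(describe('x', 42))             # x = 42
print(describe('b', Unprintable()))  # b = (Cannot print object)
```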

--
components: Library (Lib)
messages: 212759
nosy: jneb
priority: normal
severity: normal
status: open
title: pdb args crashes when an arg is not printable
type: enhancement
versions: Python 2.7, Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20853
___



[issue19993] Pool.imap doesn't work as advertised

2013-12-16 Thread Jurjen N.E. Bos

New submission from Jurjen N.E. Bos:

The pool.imap and pool.imap_unordered functions are documented as a lazy 
version of Pool.map.
In fact, they aren't: they consume the iterator argument as a whole. This is 
almost certainly not what the user wants: it uses unnecessary memory and will 
be slower than expected if the output iterator isn't consumed in full. In fact, 
there isn't much use at all of imap over map at the moment.
I tried to fix the code myself, but due to the two-level queueing of the 
input arguments this is not trivial.
Stack Overflow's Blckknght wrote a simplified solution that gives an idea of 
how it should work.
Since that wasn't posted here, I thought it would be useful to put it here, 
even if only for documentation purposes.
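For documentation purposes, here is one way a genuinely lazy imap can be shaped. This is my own sketch, using threads rather than processes and a bounded in-flight window, so the input iterator is consumed only a little ahead of the results being yielded:

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def lazy_imap(func, iterable, workers=4, window=8):
    # Keep at most `window` tasks in flight; each yielded result
    # pulls at most one more item from the input iterator.
    with ThreadPoolExecutor(max_workers=workers) as ex:
        it = iter(iterable)
        pending = deque()
        for _ in range(window):
            try:
                pending.append(ex.submit(func, next(it)))
            except StopIteration:
                break
        while pending:
            result = pending.popleft().result()
            try:
                pending.append(ex.submit(func, next(it)))
            except StopIteration:
                pass
            yield result

consumed = []
def gen():
    for i in range(100):
        consumed.append(i)
        yield i

out = lazy_imap(lambda x: x * x, gen(), workers=2, window=4)
first = next(out)  # only a handful of inputs consumed, not all 100
```

The two-level queueing in Pool makes the real fix harder than this, but the windowing idea is the core of it.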

--
components: Library (Lib)
files: mypool.py
messages: 206279
nosy: jneb
priority: normal
severity: normal
status: open
title: Pool.imap doesn't work as advertised
type: behavior
Added file: http://bugs.python.org/file33158/mypool.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19993
___



[issue11087] Speeding up the interpreter with a few lines of code

2011-01-31 Thread Jurjen N.E. Bos

New submission from Jurjen N.E. Bos j...@users.sourceforge.net:

I found a very simple way to improve the speed of CPython a few percent on the 
most common platforms (i.e. x86), at the cost of only a few lines of code in 
ceval.c
The only problem is that I don't have any experience in patch submission.

Here are the suggested new lines (also see submitted file):

#define NEXTARG() (next_instr += 2, *(unsigned short *)&next_instr[-2])
#define PEEKARG() (*(unsigned short *)&next_instr[1])

of course this code only works on little-endian processors that allow 
unaligned short reads; a change to configure might be needed (*shiver*).

Hope you like it.
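The arithmetic the macros replace can be sanity-checked from Python with struct; this sketch only illustrates the computation, and the 0x64 opcode byte is just a placeholder value:

```python
import struct

# A 2-byte little-endian oparg following an opcode byte, read in a
# single load (what the proposed macros do) versus the portable
# two-loads-and-a-shift the interpreter otherwise performs.
code = bytes([0x64, 0x05, 0x01])   # opcode, arg low byte, arg high byte
one_load = struct.unpack_from('<H', code, 1)[0]
two_loads = code[1] | (code[2] << 8)
print(one_load, two_loads)  # 261 261
```

Both reads agree only on little-endian machines, which is exactly why the patch would need a configure-time check.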

--
components: Interpreter Core
files: speedpatch.c
messages: 127686
nosy: jneb
priority: normal
severity: normal
status: open
title: Speeding up the interpreter with a few lines of code
type: performance
versions: Python 2.5, Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3
Added file: http://bugs.python.org/file20638/speedpatch.c

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11087
___