[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-04-01 Thread INADA Naoki

INADA Naoki added the comment:


New changeset efde51ad54c58353f25ff80c8d30dbee82ef33a3 by INADA Naoki in branch 
'3.6':
bpo-29949: Fix set memory usage regression (GH-945)
https://github.com/python/cpython/commit/efde51ad54c58353f25ff80c8d30dbee82ef33a3


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-04-01 Thread INADA Naoki

Changes by INADA Naoki :


--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-04-01 Thread INADA Naoki

Changes by INADA Naoki :


--
pull_requests: +1127

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-04-01 Thread INADA Naoki

INADA Naoki added the comment:


New changeset e82cf8675bacd7a03de508ed11865fc2701dcef5 by INADA Naoki in branch 
'master':
bpo-29949: Fix set memory usage regression (GH-943)
https://github.com/python/cpython/commit/e82cf8675bacd7a03de508ed11865fc2701dcef5


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-04-01 Thread INADA Naoki

Changes by INADA Naoki :


--
pull_requests: +1125

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread Cameron Simpson

Changes by Cameron Simpson :


--
nosy: +cameron

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Do you want to prepare a PR for me?  I not yet set-up with the ways of Github.  
Please limit the PR to just unwinding the refactoring in the simplest possible 
way..

If in the future you want to chat about speed/space trade-offs, we can do that 
offline.  I've spent years thinking about this, interacting with users, 
discussing with other devs, speaking on the topic, and working through use 
cases for sets.  The original reasons for the choices made largely are still 
true today.  I would be happy to walk you through the history (the tracker 
isn't a good place to do this, IRC would serve us better).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread INADA Naoki

INADA Naoki added the comment:

I agree.  Before thinking about rebalance between size and speed,
resolving regression is important.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I think the best thing to do is to undo the refactoring in 
https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6
 .   It is was intended be neutral but did affect set_update_internal() for 
small sets.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread INADA Naoki

INADA Naoki added the comment:

> frozenset(range(n)) is slightly larger in 3.7 than in 3.6. It is 4 times 
> larger for about 10% of sizes.

This is intensional:
https://github.com/python/cpython/commit/5cd87a8d61246b0a6233bfb8503d4718b693cef0

load factor is reduced from 66% to 60%. (-10%)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

$ ./python -c 'N = 6000; from sys import getsizeof; s = 
[getsizeof(frozenset(range(n))) for n in range(N)]; print( [(n, s[n]) for n in 
range(N) if not n or s[n] != s[n-1]] )'

3.5:  [(0, 112), (6, 368), (22, 1136), (86, 4208), (342, 16496), (1366, 65648), 
(5462, 262256)]
3.6:  [(0, 112), (5, 368), (21, 1136), (85, 4208), (341, 16496), (1365, 65648), 
(5461, 262256)]
3.7:  [(0, 112), (5, 368), (19, 1136), (77, 4208), (307, 16496), (1229, 65648), 
(4915, 262256)]

$ ./python -c 'N = 6000; from sys import getsizeof; s = 
[getsizeof(frozenset(set(range(n for n in range(N)]; print( [(n, s[n]) for 
n in range(N) if not n or s[n] != s[n-1]] )'

3.5:  [(0, 112), (6, 240), (8, 368), (16, 624), (32, 1136), (64, 2160), (128, 
4208), (256, 8304), (512, 16496), (1024, 32880), (2048, 65648), (4096, 131184)]
3.6:  [(0, 112), (5, 368), (8, 624), (16, 1136), (32, 2160), (64, 4208), (128, 
8304), (256, 16496), (512, 32880), (1024, 65648), (2048, 131184), (4096, 
262256)]
3.7:  [(0, 112), (5, 368), (8, 624), (16, 1136), (32, 2160), (64, 4208), (128, 
8304), (256, 16496), (512, 32880), (1024, 65648), (2048, 131184), (4096, 
262256)]

frozenset(range(n)) is slightly larger in 3.7 than in 3.6. It is 4 times larger 
for about 10% of sizes.

frozenset(set(range(n))) is 2 times larger in 3.6 than in 3.5 for all sizes >= 
16.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread INADA Naoki

INADA Naoki added the comment:

See set_update_internal().
https://github.com/python/cpython/blob/master/Objects/setobject.c#L969-L1016

This happens only when iterable is set or dict.

>>> import sys
>>> sys.getsizeof(set(range(10)))
736
>>> sys.getsizeof(set(set(range(10
1248
>>> sys.getsizeof(set(dict.fromkeys(range(10
1248

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Hmm, I wonder why I'm not seeing the same sizes you are seeing.

$ cat setsize.py
from sys import getsizeof
print( [getsizeof(frozenset(range(n))) for n in range(20)] )
$ python3.4 setsize.py
[224, 224, 224, 224, 224, 224, 736, 736, 736, 736, 736, 736, 736, 736, 736, 
736, 736, 736, 736, 736]
$ python3.5 setsize.py
[224, 224, 224, 224, 224, 224, 736, 736, 736, 736, 736, 736, 736, 736, 736, 
736, 736, 736, 736, 736]
$ python3.6 setsize.py
[224, 224, 224, 224, 224, 736, 736, 736, 736, 736, 736, 736, 736, 736, 736, 
736, 736, 736, 736, 736]
$ python3.4 --version
Python 3.4.4
$ python3.5 --version
Python 3.5.3
$ python3.6 --version
Python 3.6.1

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I'll look at this over the next week or two.  I don't really like the proposed 
patch at all but will carefully think through the speed/space trade-offs.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-31 Thread Raymond Hettinger

Changes by Raymond Hettinger :


--
assignee:  -> rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-30 Thread INADA Naoki

INADA Naoki added the comment:

https://gist.github.com/methane/8faf12621cdb2166019bbcee65987e99
This patch fixes the regression.

But I think it is still larger than idiomatic.  See this code:

code:
minused *= 2;
/* Find the smallest table size > minused. */
/* XXX speed-up with intrinsics */
for (newsize = PySet_MINSIZE;
 newsize <= minused && newsize > 0;
 newsize <<= 1)
;

When original minused is X, newsize will be about 2X ~ 4X.

For set.add(), preserving extra space for further add() make sense.
But for set_merge(), intention is avoiding repeated resize while merging.
There may be not "further add()", especially for `frozenset(s)`.
So 30% ~ 60% seems better than 25% ~ 50%.

How do you think, Raymond?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-30 Thread Jan Gosmann

Changes by Jan Gosmann :


--
nosy: +jgosmann

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-30 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29949] sizeof set after set_merge() is doubled from 3.5

2017-03-30 Thread INADA Naoki

New submission from INADA Naoki:

(original thread is 
https://mail.python.org/pipermail/python-list/2017-March/720391.html)

https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6

this commit doubles sizeof set object created by set_merge().
It is used by constructor of set and frozenset.

$ /usr/bin/python3
Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
[GCC 6.2.0 20160927] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> s = set(range(10))
>>> sys.getsizeof(frozenset(s))
736

$ python3
Python 3.6.0 (default, Dec 30 2016, 20:49:54)
[GCC 6.2.0 20161005] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import  sys
>>> s = set(range(10))
>>> sys.getsizeof(frozenset(s))
1248

--
components: Interpreter Core
keywords: 3.6regression
messages: 290868
nosy: inada.naoki, rhettinger
priority: normal
severity: normal
status: open
title: sizeof set after set_merge() is doubled from 3.5
versions: Python 3.6, Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com