Re: [Python-Dev] py2.7: dictobject not properly resizing

2013-03-31 Thread Micha Gorelick
So I did a bit of benchmarking and attached is the code I used.  With
downsizing happening when ma_used * 2 = ma_filled, or the following
for the condition in question:

if (mp-ma_used = n_used || (mp-ma_fill*3  (mp-ma_mask+1)*2 
mp-ma_used*2  mp-ma_fill))

I see marginally faster performance with downsizing.  I chose a factor
of 2x because it will ensure downsizings before the ma_fill load
factor comes into play and will also not cause a resize on the next
insert.  Using the vanilla python2.7.3 code, there are never any
resize events where the new size is smaller than the old size.

Cheers,
Micha
-
http://micha.gd/
http://github.com/mynameisfiber/


test.py
Description: Binary data
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] py2.7: dictobject not properly resizing

2013-03-30 Thread Antoine Pitrou
On Sat, 30 Mar 2013 17:31:26 -0400
Micha Gorelick mynameisfi...@gmail.com wrote:
 I was taking a look at dictobject.c and realized that the logic
 controlling whether a resizedict will occur in
 dict_set_item_by_hash_or_entry disallows for the shrinking of a
 dictionary.  This is contrary to what the comments directly above say:

Also in 3.4:

 d = {i: i for i in range(1000)}
 sys.getsizeof(d)
49264
 for i in range(900): del d[i]
... 
 sys.getsizeof(d)
49264
 for i in range(900, 1000): del d[i]
... 
 sys.getsizeof(d)
49264
 len(d)
0
 d.clear()
 sys.getsizeof(d)
88


Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] py2.7: dictobject not properly resizing

2013-03-30 Thread Armin Rigo
Hi Antoine,

On Sat, Mar 30, 2013 at 10:37 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Sat, 30 Mar 2013 17:31:26 -0400
 Micha Gorelick mynameisfi...@gmail.com wrote:
 I was taking a look at dictobject.c and realized that the logic
 controlling whether a resizedict will occur in
 dict_set_item_by_hash_or_entry disallows for the shrinking of a
 dictionary.

It doesn't disallow shrinking.  If you take a dictionary of size 1000,
remove of its elements, and continue to use it (write and delete more
items) for long enough, then eventually it is shrinked.  It just takes
a while because it needs to fill 2/3 of the slots of the big table
with deleted markers before it happens.

Python 3.3.0b1 (default:07ddf5ecaafa, Aug 12 2012, 17:47:28)
[GCC 3.4.6 (Gentoo 3.4.6-r2, ssp-3.4.6-1.0, pie-8.7.10)] on linux
Type help, copyright, credits or license for more information.
 d = {i:i for i in range(1000)}
 sys.getsizeof(d)
24624
 for i in range(1000): del d[i]
...
 sys.getsizeof(d)
24624
 for j in range(1001, 2000):
...d[j] = 0; del d[j]
...
 sys.getsizeof(d)
144



A bientôt,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] py2.7: dictobject not properly resizing

2013-03-30 Thread Micha Gorelick
True, but your example mechanism of getting a shrink event is purely
based on ma_fill.  This is happening because your last loop is
increasing ma_fill to the point where it thinks it needs to resize
because of the load factor and then it calculates the new size based
on ma_used.  The comment that I pointed out from the code seems to
imply that simply having ma_fill  ma_used will trigger a resize.
The two conditions for a resize are definitely not equivalent!

Micha
-
http://micha.gd/
http://github.com/mynameisfiber/
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com