[issue13903] New shared-keys dictionary implementation

2012-05-12 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 10e8b97d0fd7 by Antoine Pitrou in branch 'default':
Make the reference counting of dictkeys objects participate in refleak hunting
http://hg.python.org/cpython/rev/10e8b97d0fd7

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-05-02 Thread Benjamin Peterson

Changes by Benjamin Peterson benja...@python.org:


--
resolution:  - fixed
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-30 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Change insertdict to follow normal (non-stealing) ref-counting behaviour which 
fixes possible leakage.

Patch attached.

--
Added file: http://bugs.python.org/file25422/insertdict.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-30 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset c5fd332e5857 by Benjamin Peterson in branch 'default':
change insertdict to not steal references (#13903)
http://hg.python.org/cpython/rev/c5fd332e5857

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-27 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Decref cached-keys when type is deallocated.

Patch attached.

--
Added file: http://bugs.python.org/file25381/cached_keys.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-27 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset a3beae842f13 by Benjamin Peterson in branch 'default':
decref cached keys on type deallocation (#13903)
http://hg.python.org/cpython/rev/a3beae842f13

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-27 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 New changeset a3beae842f13 by Benjamin Peterson in branch 'default':
 decref cached keys on type deallocation (#13903)
 http://hg.python.org/cpython/rev/a3beae842f13

Is there any way to detect / avoid leaks on this separate refcounting
scheme?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-27 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

This patch integrates the dictkeys' refcounting into the refcount checking 
framework. Seems to work ok, but it would be better if someone more acquainted 
with the code could confirm it.

--
Added file: http://bugs.python.org/file25385/dkdebug.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-24 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Failing to maintain GC tracking in setdefault and copy (for split-tables)

Patch attached

--
Added file: http://bugs.python.org/file25339/gc_tracking.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-24 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 507a6703d6a3 by Benjamin Peterson in branch 'default':
fix dict gc tracking (#13903)
http://hg.python.org/cpython/rev/507a6703d6a3

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-24 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Failed to differentiate between failure and error in make_split_table().

Patch attached

--
Added file: http://bugs.python.org/file25340/make_split_table_error.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-24 Thread Giampaolo Rodola'

Changes by Giampaolo Rodola' g.rod...@gmail.com:


--
nosy:  -giampaolo.rodola

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-24 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset b044e0568be2 by Martin v. Loewis in branch 'default':
Account for shared keys in type's __sizeof__ (#13903).
http://hg.python.org/cpython/rev/b044e0568be2

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-24 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 5d5b72a71898 by Benjamin Peterson in branch 'default':
distiguish between refusing to creating shared keys and error (#13903)
http://hg.python.org/cpython/rev/5d5b72a71898

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-23 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file25313/73423916a242.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-23 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

I've updated the repository and uploaded a new patch in response to Benjamin's 
review. (And the contributor form is in the post).

One remaining issue is the return value of __sizeof__().
If it is an int, then it cannot accurately reflect the memory use,
but returning a float may seem rather surprising.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-23 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 6e5855854a2e by Benjamin Peterson in branch 'default':
Implement PEP 412: Key-sharing dictionaries (closes #13903)
http://hg.python.org/cpython/rev/6e5855854a2e

--
nosy: +python-dev
resolution:  - fixed
stage: patch review - committed/rejected
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-23 Thread Benjamin Peterson

Benjamin Peterson benja...@python.org added the comment:

Okay. I committed the latest patch. Subtleties like __sizeof__ can be worked 
out as people start using it.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com




[issue13903] New shared-keys dictionary implementation

2012-04-23 Thread Yury Selivanov

Yury Selivanov yseliva...@gmail.com added the comment:

Mark, did you add the test that your patch initially was failing with? 
http://mail.python.org/pipermail/python-dev/2012-February/116605.html

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-23 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

I fixed it back then, but didn't add the test.
It subsequently regressed.
Should know better.

Patch (with test this time) attached.

--
resolution: fixed - 
status: closed - open
Added file: http://bugs.python.org/file25318/str_subclass.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-23 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 34b6998efd2c by Benjamin Peterson in branch 'default':
fix instance dicts with str subclasses (#13903)
http://hg.python.org/cpython/rev/34b6998efd2c

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-19 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

Mark, can you please submit a contributor form?

--
nosy: +loewis

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-12 Thread Yury Selivanov

Changes by Yury Selivanov yseliva...@gmail.com:


--
nosy: +Yury.Selivanov

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-12 Thread Raymond Hettinger

Raymond Hettinger raymond.hettin...@gmail.com added the comment:

Moving the assigned to to nobody.  I won't have a chance for a thorough 
review for another ten days.  Hopefully, someone else will have a chance to 
review it before then.

--
assignee: rhettinger - 

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-11 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

I don't really understand your objection to changing the method-cache size. As 
I said, I can remove the change, but that will cause the performance regression 
that Antoine noticed.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-06 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

How about changing the method-cache size in a separate patch?

On my machine, reducing the method-cache size from 2**10 to 2**9 results
in a very small improvement in performance (with the original dict). 

That way we don't get a performance regression with the new dict  
and the patch is better focussed.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-04 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 I'm not bothered by the regression in silent_logging,
 as it is a micro benchmark with a very short running time.

I'm not concerned about the micro-benchmark itself but the fact that it
might hint at a wider problem.
Also, I don't get your remark about it running in a short time. Your
patch AFAICT doesn't need any warm up period to exhibit any
improvements.

 Reducing the method-cache size from 2**10 to 2**9 allows the working
 set to fit better in the cache.
 This fixes the regression in mako, but makes OO programs that use
 few objects (such as richards) a bit slower.

I don't think we should reduce the size of the method cache. 1024 is not
a very large number, and might even be too small for complex OO
programs.

I also think that, apart from the dict storage changes, your patch
should strive not to change any other tunables. Otherwise we're really
comparing apples to oranges.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-04 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Antoine Pitrou wrote:
 Antoine Pitrou pit...@free.fr added the comment:
 
 I'm not bothered by the regression in silent_logging,
 as it is a micro benchmark with a very short running time.
 
 I'm not concerned about the micro-benchmark itself but the fact that it
 might hint at a wider problem.

Or it might not.
Micro-benchmarks produce micro-optimisations.
That's why I dislike them.

 Also, I don't get your remark about it running in a short time. Your
 patch AFAICT doesn't need any warm up period to exhibit any
 improvements.

What I mean is that the runtime is so short, no one would notice any
change, so who cares?

 
 Reducing the method-cache size from 2**10 to 2**9 allows the working
 set to fit better in the cache.
 This fixes the regression in mako, but makes OO programs that use
 few objects (such as richards) a bit slower.
 
 I don't think we should reduce the size of the method cache. 1024 is not
 a very large number, and might even be too small for complex OO
 programs.

not very large or too small, by what metric?

The optimum size (for speed) of the method cache depends on the size of 
hardware data cache, the complexity of the program, and many other factors.
Attempt to reason about it are pretty much futile.
Empiricism is the only way to go.

 
 I also think that, apart from the dict storage changes, your patch
 should strive not to change any other tunables. Otherwise we're really
 comparing apples to oranges.

If the implementation changes, shouldn't the tunable parameters be retuned?

Cheers,
Mark.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-04 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

  Also, I don't get your remark about it running in a short time. Your
  patch AFAICT doesn't need any warm up period to exhibit any
  improvements.
 
 What I mean is that the runtime is so short, no one would notice any
 change, so who cares?

None of the benchmarks used here are real-world workloads, so you might
as well claim that they are all irrelevant. But then we'll have a hard
time assessing the consequences of your patch.

  I don't think we should reduce the size of the method cache. 1024 is not
  a very large number, and might even be too small for complex OO
  programs.
 
 not very large or too small, by what metric?

By the metric of the number of classes and methods in a complex OO
application (for example something based on Twisted or SQLAlchemy).

  I also think that, apart from the dict storage changes, your patch
  should strive not to change any other tunables. Otherwise we're really
  comparing apples to oranges.
 
 If the implementation changes, shouldn't the tunable parameters be retuned?

Only if there's a reasoning behind it. Perhaps the retuning would have
given the same results without the rest of your patch.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-04 Thread Raymond Hettinger

Raymond Hettinger raymond.hettin...@gmail.com added the comment:

[Antoine]
I also think that, apart from the dict storage changes, 
 your patch should strive not to change any other tunables. 

I agree.  Please keep the patch focused on the single task, the shared keys.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-02 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file25096/372d0bca85ae.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-04-02 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

I'm not bothered by the regression in silent_logging,
as it is a micro benchmark with a very short running time.

The regression in mako is, I think, caused by competition for the
data cache between the new dict implementation and the method-cache
used by _PyType_Lookup. Although the new-dict uses less memory overall,
the size of a single split-table dict (keys + value) is larger than a combined 
table.

Reducing the method-cache size from 2**10 to 2**9 allows the working set to fit 
better in the cache.
This fixes the regression in mako, but makes OO programs that use few objects 
(such as richards) a bit slower.
Compared with tip, the new-dict implementation
is 4% slower, using 7% less memory for mako. 6% slower, using 5% less memory 
for richards.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-03-16 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

The latest patch has a significant (negative) effect on some benchmarks:

### silent_logging ###
Min: 0.057927 - 0.068228: 1.18x slower
Avg: 0.058218 - 0.068660: 1.18x slower
Significant (t=-36.06)

### mako ###
Min: 0.118240 - 0.140451: 1.19x slower
Avg: 0.120145 - 0.142422: 1.19x slower
Significant (t=-61.66)

These regressions should be fixed before going any further, IMO.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-03-09 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24766/257e16e71654.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-03-09 Thread Jim Jewett

Jim Jewett jimjjew...@gmail.com added the comment:

 As of Feb 28, 2012, the PEP mentions an additional
 optimization of storing the values in an array indexed
 by (effectively) key insertion order, rather than key
 position. (Alternative Implementation)

 It states that this would reduce memory usage for the
 values array by 1/3.  1/3 is a worst-case measurement;
 average is 1/2.  (At savings of less than 1/3, the keys
 would resize, to initial savings of 2/3.  And yes, that
 means in practice, the average savings would be greater
 than half, because the frequency of dicts of size N
 decreases with N.)

 I presume you mean to allocate a values array of
 size == actual keys rather than size == usable keys.

Actually, I meant size==maximum(keys in use for this dict), 
so that for objects with a potentially complex lifecycle, 
those instances that had not yet needed the new attributes
would not need to allocate space for them.  

But I see now that just allocating space for each of the 
potential keys is indeed simpler.  And I suppose that a
class which won't eventually need all the attributes for 
every instance is an unusual case. 

So to get beneath 2/3 without lots of reallocation 
probably requires knowing when the key set is likely
to be complete, and that is indeed more complex than
the current changes.  (That said, you have left code
in for immutable keys, so there may be cases where 
this isn't so hard after all.)

 Making the  values array smaller than the the number 
 of usable keys causes a number of issues:

 1. The values_size will need to be kept in the dict,
 not in the keys, 

That was indeed true for my proposal.  If you just allocate
2/3, then you don't need to store the value, unless you 
want to be lazy about reallocating when the keys grow. 
Even then, there are few enough potential sizes to fit
the information in a byte, or we could get the info for
free *if* the dict were already timestamped or versioned.

 It states that the keys table would need an additional
 values_size field, but in the absence of dummies, this
 is just ma_used.

I was mixing two structures in my mind.  The current key 
count (which is sufficient for a new values array) is 
actually USABLE_FRACTION(dk_size) - dk_free.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-03-09 Thread Jim Jewett

Jim Jewett jimjjew...@gmail.com added the comment:

On Fri, Mar 9, 2012 at 12:13 PM, Jim Jewett

 So to get beneath 2/3 without lots of reallocation
 probably requires knowing when the key set is likely
 to be complete, and that is indeed more complex than
 the current changes.  (That said, you have left code
 in for immutable keys, so there may be cases where
 this isn't so hard after all.)

On second thought, avoiding reallocation doesn't have
to be perfect -- just good enough to work on average.

For a *normal* class, the keyset won't change after
the first instance has completed its __init__.

Which of course again leads to autoslots and a
normally NULL extra_dict.  And having done that,
it makes sense to combine the (normal) instance
dict with the type dict to simplify the lookup chain,
but ... that is probably too aggressive for the 3.3
schedule.  One silver lining to your patch plus hash
randomization is that that 3.4 should have a
pretty free hand with regards to internal details.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-29 Thread Jim Jewett

Jim Jewett jimjjew...@gmail.com added the comment:

As of Feb 28, 2012, the PEP mentions an additional optimization of storing the 
values in an array indexed by (effectively) key insertion order, rather than 
key position. (Alternative Implementation)

It states that this would reduce memory usage for the values array by 1/3.  1/3 
is a worst-case measurement; average is 1/2.  (At savings of less than 1/3, the 
keys would resize, to initial savings of 2/3.  And yes, that means in practice, 
the average savings would be greater than half, because the frequency of dicts 
of size N decreases with N.)

It states that the keys table would need an additional values_size field, but 
in the absence of dummies, this is just ma_used.

Note that this would also simplify resizing, as the values arrays would not 
have to be re-ordered, and would not have to be modified at all unless/until 
that particular instance received a value for a position beyond its current 
size.

--
nosy: +Jim.Jewett

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-29 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Jim Jewett wrote:
 Jim Jewett jimjjew...@gmail.com added the comment:
 
 As of Feb 28, 2012, the PEP mentions an additional optimization of storing 
 the values in an array indexed by (effectively) key insertion order, rather 
 than key position. (Alternative Implementation)
 
 It states that this would reduce memory usage for the values array by 1/3.  
 1/3 is a worst-case measurement; average is 1/2.  (At savings of less than 
 1/3, the keys would resize, to initial savings of 2/3.  And yes, that means 
 in practice, the average savings would be greater than half, because the 
 frequency of dicts of size N decreases with N.)
 

I presume you mean to allocate a values array of size == actual keys
rather than size == usable keys.

Making the  values array smaller than the the number of usable keys
causes a number of issues:

1. The values_size will need to be kept in the dict, not in the keys, 
which would make the dict larger for size 3 or 5, the same size for
size 2 or 4, but it would save memory for sizes 6-8 (see below).
So it might not save memory at all, it depends on the distribution of 
the sizes of shared-key dicts.
2. dk_usable would need to be reverted to dk_fill (c.f ma_fill) in order
to quickly allocate values arrays. This would slow down the resizing 
check, which is now done before insertion, so might be slower.
(not really a problem, but it might slow down inserts)
3. An extra check needs to be performed for each read to ensure the 
index is in-bounds
(again not really a problem, but it might slow down reads)

Comparative sizes of values array (including size field):

Items ProposedAlternativeOther AlternativeDelta
(size == usuable) (size == keys (+1))
   1  4   3 2  -1   
   2  4   3 3   0
   3  4   3 4  +1
   4  8   5 5   0
   5  8   5 6  +1
   6 16  10 7  -3
   7 16  10 8  -2
   8 16  10 9  -1
   9 16  1010   0
  10 16  1011  +1
  12 32  2113  -8
  14 32  2115  -6

The memory savings of the two alternative implementations are broadly
similar.

Clearly, either of the alternatives are attractive in terms of memory 
use. I think it is definitely worth investigating further, but I would 
like to get the proposed implementation into CPython first.

 It states that the keys table would need an additional values_size field, 
 but in the absence of dummies, this is just ma_used.

values_size != ma_used
values_size is the size of the values array, not the number of values.

Don't forget deletions or the case where items are inserted in a 
different order from that of another dict sharing the same keys.
Although there are no dummies, (key != NULL, value == NULL) is a legal
pair representing a missing or deleted value.

Cheers,
Mark.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-28 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24670/49b7e7e4a27c.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-13 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24510/691ce331f955.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-13 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Removed file: http://bugs.python.org/file24465/e50db1b7ad7b.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-09 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24465/e50db1b7ad7b.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-09 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Removed file: http://bugs.python.org/file24456/20702d1acf17.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


--
hgrepos: +112

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24453/a9138aba7896.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Removed file: http://bugs.python.org/file24360/6a21f3b35e20.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Removed file: http://bugs.python.org/file24453/a9138aba7896.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24454/1f703b2607af.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Looking at your latest patch, I worry about any deletion
+(including pop  popitem) causes a split table to become a combined table. 
Why wouldn't you use a dummy pointer (such as ((PyObject *) 1)) to signal 
deleted slots?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Antoine Pitrou wrote:
 Antoine Pitrou pit...@free.fr added the comment:
 
 Looking at your latest patch, I worry about any deletion
 +(including pop  popitem) causes a split table to become a combined table. 
 Why wouldn't you use a dummy pointer (such as ((PyObject *) 1)) to signal 
 deleted slots?

In fact here is no need for a dummy pointer.
When deleting from a split-table, the value can just be set to NULL,
the resulting key-NULL pair is legal in a split-table.
Your suggestion doesn't make the code any more complex, so I've included it.

In practice, it will very rare that a deletion occurs in a split table
(since they are only used for attribute dictionaries).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Removed file: http://bugs.python.org/file24454/1f703b2607af.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24455/bc286099ce9a.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Removed file: http://bugs.python.org/file24455/bc286099ce9a.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-08 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24456/20702d1acf17.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-02-06 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


--
hgrepos:  -109

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-31 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Raymond Hettinger wrote:
 Raymond Hettinger raymond.hettin...@gmail.com added the comment:
 
 Changing dictionaries is a big deal.  You're changing many pieces at once 
 (not a good idea) including changing tunable parameters that are well-studied 
 (I spent a month testing whether 5/8 was better idea that 2/3 for resizing or 
 when the ideal small dict size was 4, 8, or 16).  You're changing the meaning 
 of the fields in dictobject.h which will likely break any code that relied on 
 those.
 
 The ideas may be good ones but they warrant a good deal of thought.  Dicts 
 weren't just slapped together -- the current code is the product to two 
 decades of tweaking by engineers who devoted significant time to the task.  
 It would be easy to unknowingly undo some of their work.
 

OK.
I'll write a PEP.

By the way, I'm trying not changing the tunable parameters for the 
unshared-keys case, just the shared-keys case. Of course, they do 
interact with each other.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-31 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

As Victor I think the tunables could be changed separately, unless they truely 
get in the way of the shared-keys optimization.

By the way, there will need a protection for the case where instances are 
used as bags of key/value pairs (dict-alikes with attribute access). Perhaps 
bound the size of the keys array to 100 entries and then switch into unshared 
mode.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-31 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-31 Thread Gregory P. Smith

Gregory P. Smith g...@krypto.org added the comment:

FYI - I strongly support this type of work to reduce memory use of the Python 
interpreter!  :)

Also, yes, constant changing should be separate from this change but are worth 
occasionally re-measuring and justifying as common computer architectures have 
changed since the original decisions were made.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-30 Thread Raymond Hettinger

Raymond Hettinger raymond.hettin...@gmail.com added the comment:

This would likely require a PEP before having a chance of being considered for 
inclusion.

--
nosy: +rhettinger

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-30 Thread Mark Shannon

Mark Shannon m...@hotpy.org added the comment:

Does this really need a PEP?
There is no new language feature and no change to any API.
It is just saving some memory.

The only possible issue is changing dict repr() ordering, 
but issue 13703 will do that anyway.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-30 Thread Raymond Hettinger

Changes by Raymond Hettinger raymond.hettin...@gmail.com:


--
assignee:  - rhettinger

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-30 Thread Raymond Hettinger

Raymond Hettinger raymond.hettin...@gmail.com added the comment:

Changing dictionaries is a big deal.  You're changing many pieces at once (not 
a good idea) including changing tunable parameters that are well-studied (I 
spent a month testing whether 5/8 was better idea that 2/3 for resizing or when 
the ideal small dict size was 4, 8, or 16).  You're changing the meaning of the 
fields in dictobject.h which will likely break any code that relied on those.

The ideas may be good ones but they warrant a good deal of thought.  Dicts 
weren't just slapped together -- the current code is the product to two decades 
of tweaking by engineers who devoted significant time to the task.  It would be 
easy to unknowingly undo some of their work.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Mark Shannon

New submission from Mark Shannon m...@hotpy.org:

The proposed dictionary implementation allows sharing of keys  hashes between 
dictionaries. This leads to substantial memory savings for object-oriented 
programs. For non-OO programs the impact is negligible.

--
components: Interpreter Core
hgrepos: 109
messages: 152229
nosy: Mark.Shannon
priority: normal
severity: normal
status: open
title: New shared-keys dictionary implementation
type: performance
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Georg Brandl

Changes by Georg Brandl ge...@python.org:


--
keywords: +patch
Added file: http://bugs.python.org/file24357/061f8573af54.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Mark Shannon

Changes by Mark Shannon m...@hotpy.org:


Added file: http://bugs.python.org/file24360/6a21f3b35e20.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Georg Brandl

Changes by Georg Brandl ge...@python.org:


Removed file: http://bugs.python.org/file24357/061f8573af54.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
nosy: +benjamin.peterson, pitrou
stage:  - patch review
versions: +Python 3.3 -Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

In the initial comment, 'Dummy' to 'Deleted' here but only here:
-   Holds an active (key, value) pair.  Active can transition to Dummy 
+   Holds an active (key, value) pair.  Active can transition to Deleted 

Im Lib/test/test_pprint.py
 def test_set_reprs(self): ...
 # Consequently, this test is fragile and ...
+# XXX So why include this test in the first place?
Raymond, I believe you added this 44927 and revised for 3.x in 45067.
I imagine it will also be a problem with randomized hashes. Should it be 
removed or somehow revised?

--
nosy: +terry.reedy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Giampaolo Rodola'

Changes by Giampaolo Rodola' g.rod...@gmail.com:


--
nosy: +giampaolo.rodola

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Philip Jenvey

Changes by Philip Jenvey pjen...@underboss.org:


--
nosy: +pjenvey

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread Gregory P. Smith

Changes by Gregory P. Smith g...@krypto.org:


--
nosy: +gregory.p.smith

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

I see two unrelated parts in your patch:
 - change dictionary structure in memory
 - change many constants linked to optimization: PyDICT_MAXFREELIST: 80-40, 
2/3-5/8, etc.

You may open a new issue for the second part, except if I am wrong and you need 
to change constants for the first part?

(I don't understand why you changed constants and how you chose new values.)

--
nosy: +haypo

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13903] New shared-keys dictionary implementation

2012-01-29 Thread John O'Connor

Changes by John O'Connor tehj...@gmail.com:


--
nosy: +jcon

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13903
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com