> On Nov 4, 2017, at 7:04 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> 
> When I asked Damien George about this for MicroPython, he indicated
> that they'd have to choose between guaranteed order and O(1) lookups
> given their current dict implementation. That surprised me a bit
> (since PyPy and CPython both *saved* memory by switching to their
> guaranteed order implementations, hence the name "compact dict
> representation"), but my (admittedly vague) understand is that the
> presence of a space/speed trade-off in their case has something to do
> with MicroPython deliberately running with a much higher chance of
> hash collisions in general (since the data sets it deals with are
> naturally smaller).
> 
> So if we make the change, MicroPython will likely go along with it,
> but it may mean that dict lookups there become O(N), and folks will be
> relying on "N" being consistently small due to memory constraints (but
> some typically O(N) algorithms will still become O(N^2) when run on
> MicroPython).
> 
> I don't think that situation should change the decision, but I do
> think it would be helpful if folks that understand CPython's dict
> implementation could take a look at MicroPython's dict implementation
> and see if it might be possible for them to avoid having to make that
> trade-off and instead be able to use a naturally insertion ordered
> hashmap implementation.

I've just looked at the MicroPython dictionary implementation and think they 
won't have a problem implementing O(1) compact dicts with ordering.

The likely reason for the confusion is that they are already have an option for 
an "ordered array" dict variant that does a brute-force linear search.  
However, their normal hashed lookup is very similar to ours and is easily 
amenable to being compact and ordered.

See:  
https://github.com/micropython/micropython/blob/77a48e8cd493c0b0e0ca2d2ad58a110a23c6a232/py/map.c#L139

Pretty much any implementation hashed lookup of keys and values is amenable to 
being compact and ordered.  Whatever existing logic that looks up an entry 
becomes a lookup into a table of indices which in turn references a sequential 
array of keys and values.  This logic is independent of hashing scheme or 
density, and it has no effect on the number of probes or collision rate.

The cost is an extra level of indirection and an extra array of indices 
(typically very small). The benefit is faster iteration over the smaller dense 
key/value array, overall memory savings resulting in improved cache 
utilization, and the side-effect of remembering insertion order.

Summary:  I think MicroPython will be just fine and if needed I will help 
create the patch that implements compact-and-ordered behavior.


Raymond




_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to