Re: Proposed implementation for an Ordered Dictionary
Michele Simionato wrote: On Mar 1, 1:43 am, Paul Rubin http://phr...@nospam.invalid wrote: Colin J. Williams c...@ncf.ca writes: # print [mydict[x] for x in sorted(mydict.keys)] Instance object is not iterable It was a typo. Use: print [mydict[x] for x in sorted(mydict.keys())] Even better print [mydict[x] for x in sorted(mydict)] Both Paul Rubin and Michele Simionato produce the same result but neither produces what was originally suggested: def seqValues(self): ''' To return the values, with their keys, sorted by value. ''' v= [(it[1], it[0]) for it in self.items()] v.sort() return v Colin W. -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
A sort of premature pessimization, then. QOTW! +1 Malcolm -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Raymond Hettinger wrote: [Paul Rubin] Ehh, I guess I'm not surprised at the slowdown and extra complexity from the second dict. Oh well. If the module really turns out to be really used a lot, another (messy) approach would be to write a C extension that uses a doubly linked list some day. That seems like an ideal implementation to me. O(1): appending, popping, searching, and deletion O(n): traversal Raymond Sometimes, it's useful to be able to obtain the data in the sorted sequence. You might consider adding functionality like: def seqItems(self): '''To return the items, sorted by key. ''' return [self[k] for k in self.seqKeys()] def seqKeys(self): ''' To return the keys sorted. ''' k= self.keys() k.sort() return k def seqValues(self): ''' To return the values, with their keys, sorted by value. ''' v= [(it[1], it[0]) for it in self.items()] v.sort() return v Colin W. -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Colin J. Williams wrote: Sometimes, it's useful to be able to obtain the data in the sorted sequence. You might consider adding functionality like: def seqItems(self): '''To return the items, sorted by key. ''' return [self[k] for k in self.seqKeys()] Amazingly, the Python time-machine provides such functionality! Using just a handful of pre-existing pieces, you can do this: sorted(mydict.items()) sorted(mydict.keys()) [mydict[x] for x in sorted(mydict.keys)] and any other sequence you might need. That's the beauty of a general purpose programming language. You don't need everything to be a built-in *wink* -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Steven D'Aprano wrote: Colin J. Williams wrote: Sometimes, it's useful to be able to obtain the data in the sorted sequence. You might consider adding functionality like: def seqItems(self): '''To return the items, sorted by key. ''' return [self[k] for k in self.seqKeys()] Amazingly, the Python time-machine provides such functionality! Using just a handful of pre-existing pieces, you can do this: sorted(mydict.items()) sorted(mydict.keys()) [mydict[x] for x in sorted(mydict.keys)] and any other sequence you might need. That's the beauty of a general purpose programming language. You don't need everything to be a built-in *wink* Thanks, you are right, you have a neat way of handling the first two, they work with an instance of Raymond Hettinger's. OrderedDict. The third suggestion has a couple of problems, which are fixed below. if __name__ == '__main__': d= OrderedDict.fromkeys('abracadabra', value= 'zzz') print(d) print(d.seqKeys()) print(d.seqItems()) print(d.seqValues()) # Steven D'Aprano's alternative mydict= d.copy() print sorted(mydict.items()) print sorted(mydict.keys()) # print [mydict[x] for x in sorted(mydict.keys)] Instance object is not iterable print(sorted(iter([(x[1], x[0]) for x in mydict.iteritems()]))) # This works Colin W. -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Colin J. Williams c...@ncf.ca writes: # print [mydict[x] for x in sorted(mydict.keys)] Instance object is not iterable It was a typo. Use: print [mydict[x] for x in sorted(mydict.keys())] -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
On Mar 1, 1:43 am, Paul Rubin http://phr...@nospam.invalid wrote: Colin J. Williams c...@ncf.ca writes: # print [mydict[x] for x in sorted(mydict.keys)] Instance object is not iterable It was a typo. Use: print [mydict[x] for x in sorted(mydict.keys())] Even better print [mydict[x] for x in sorted(mydict)] -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Raymond Hettinger: Paul Rubin: another (messy) approach would be to write a C extension that uses a doubly linked list some day. That seems like an ideal implementation to me. This was my Python implementation, where the delete too is O(1), but it's slow: http://code.activestate.com/recipes/498195/ I think the C extension with the doubly linked list is the best approach. Note that in modern CPUs caches are able to change the performance a lot. So reducing the memory used is very important. So using the XOR (or subtraction) trick to use only 1 pointer for each key-value may be useful to keep performance not too much far from the normal python dicts: http://en.wikipedia.org/wiki/XOR_linked_list Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
bearophileh...@lycos.com writes: So using the XOR (or subtraction) trick to use only 1 pointer for each key-value may be useful to keep performance not too much far from the normal python dicts: http://en.wikipedia.org/wiki/XOR_linked_list I don't see how to delete a randomly chosen node if you use that trick, since the hash lookup doesn't give you two consecutive nodes in the linked list to xor together. Maybe you could do something intricate with skip lists and end up with O(log n) deletion. -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Paul Rubin: I don't see how to delete a randomly chosen node if you use that trick, since the hash lookup doesn't give you two consecutive nodes in the linked list to xor together. Thank you, I think you are right, I am sorry. So on 32-bit CPUs you need to add 8 bytes to each value. On 64-bit CPUs you may use a double indexed list, instead of a double linked list. And each index can be stored in 6 bytes instead of 8 (281_474_976_710_656 buckets may be enough for most people), so you need only 12 bytes for two indexes instead of 16 bytes, this helps the L1 cache (and the small L2 cache too on Core i7) a bit. But this may slow down too much the ordered iteration. Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
bearophileh...@lycos.com wrote: Paul Rubin: I don't see how to delete a randomly chosen node if you use that trick, since the hash lookup doesn't give you two consecutive nodes in the linked list to xor together. Thank you, I think you are right, I am sorry. So on 32-bit CPUs you need to add 8 bytes to each value. On 64-bit CPUs you may use a double indexed list, instead of a double linked list. And each index can be stored in 6 bytes instead of 8 (281_474_976_710_656 buckets may be enough for most people), so you need only 12 bytes for two indexes instead of 16 bytes, this helps the L1 cache (and the small L2 cache too on Core i7) a bit. But this may slow down too much the ordered iteration. A sort of premature pessimization, then. regards Steve -- Steve Holden+1 571 484 6266 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Steve Holden: A sort of premature pessimization, then. Maybe not, the save in memory may lead to higher speed anyway. So you need to test it to know the overall balance. And in data structures with general purpose you want all the speed you can get. Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
[Steve Holden] A sort of premature pessimization, then. QOTW! _ ~ @ @ \_/ Raymond -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Raymond Hettinger python at rcn.com writes: Here's a proposed implementation for Py2.7 and Py3.1: http://code.activestate.com/recipes/576669/ Would like you guys to kick the tires, exercise it a bit, and let me know what you think. The recipe runs under 2.6 and 3.0 without modification so it should be easy to play with. Why not just inherit from collections.MutableMapping? -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Raymond Hettinger pyt...@rcn.com writes: Here's a proposed implementation for Py2.7 and Py3.1: http://code.activestate.com/recipes/576669/ Several methods like __contains__() and __getitem__() are not overridden, so their performance is just as fast as a regular dictionary. Methods like __setitem__ and __delitem__ are overridden but have a fast path depending on whether or not the key already exists. It seems that __delitem__ of an existing key is O(n), whereas it's amortized constant time for dicts. (__setitem__ is constant time for both.) Is there a way to avoid this? If not, it should probably be documented, since it differs from dict. -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
[Benjamin Peterson] Why not just inherit from collections.MutableMapping? It makes the recipe shorter to inherit some methods from dict. Also, subclassing from dict gives a speedup for __getitem__(), __len__(), and get(). -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
[Hrvoje Niksic] It seems that __delitem__ of an existing key is O(n), whereas it's amortized constant time for dicts. (__setitem__ is constant time for both.) Is there a way to avoid this? I don't see any fast/clean way. It's possible to tracking pending deletions and do them all at once but that's a bit messy and slow. If not, it should probably be documented, since it differs from dict. That makes sense. Raymond -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Raymond Hettinger pyt...@rcn.com writes: I don't see any fast/clean way. It's possible to tracking pending deletions and do them all at once but that's a bit messy and slow. What about using a second dictionary (indexed by the incrementing counter) instead of a list to record the insertion order? Then you have O(1) deletion, and traversal takes an O(n log n) sorting operation. -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
[Paul Rubin] What about using a second dictionary (indexed by the incrementing counter) instead of a list to record the insertion order? Then you have O(1) deletion, and traversal takes an O(n log n) sorting operation. My first attempt used exactly that approach and it works fine though it does complicate the code quite a bit and slows down all of the common operations by a constant factor. Raymond -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
What about using a second dictionary (indexed by the incrementing counter) instead of a list to record the insertion order? Then you have O(1) deletion, and traversal takes an O(n log n) sorting operation. My first attempt used exactly that approach and it works fine though it does complicate the code quite a bit and slows down all of the common operations by a constant factor. FWIW, here is the code for that approach. -- from collections import MutableMapping class OrderedDict(dict): def __init__(self, *args, **kwds): if len(args) 1: raise TypeError('expected at 1 argument, got %d', len (args)) self.clear() self.update(*args, **kwds) def clear(self): self.__cached_sort = None self.__cnt = 0 self.__order = {} dict.clear(self) def __setitem__(self, key, value): if key not in self: self.__cached_sort = None self.__cnt += 1 self.__order[key] = self.__cnt dict.__setitem__(self, key, value) def __delitem__(self, key): if key in self: self.__cached_sort = None del self.__order[key] dict.__delitem__(self, key) def __sorted(self): # return keys in insertion order and cache the result if self.__cached_sort is None: self.__cached_sort = sorted(dict.keys(self), key=self.__order.__getitem__) return self.__cached_sort def __iter__(self): return iter(self.__sorted()) def __reversed__(self): return iter(reversed(self.__sorted())) def popitem(self): # O(n) cost on the first pop and O(1) on successive pops if not self: raise KeyError key = self.__sorted().pop() del self.__order[key] value = dict.pop(self, key) return key, value # Methods with indirect access via the above methods setdefault = MutableMapping.setdefault update = MutableMapping.update pop = MutableMapping.pop keys = MutableMapping.keys values = MutableMapping.values items = MutableMapping.items def __repr__(self): return '{' + ', '.join(map('%r: %r'.__mod__, self.items())) + '}' def copy(self): return self.__class__(self) @classmethod def fromkeys(cls, iterable, value=None): d = cls() for key in iterable: d[key] = value return d def __reduce__(self): # pickle an items list and any extra info in the instance dict items = list(self.items()) inst_dict = vars(self).copy() for attr in vars(self.__class__): inst_dict.pop(attr, None) return (self.__class__, (items,), inst_dict) -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
Raymond Hettinger pyt...@rcn.com writes: What about using a second dictionary My first attempt used exactly that approach and it works fine though it does complicate the code quite a bit and slows down all of the common operations by a constant factor. Ehh, I guess I'm not surprised at the slowdown and extra complexity from the second dict. Oh well. If the module really turns out to be really used a lot, another (messy) approach would be to write a C extension that uses a doubly linked list some day. -- http://mail.python.org/mailman/listinfo/python-list
Re: Proposed implementation for an Ordered Dictionary
[Paul Rubin] Ehh, I guess I'm not surprised at the slowdown and extra complexity from the second dict. Oh well. If the module really turns out to be really used a lot, another (messy) approach would be to write a C extension that uses a doubly linked list some day. That seems like an ideal implementation to me. O(1): appending, popping, searching, and deletion O(n): traversal Raymond -- http://mail.python.org/mailman/listinfo/python-list