Re: [Sugar-devel] I'm looking for a tree...
Not sure if you have found an answer to this yet, but with more information about what you're trying to do could help with communicating the right solution. Again I think a hash table can answer your questions about fast-lookup, but it just depends your chosen key, and other requirements you might have about sorting. If its not too sugar specific, feel free to contact me directly. Regards, James. 2009/6/7 Benjamin M. Schwartz bmsch...@fas.harvard.edu Lucian Branescu wrote: Would an ordered dictionary otherwise be all right, or am I misunderstanding your requirements? There are other implementations, like http://www.xs4all.nl/~anthon/Python/ordereddict/http://www.xs4all.nl/%7Eanthon/Python/ordereddict/ No, an ordered dictionary is not enough. All these ordered dictionaries are ordered either by time or by sorting the keys. Neither will work for me. The problem is simple: if I insert something at position 5, I need the object currently at position 5 to move to position 6, and the object currently at position 6 to move to position 7, etc. To accomplish this in an time-ordered odict, I would have to remove all keys subsequent to the insertion point, make the insertion, and then re-insert those keys. That's O(n). To accomplish this in a sorted-order odict, I would have to generate a new key that causes my insertion to occur at the right location. If there are repeated insertions at the same location, generating such keys becomes an O(n) operation. To see this, suppose I am repeatedly inserting at the second position. Initially, the keys are (0,1). The first insertion has to pick a key between 0 and 1, e.g. 0.5. The second insertion has to pick a key between 0 and 0.5: e.g. 0.25. The number of bits required to store these keys to sufficient precision increases by one bit on each insertion. This means that the length of the keys is O(n), so every comparison is O(n). --Ben ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
I've had amazing difficulty communicating what I'm looking for here. Those closest thing is http://en.wikipedia.org/wiki/Rope_(computer_science) A rope is a binary tree that _imposes_ an ordering on its leaves that has nothing whatsoever to do with their values (the values are essentially opaque). This is very unusual; almost all standard tree structures _derive_ an ordering _from_ the values. Unfortunately, all implementations of Ropes that I have seen so far are designed only to store Chars, whereas I need to store arbitrary python objects. I'd like to avoid coding such a beast myself, but I may have no choice. It would help if someone could identify Ropes as a specialization of a more general type of tree, but I have not seen any claims of this kind. The only property I'm looking for that Ropes don't intrinsically satisfy is Reverse lookup. That is to say, I would like to be able to hold a pointer to a particular object in the tree, and at some later time, walk back up the tree from that pointer to work out the object's current index. It does seem like that should be doable with a Rope, especially if we move to the special case in which the leaves are arrays of length 1. --Ben signature.asc Description: OpenPGP digital signature ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
Hey Benjamin, Taking a guess with the information you've given perhaps a hash tablehttp://en.wikipedia.org/wiki/Hash_tablecould help? Fast lookup/retrieval, but just have to consider what you would enumerate as the key, and loading. Let me know if you want some help applying this, memories of a uni assignment have just come flooding back. James. 2009/6/7 Benjamin M. Schwartz bmsch...@fas.harvard.edu I am looking for a fast data structure with the following properties: Maintains an indexed list of arbitrary, non-ordered objects (like a python List or C array) Allows fast: Insertion at any location Deletion at any location Lookup of an object by its index Reverse lookup, to determine the index of an object Python's List has O(1) lookup, but O(N) insert, delete, and reverse-lookup. To make reverse lookup O(1) I could maintain a separate Dict mapping objects to indices, but this would cost an additional O(N) on every insertion and deletion. A linked list has O(1) insertion and deletion, but O(N) lookup and O(N) reverse lookup. I could maintain a separate Dict for the forward and reverse mappings, but this would cost O(N) on every insertion and deletion. A standard self-balancing tree cannot be used because the objects are not ordered, and self-balancing trees require ordered keys. I could use the index of an object as the sort key, but then insertion and deletion are O(N) because all subsequent keys must be altered. I could fabricate new sort keys to ensure that insertions occur at the desired location, but then the length of the keys will grow like O(N), making all operations at least O(N). I feel like there should be some kind of standard tree-like data structure that meets my requirements, but I can't find one. Do you know of one? Am I on a unicorn hunt? --Ben ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
James Zaki wrote: Taking a guess with the information you've given perhaps a hash tablehttp://en.wikipedia.org/wiki/Hash_tablecould help? Python uses the term Dict to describe its built-in hash table. I do think a hash table could be helpful, for example, to maintain the reverse lookup mapping if the forward lookup is stored in a List (which is actually python's version of a dynamic array, like C++'s Vector). However, I am still stuck with O(N) performance in this case for insertion and deletion, because each such modification shifts the position of all subsequent objects. This would correspond to changing the value associated with half the keys in the hash table, on average, for each insertion. I was hoping for a structure with log-time performance. --Ben signature.asc Description: OpenPGP digital signature ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
Have you looked into an awk associative array? I understand it is stored internally as a hash table. I remember reading about a simulated multidimensional array (index,subscript); inserting a record in that case didn't involve any reindexing. Older awks were considered slow though, I have no idea if that would be in the running performancewise. Sean On Sun, Jun 7, 2009 at 3:39 PM, Benjamin M. Schwartzbmsch...@fas.harvard.edu wrote: James Zaki wrote: Taking a guess with the information you've given perhaps a hash tablehttp://en.wikipedia.org/wiki/Hash_tablecould help? Python uses the term Dict to describe its built-in hash table. I do think a hash table could be helpful, for example, to maintain the reverse lookup mapping if the forward lookup is stored in a List (which is actually python's version of a dynamic array, like C++'s Vector). However, I am still stuck with O(N) performance in this case for insertion and deletion, because each such modification shifts the position of all subsequent objects. This would correspond to changing the value associated with half the keys in the hash table, on average, for each insertion. I was hoping for a structure with log-time performance. --Ben ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
This http://www.python.org/dev/peps/pep-0372/ might be interesting. Perhaps it could get backported to 2.5. But it still has O(n) deletion. 2009/6/7 Benjamin M. Schwartz bmsch...@fas.harvard.edu: James Zaki wrote: Taking a guess with the information you've given perhaps a hash tablehttp://en.wikipedia.org/wiki/Hash_tablecould help? Python uses the term Dict to describe its built-in hash table. I do think a hash table could be helpful, for example, to maintain the reverse lookup mapping if the forward lookup is stored in a List (which is actually python's version of a dynamic array, like C++'s Vector). However, I am still stuck with O(N) performance in this case for insertion and deletion, because each such modification shifts the position of all subsequent objects. This would correspond to changing the value associated with half the keys in the hash table, on average, for each insertion. I was hoping for a structure with log-time performance. --Ben ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
Lucian Branescu wrote: This http://www.python.org/dev/peps/pep-0372/ might be interesting. Perhaps it could get backported to 2.5. But it still has O(n) deletion. It also doesn't have insertion at all (only append), and indexing (and reverse indexing) is O(n). --Ben signature.asc Description: OpenPGP digital signature ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
hash chaining only approaches order n if the table is very full (because of collisions). On Sun, Jun 7, 2009 at 8:07 AM, Benjamin M. Schwartz bmsch...@fas.harvard.edu wrote: Lucian Branescu wrote: This http://www.python.org/dev/peps/pep-0372/ might be interesting. Perhaps it could get backported to 2.5. But it still has O(n) deletion. It also doesn't have insertion at all (only append), and indexing (and reverse indexing) is O(n). --Ben ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
Would an ordered dictionary otherwise be all right, or am I misunderstanding your requirements? There are other implementations, like http://www.xs4all.nl/~anthon/Python/ordereddict/ 2009/6/7 Benjamin M. Schwartz bmsch...@fas.harvard.edu: Lucian Branescu wrote: This http://www.python.org/dev/peps/pep-0372/ might be interesting. Perhaps it could get backported to 2.5. But it still has O(n) deletion. It also doesn't have insertion at all (only append), and indexing (and reverse indexing) is O(n). --Ben ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
Re: [Sugar-devel] I'm looking for a tree...
Lucian Branescu wrote: Would an ordered dictionary otherwise be all right, or am I misunderstanding your requirements? There are other implementations, like http://www.xs4all.nl/~anthon/Python/ordereddict/ No, an ordered dictionary is not enough. All these ordered dictionaries are ordered either by time or by sorting the keys. Neither will work for me. The problem is simple: if I insert something at position 5, I need the object currently at position 5 to move to position 6, and the object currently at position 6 to move to position 7, etc. To accomplish this in an time-ordered odict, I would have to remove all keys subsequent to the insertion point, make the insertion, and then re-insert those keys. That's O(n). To accomplish this in a sorted-order odict, I would have to generate a new key that causes my insertion to occur at the right location. If there are repeated insertions at the same location, generating such keys becomes an O(n) operation. To see this, suppose I am repeatedly inserting at the second position. Initially, the keys are (0,1). The first insertion has to pick a key between 0 and 1, e.g. 0.5. The second insertion has to pick a key between 0 and 0.5: e.g. 0.25. The number of bits required to store these keys to sufficient precision increases by one bit on each insertion. This means that the length of the keys is O(n), so every comparison is O(n). --Ben signature.asc Description: OpenPGP digital signature ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel
[Sugar-devel] I'm looking for a tree...
I am looking for a fast data structure with the following properties: Maintains an indexed list of arbitrary, non-ordered objects (like a python List or C array) Allows fast: Insertion at any location Deletion at any location Lookup of an object by its index Reverse lookup, to determine the index of an object Python's List has O(1) lookup, but O(N) insert, delete, and reverse-lookup. To make reverse lookup O(1) I could maintain a separate Dict mapping objects to indices, but this would cost an additional O(N) on every insertion and deletion. A linked list has O(1) insertion and deletion, but O(N) lookup and O(N) reverse lookup. I could maintain a separate Dict for the forward and reverse mappings, but this would cost O(N) on every insertion and deletion. A standard self-balancing tree cannot be used because the objects are not ordered, and self-balancing trees require ordered keys. I could use the index of an object as the sort key, but then insertion and deletion are O(N) because all subsequent keys must be altered. I could fabricate new sort keys to ensure that insertions occur at the desired location, but then the length of the keys will grow like O(N), making all operations at least O(N). I feel like there should be some kind of standard tree-like data structure that meets my requirements, but I can't find one. Do you know of one? Am I on a unicorn hunt? --Ben signature.asc Description: OpenPGP digital signature ___ Sugar-devel mailing list Sugar-devel@lists.sugarlabs.org http://lists.sugarlabs.org/listinfo/sugar-devel