Re: [Python-Dev] Document performance requirements?
On Friday 21 July 2006 7:49 am, Nick Coghlan wrote: Neal Becker wrote: For a recent project I needed to select a container. There are plenty of python data structures to choose from. It seems that information on performance is missing (or not easy to find). I think Python should include performance in the documentation of common data structures to help users select the appropriate types. Something in the style of c++ STL. Do you mean absolute performance, or do you mean algorithmic order guarantees? I thought the latter were already documented. . . The latter. Where are they documented? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Hi Giovanni, On Sun, Jul 23, 2006 at 03:30:50PM +0200, Giovanni Bajo wrote: I'm not sure big-O tells the whole truth. For instance, do we want to allow an implementation to use a hash table as underlying type for a list? It would match big-O requirements, but would still be slower than a plain array because of higher overhead of implementation (higher constant factor). A big-O difference can make the difference between a program that takes 0.5 seconds or 2 hours to run. This is more important than a constant factor difference, which different implementations are bound to exhibit anyway. And if this is allowed, I would like to find in CPython tutorials and documentations a simple statement like: to implement the list and match its requirements, CPython choose a simple array as underlying data structure. Yes, the big-O notes don't have to be too technical: the docs should tell people to think about Python lists as simple arrays, and the O requirements follow naturally. A bientot, Armin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Between the two of you, I think you have made the case that the language specification is better to not include such details. As you both note, it is difficult to capture the essence of what is desired from the performance of the implementation. To tag on other version, what about Big-O space concerns with things like list.sort. I'm sure there are other things to add as well. It seems reasonable to me that everyone has the same interests in mind when they write a program. Make it good, make it fast, make it small, etc. These sort of details should work themselves out if they are actually important. All of these algorithms should be treated as implementation accidents. Having the information about CPython's implementation in the docs would be good. And go most of the way towards having everyone on the same page. -- Scott Dial [EMAIL PROTECTED] [EMAIL PROTECTED] ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Armin Rigo wrote: I think that O-wise the current CPython situation should be documented as a minimal requirement for implementations of the language, with just one exception: the well-documented don't rely on this hack in 2.4 to make repeated 'str += str' amortized linear, for which the 2.3 quadratic behavior is considered compliant enough. I suppose that allowing implementations to provide better algorithmic complexities than required is fine, although I can think of some problems with that (e.g. nice and efficient user code that would perform horribly badly on CPython). I'm not sure big-O tells the whole truth. For instance, do we want to allow an implementation to use a hash table as underlying type for a list? It would match big-O requirements, but would still be slower than a plain array because of higher overhead of implementation (higher constant factor). And if this is allowed, I would like to find in CPython tutorials and documentations a simple statement like: to implement the list and match its requirements, CPython choose a simple array as underlying data structure. -- Giovanni Bajo ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Jason Orendorff wrote: On 7/21/06, Nick Coghlan [EMAIL PROTECTED] wrote: However, I'm also struggling to think of a case other than list vs deque where the choice of a builtin or standard library data structure would be dictated by big-O() concerns. OK, but that doesn't mean the information is unimportant. +1 on making this something of a priority. People looking for this info should find it in the obvious place. Some are unobvious. (How fast is dict.__eq__ on average? Worst case?) Contributions are welcome. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Hi, On Sat, Jul 22, 2006 at 12:33:45PM +1000, Nick Coghlan wrote: Agreed, but there's more to doing that than just writing down the O() implied by the current CPython implementation - it's up to Guido to decide which of the constraints are part of the language definition, and which are implementation accidents I think that O-wise the current CPython situation should be documented as a minimal requirement for implementations of the language, with just one exception: the well-documented don't rely on this hack in 2.4 to make repeated 'str += str' amortized linear, for which the 2.3 quadratic behavior is considered compliant enough. I suppose that allowing implementations to provide better algorithmic complexities than required is fine, although I can think of some problems with that (e.g. nice and efficient user code that would perform horribly badly on CPython). Armin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Document performance requirements?
For a recent project I needed to select a container. There are plenty of python data structures to choose from. It seems that information on performance is missing (or not easy to find). I think Python should include performance in the documentation of common data structures to help users select the appropriate types. Something in the style of c++ STL. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Neal Becker wrote: For a recent project I needed to select a container. There are plenty of python data structures to choose from. It seems that information on performance is missing (or not easy to find). I think Python should include performance in the documentation of common data structures to help users select the appropriate types. Something in the style of c++ STL. Do you mean absolute performance, or do you mean algorithmic order guarantees? I thought the latter were already documented. . . Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Nick Coghlan wrote: Neal Becker wrote: For a recent project I needed to select a container. There are plenty of python data structures to choose from. It seems that information on performance is missing (or not easy to find). I think Python should include performance in the documentation of common data structures to help users select the appropriate types. Something in the style of c++ STL. Do you mean absolute performance, or do you mean algorithmic order guarantees? I thought the latter were already documented. . . The latter. Where is it documented? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Neal Becker wrote: On Friday 21 July 2006 7:49 am, Nick Coghlan wrote: Neal Becker wrote: For a recent project I needed to select a container. There are plenty of python data structures to choose from. It seems that information on performance is missing (or not easy to find). I think Python should include performance in the documentation of common data structures to help users select the appropriate types. Something in the style of c++ STL. Do you mean absolute performance, or do you mean algorithmic order guarantees? I thought the latter were already documented. . . The latter. Where are they documented? Just because I think something, it doesn't mean it's true :) The only reference I can actually find is the one in the collections module docs pointing out that collections.deque permits O(1) insertions and removals at the beginning of the sequence, as well as at the end (whereas lists are O(n) for operations at the beginning due to the resulting memory copying). However, I'm also struggling to think of a case other than list vs deque where the choice of a builtin or standard library data structure would be dictated by big-O() concerns. list vs array.array is based on memory efficiency list vs deque is based on whether or not you need O(1) push/pop at both ends list vs set is based on whether or not ordering matters set vs dict is based on whether or not you need to map keys to values Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Document performance requirements?
On 7/21/06, Nick Coghlan [EMAIL PROTECTED] wrote: However, I'm also struggling to think of a case other than list vs deque where the choice of a builtin or standard library data structure would be dictated by big-O() concerns. OK, but that doesn't mean the information is unimportant. +1 on making this something of a priority. People looking for this info should find it in the obvious place. Some are unobvious. (How fast is dict.__eq__ on average? Worst case?) -j ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Jason Orendorff wrote: However, I'm also struggling to think of a case other than list vs deque where the choice of a builtin or standard library data structure would be dictated by big-O() concerns. OK, but that doesn't mean the information is unimportant. +1 on making this something of a priority. People looking for this info should find it in the obvious place. Some are unobvious. (How fast is dict.__eq__ on average? Worst case?) I also found out that most people tend to think of Python's lists as a magical data structure optimized for many operations (like a rope or something complex like that). Documenting that it's just a bare vector (std::vector in C++) would be of great help. -- Giovanni Bajo ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
On Jul 21, 2006, at 12:45 PM, Giovanni Bajo wrote: Jason Orendorff wrote: However, I'm also struggling to think of a case other than list vs deque where the choice of a builtin or standard library data structure would be dictated by big-O() concerns. OK, but that doesn't mean the information is unimportant. +1 on making this something of a priority. People looking for this info should find it in the obvious place. Some are unobvious. (How fast is dict.__eq__ on average? Worst case?) I also found out that most people tend to think of Python's lists as a magical data structure optimized for many operations (like a rope or something complex like that). Documenting that it's just a bare vector (std::vector in C++) would be of great help. Indeed, I was talking to someone a while back who thought that lists were magically hashed, in that he did something like: dictionary = open(/usr/share/dict/words).readlines() and then expected: word in dictionary would be fast. And was very surprised when it turned out to be slow a linear search of the list. :) James ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Document performance requirements?
Jason Orendorff wrote: On 7/21/06, Nick Coghlan [EMAIL PROTECTED] wrote: However, I'm also struggling to think of a case other than list vs deque where the choice of a builtin or standard library data structure would be dictated by big-O() concerns. OK, but that doesn't mean the information is unimportant. +1 on making this something of a priority. People looking for this info should find it in the obvious place. Some are unobvious. (How fast is dict.__eq__ on average? Worst case?) Agreed, but there's more to doing that than just writing down the O() implied by the current CPython implementation - it's up to Guido to decide which of the constraints are part of the language definition, and which are implementation accidents (e.g. CPython's list.sort() operation was stable for at least one release before GvR made stability part of the definition of the method at the language level). Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com