Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
Gregory P. Smith wrote:

>> I've never liked the "".join([]) idiom for string concatenation; in my opinion it violates the principles "Beautiful is better than ugly." and "There should be one-- and preferably only one --obvious way to do it.". (And perhaps several others.) To that end I've submitted patch #1569040 to SourceForge:
>>
>> http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470
>>
>> This patch speeds up using + for string concatenation.
>
> yay! i'm glad to see this. i hate the "".join syntax. i still write that as string.join() [...]

instance.method(*args) == type.method(instance, *args)

You can nowadays spell this as str.join("", lst) - no need to import a whole module!

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
Steve Holden wrote:

> instance.method(*args) == type.method(instance, *args)
>
> You can nowadays spell this as str.join("", lst) - no need to import a whole module!

except that str.join isn't polymorphic:

    >>> str.join(u",", ["1", "2", "3"])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: descriptor 'join' requires a 'str' object but received a 'unicode'
    >>> string.join(["1", "2", "3"], u",")
    u'1,2,3'

/F
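A minimal sketch of the equivalence and its limit, written for present-day Python 3 rather than the 2.x interpreters used in this thread. That is an assumption: Python 3 has no separate `unicode` type, so the closest analogue of the failure shown above is passing a `bytes` separator to the unbound `str.join`.

```python
parts = ["1", "2", "3"]

# instance.method(*args) is equivalent to type.method(instance, *args)
assert ",".join(parts) == str.join(",", parts)

# The unbound form insists that its first argument is a str instance,
# so it is not polymorphic across string-like types.
try:
    str.join(b",", parts)
except TypeError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)
```

Calling the method on the separator itself (`sep.join(parts)`) lets the separator's own type pick the implementation, which is the polymorphism the unbound spelling gives up.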
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
    Greg> have you run any generic benchmarks such as pystone to get a
    Greg> better idea of what the net effect on typical python code is?

MAL's pybench would probably be better for this presuming it does some addition with string operands.

Skip
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
[EMAIL PROTECTED] wrote:

>     Greg> have you run any generic benchmarks such as pystone to get a
>     Greg> better idea of what the net effect on typical python code is?
>
> MAL's pybench would probably be better for this presuming it does some addition with string operands.

or stringbench.

/F
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
Gregory P. Smith wrote:

>> I've never liked the "".join([]) idiom for string concatenation; in my opinion it violates the principles "Beautiful is better than ugly." and "There should be one-- and preferably only one --obvious way to do it.". (And perhaps several others.) To that end I've submitted patch #1569040 to SourceForge:
>>
>> http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470
>>
>> This patch speeds up using + for string concatenation.
>
> yay! i'm glad to see this. i hate the "".join syntax. i still write that as string.join() because that's at least readable). it also fixes the python idiom for fast string concatenation as intended; anyone who's ever written code that builds a large string value by pushing substrings into a list only to call join later should agree.

Well I always like things to run faster, but I disagree that this idiom is broken.

I like using lists to store substrings and I think it's just a matter of changing your frame of reference in how you think about them. For example it doesn't bother me to have a numeric type with many digits, and to have lists of many, many-digit numbers, and work with those. Working with lists of many-character strings is not that different. I've even come to the conclusion (just my opinion) that mutable lists of strings probably would work better than a long mutable string of characters in most situations.

What I've found is there seems to be an optimum string length depending on what you are doing. Too long (hundreds or thousands of characters) and repeating some string operations (not just concatenations) can be slow (relative to short strings), and using many short (single character) strings would use more memory than is needed. So a list of medium length strings is actually a very nice compromise. I'm not sure what the optimal string length is, but lines of about 80 columns seem to work very well for most things.
I think what may be missing is a larger set of higher level string functions that will work with lists of strings directly. Then lists of strings can be thought of as a mutable string type by its use, and then working with substrings in lists and using ''.join() will not seem as out of place. So maybe instead of splitting, modifying, then joining (and again, etc ...), just pass the whole list around and have operations that work directly on the list of strings and return a list of strings as the result. That is pretty much what the patch does under the covers, but it only works with concatenation. Having more functions that work with lists of strings directly will reduce the need for concatenation as well.

Some operations that could work well with whole lists of lines may be indent_lines, dedent_lines, prepend_lines, wrap_lines, and of course join_lines as in '\n'.join(L), the inverse of s.splitlines(), and there are also readlines() and writelines(). Also possibly find_line or find_in_lines(). These really shouldn't seem any more out of place than numeric operations that work with lists such as sum, max, and min.

So to me... "".join(L) as a string operation that works on a list of strings seems perfectly natural. :-)

Cheers,
 Ron
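To make the proposal concrete, here is a rough sketch of what a few of these list-of-lines helpers might look like. The function names come from the message above, but the signatures and behaviour are assumptions for illustration only; no such functions exist in the standard `string` module. Lines are assumed to carry no trailing newline.

```python
import textwrap


def indent_lines(lines, prefix="    "):
    """Return a new list with prefix added to every non-blank line."""
    return [prefix + line if line.strip() else line for line in lines]


def dedent_lines(lines):
    """Return a new list with the common leading whitespace removed."""
    if not lines:
        return []
    return textwrap.dedent("\n".join(lines)).split("\n")


def prepend_lines(lines, text):
    """Return a new list with text added in front of every line."""
    return [text + line for line in lines]


def join_lines(lines):
    """Inverse of str.splitlines() for lines without terminators."""
    return "\n".join(lines)


def find_in_lines(lines, needle):
    """Return the index of the first line containing needle, or -1."""
    for index, line in enumerate(lines):
        if needle in line:
            return index
    return -1


body = ["def f():", "    return 1"]
indented = indent_lines(body)
quoted = join_lines(prepend_lines(body, "> "))
```

Each helper takes a list and returns a list (or an index), so calls can be chained without joining and re-splitting in between; only `join_lines` produces a single string.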
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
Ron Adam wrote:

> I think what may be missing is a larger set of higher level string functions that will work with lists of strings directly. Then lists of strings can be thought of as a mutable string type by its use, and then working with substrings in lists and using ''.join() will not seem as out of place.

as important is the observation that you don't necessarily have to join string lists; if the data ends up being sent over a wire or written to disk, you might as well skip the join step, and work directly from the list.

(it's no accident that ET has grown "tostringlist" and "fromstringlist" functions, for example ;-)

/F
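A small sketch of skipping the join step, assuming a current Python 3 where `xml.etree.ElementTree` provides `tostringlist` and `fromstringlist`. The element names and the in-memory `io.StringIO` target are made up for the example; a real program would write to a file or socket instead.

```python
import io
import xml.etree.ElementTree as ET

# Build a tiny tree to serialize.
root = ET.Element("doc")
ET.SubElement(root, "item").text = "hello"

# Serialize to a list of string fragments instead of one big string.
chunks = ET.tostringlist(root, encoding="unicode")

# Write the fragments straight to the output; no "".join() needed.
out = io.StringIO()
out.writelines(chunks)
serialized = out.getvalue()

# The same list of fragments can be fed back to the parser directly.
reparsed = ET.fromstringlist(chunks)
```

`writelines` accepts any iterable of strings and adds no separators, so the output is the same text a single joined string would have produced.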
[Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
I was looking at the logs for classobject.c and noticed this commit that adds Py_TPFLAGS_HAVE_WEAKREFS to the instance type. Should it be backported to 2.4? (It looks to me like it should, but I don't know anything about weakref implementation and want to get approval from someone who knows.)

--amk

r39038 | rhettinger | 2005-06-19 04:42:20 -0400 (Sun, 19 Jun 2005) | 2 lines

Insert missing flag.

Index: classobject.c
===================================================================
--- classobject.c (revision 39037)
+++ classobject.c (revision 39038)
@@ -2486,7 +2486,7 @@
  (getattrofunc)instancemethod_getattro, /* tp_getattro */
  PyObject_GenericSetAttr, /* tp_setattro */
  0, /* tp_as_buffer */
- Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, /* tp_flags */
+ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_HAVE_WEAKREFS, /* tp_flags */
  instancemethod_doc, /* tp_doc */
  (traverseproc)instancemethod_traverse, /* tp_traverse */
  0, /* tp_clear */

svn merge -r 39037:39038 svn+ssh://[EMAIL PROTECTED]/python/trunk
Re: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
No need to backport. Py_TPFLAGS_DEFAULT implies Py_TPFLAGS_HAVE_WEAKREFS. The change was for clarity -- most things that have the weakref slots filled-in will also make the flag explicit -- that makes it easier on the brain when verifying code that checks the weakref flag.

Raymond

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of A.M. Kuchling
Sent: Friday, October 06, 2006 6:41 AM
To: python-dev@python.org
Subject: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?

I was looking at the logs for classobject.c and noticed this commit that adds Py_TPFLAGS_HAVE_WEAKREFS to the instance type. Should it be backported to 2.4? (It looks to me like it should, but I don't know anything about weakref implementation and want to get approval from someone who knows.)

--amk

r39038 | rhettinger | 2005-06-19 04:42:20 -0400 (Sun, 19 Jun 2005) | 2 lines

Insert missing flag.

Index: classobject.c
===================================================================
--- classobject.c (revision 39037)
+++ classobject.c (revision 39038)
@@ -2486,7 +2486,7 @@
  (getattrofunc)instancemethod_getattro, /* tp_getattro */
  PyObject_GenericSetAttr, /* tp_setattro */
  0, /* tp_as_buffer */
- Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, /* tp_flags */
+ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_HAVE_WEAKREFS, /* tp_flags */
  instancemethod_doc, /* tp_doc */
  (traverseproc)instancemethod_traverse, /* tp_traverse */
  0, /* tp_clear */

svn merge -r 39037:39038 svn+ssh://[EMAIL PROTECTED]/python/trunk
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
Fredrik Lundh [EMAIL PROTECTED] wrote:

> Ron Adam wrote:
>
>> I think what may be missing is a larger set of higher level string functions that will work with lists of strings directly. Then lists of strings can be thought of as a mutable string type by its use, and then working with substrings in lists and using ''.join() will not seem as out of place.
>
> as important is the observation that you don't necessarily have to join string lists; if the data ends up being sent over a wire or written to disk, you might as well skip the join step, and work directly from the list.
>
> (it's no accident that ET has grown "tostringlist" and "fromstringlist" functions, for example ;-)

I've personally added a line-based abstraction with indent/dedent handling, etc., for the editor I use, which helps make macros and underlying editor functionality easier to write.

 - Josiah
Re: [Python-Dev] 2.4.4: backport classobject.c HAVE_WEAKREFS?
On Fri, Oct 06, 2006 at 08:48:15AM -0700, Raymond Hettinger wrote:

> The change was for clarity -- most things that have the weakref slots filled-in will also make the flag explicit -- that makes it easier on the brain when verifying code that checks the weakref flag.

OK; I won't backport this. Thanks!

--amk
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
On 10/6/06, Fredrik Lundh [EMAIL PROTECTED] wrote:

> Ron Adam wrote:
>
>> I think what may be missing is a larger set of higher level string functions that will work with lists of strings directly. Then lists of strings can be thought of as a mutable string type by its use, and then working with substrings in lists and using ''.join() will not seem as out of place.
>
> as important is the observation that you don't necessarily have to join string lists; if the data ends up being sent over a wire or written to disk, you might as well skip the join step, and work directly from the list.
>
> (it's no accident that ET has grown "tostringlist" and "fromstringlist" functions, for example ;-)

The "just make lists" paradigm is used by Erlang too, it's called "iolist" there (it's not a type, just a convention). The lists can be nested though, so concatenating chunks of data for IO is always a constant time operation even if the chunks are already iolists.

-bob
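A rough Python analogue of the nested-list convention described above. This is only a sketch: real Erlang iolists may also contain binaries and integers, while this version assumes every leaf is a `str`, and the helper name `iter_chunks` and the sample data are invented for the example.

```python
def iter_chunks(iolist):
    """Yield the string leaves of an arbitrarily nested list, in order."""
    for item in iolist:
        if isinstance(item, str):
            yield item
        else:
            yield from iter_chunks(item)


header = ["HTTP/1.1 200 OK\r\n", ["Content-Length: ", "5", "\r\n"], "\r\n"]
body = ["hel", "lo"]

# Combining two existing nested lists is just building a two-element list;
# neither operand is copied or traversed at this point.
response = [header, body]

# The nesting is only walked once, when the data is finally emitted.
flat = "".join(iter_chunks(response))
```

The constant-time property comes from deferring the traversal: building `response` costs the same no matter how large `header` and `body` already are.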
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
Josiah Carlson wrote:

> Fredrik Lundh [EMAIL PROTECTED] wrote:
>
>> Ron Adam wrote:
>>
>>> I think what may be missing is a larger set of higher level string functions that will work with lists of strings directly. Then lists of strings can be thought of as a mutable string type by its use, and then working with substrings in lists and using ''.join() will not seem as out of place.
>>
>> as important is the observation that you don't necessarily have to join string lists; if the data ends up being sent over a wire or written to disk, you might as well skip the join step, and work directly from the list.
>>
>> (it's no accident that ET has grown "tostringlist" and "fromstringlist" functions, for example ;-)
>
> I've personally added a line-based abstraction with indent/dedent handling, etc., for the editor I use, which helps make macros and underlying editor functionality easier to write.
>
>  - Josiah

I did the same thing just last week. I've started to collect them into a module called stringtools, but I see no reason why they can't reside in the string module.

I think this may be just a case of collecting these types of routines together in one place so they can be reused easily, because they are already scattered around Python's library in some form or another.

Another tool I found tucked away is the console pager that is used in pydoc. I think it could easily be a separate module itself. And it benefits from the line-based abstraction as well.

Cheers,
 Ron
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
On 6 Oct 2006, at 12:37, Ron Adam wrote:

>> I've never liked the "".join([]) idiom for string concatenation; in my opinion it violates the principles "Beautiful is better than ugly." and "There should be one-- and preferably only one --obvious way to do it.".
> ...
> Well I always like things to run faster, but I disagree that this idiom is broken.
>
> I like using lists to store substrings and I think it's just a matter of changing your frame of reference in how you think about them.

I think that you've hit on exactly the reason why this patch is a good idea. You happen to like to store strings in lists, and in many situations this is a fine thing to do, but if one is forced to change one's frame of reference in order to get decent performance then, as well as violating the maxims Larry originally cited, you're also hitting both "readability counts" and "Correctness and clarity before speed."

The "".join(L) idiom is not broken in the sense that, to the fluent Python programmer, it does convey the intent as well as the action. That said, there are plenty of places that you'll see it not being used because it fails to convey the intent. It's pretty rare to see someone write:

    for k,v in d.items():
        print " has value: ".join([k,v])

but, despite the utility of the % operator on strings, it's pretty common to see:

        print k + " has value: " + v

This patch _seems_ to be able to provide better performance for this sort of usage and provide a major speed-up for some other common usage forms without causing the programmer to resort to making their code more complicated. The cost seems to be a small memory hit on the size of a string object, a tiny increase in code size and some well isolated, under-the-hood complexity.

It's not like having this patch is going to force anyone to change the way they write their code. As far as I can tell it simply offers better performance if you choose to express your code in some common ways. If it speeds up pystone by 5.5% with such minimal down side I'm hard pressed to see a reason not to use it.

Cheers,
 Nicko
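For readers who want to try the comparison themselves, here is a hedged sketch in present-day Python 3 (an assumption; the thread concerns 2.x). It shows the three spellings discussed above producing the same string, plus a small `timeit` harness for building a long string with `+=` versus `"".join`. The helper names and sizes are made up, it does not exercise the patch under discussion, and no timings are claimed, since results depend heavily on the interpreter and version.

```python
import timeit

d = {"spam": "eggs"}

# Three ways to build the same message from a key and a value.
for k, v in d.items():
    by_join = " has value: ".join([k, v])
    by_plus = k + " has value: " + v
    by_format = "%s has value: %s" % (k, v)


def build_with_plus(n):
    """Accumulate n fragments with repeated concatenation."""
    s = ""
    for i in range(n):
        s += str(i)
    return s


def build_with_join(n):
    """Collect n fragments in a list and join them once."""
    return "".join([str(i) for i in range(n)])


if __name__ == "__main__":
    for fn in (build_with_plus, build_with_join):
        seconds = timeit.timeit(lambda: fn(10_000), number=20)
        print(f"{fn.__name__}: {seconds:.4f}s")
```

Both builders return identical strings, so any difference the harness reports is purely a matter of how the interpreter handles each spelling.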
[Python-Dev] Weekly Python Patch/Bug Summary
Patch / Bug Summary
___________________

Patches :  428 open ( +6) /  3417 closed ( +2) /  3845 total ( +8)
Bugs    :  939 open ( +6) /  6229 closed (+17) /  7168 total (+23)
RFE     :  240 open ( +3) /   239 closed ( +0) /   479 total ( +3)

New / Reopened Patches
______________________

Speed up using + for string concatenation (2006-10-02)
    http://python.org/sf/1569040 opened by Larry Hastings

Speed-up in array_repeat() (2006-10-02)
    http://python.org/sf/1569291 opened by Lars Skovlund

Fix building the source within exec_prefix (2006-10-03)
    http://python.org/sf/1569798 opened by Matthias Klose

distutils - python 2.5 vc8 - non working setup (2006-10-03)
CLOSED http://python.org/sf/1570119 opened by Grzegorz Makarewicz

Fix for compilation errors in the 2.4 branch (2006-10-03)
CLOSED http://python.org/sf/1570253 opened by Žiga Seilnacht

qtsupport.py mistake leads to bad _Qt module (2006-10-04)
    http://python.org/sf/1570672 opened by Jeff Senn

Generate numeric/space/linebreak from Unicode database. (2006-10-05)
    http://python.org/sf/1571184 opened by Anders Chrigström

make trace.py --ignore-dir work (2006-10-05)
    http://python.org/sf/1571379 opened by Clinton Roy

Patches Closed
______________

distutils - python 2.5 vc8 - non working setup (2006-10-03)
    http://python.org/sf/1570119 closed by loewis

Fix for compilation errors in the 2.4 branch (2006-10-03)
    http://python.org/sf/1570253 closed by loewis

New / Reopened Bugs
___________________

Test for uintptr_t seems to be incorrect (2006-10-01)
CLOSED http://python.org/sf/1568842 opened by Ronald Oussoren

http redirect does not pass 'post' data (2006-10-02)
CLOSED http://python.org/sf/1568897 opened by hans_moleman

'all' documentation missing online (2006-09-26)
CLOSED http://python.org/sf/1565797 reopened by aisaac0

Using .next() on file open in write mode writes junk to file (2006-10-01)
    http://python.org/sf/1569057 opened by andrei kulakov

External codecs no longer usable (2006-10-02)
CLOSED http://python.org/sf/1569084 opened by Ivan Vilata i Balaguer

sys.settrace cause curried parms to show up as attributes (2006-10-02)
    http://python.org/sf/1569356 opened by applebucks

sys.settrace cause curried parms to show up as attributes (2006-10-02)
CLOSED http://python.org/sf/1569374 opened by applebucks

PGIRelease linkage fails on pgodb80.dll (2006-10-02)
    http://python.org/sf/1569517 opened by Coatimundi

Backward incompatibility in logging.py (2006-10-02)
CLOSED http://python.org/sf/1569622 opened by Mike Klaas

datetime.datetime subtraction bug (2006-10-02)
CLOSED http://python.org/sf/1569623 opened by David Fugate

mailbox.Maildir.get_folder() loses factory information (2006-10-03)
    http://python.org/sf/1569790 opened by Matthias Klose

distutils don't respect standard env variables (2006-10-03)
CLOSED http://python.org/sf/1569886 opened by Lukas Lalinsky

2.5 incorrectly permits break inside try statement (2006-10-04)
CLOSED http://python.org/sf/1569998 opened by Nick Coghlan

redirected cookies (2006-10-04)
    http://python.org/sf/1570255 opened by hans_moleman

Launcher reset to factory button provides bad command-line (2006-10-03)
    http://python.org/sf/1570284 opened by jjackson

2.4 2.5 can't create win installer on linux (2006-10-04)
    http://python.org/sf/1570417 opened by Richard Jones

_ssl module can't be built on windows (2006-10-05)
CLOSED http://python.org/sf/1571023 opened by Žiga Seilnacht

simple moves freeze IDLE (2006-10-04)
    http://python.org/sf/1571112 opened by Douglas W. Goodall

Some numeric characters are still not recognized (2006-10-05)
    http://python.org/sf/1571170 opened by Anders Chrigström

round() producing -0.0 (2006-10-05)
CLOSED http://python.org/sf/1571620 opened by Ron Frye

Building using Sleepycat db 4.5.20 is broken (2006-10-05)
    http://python.org/sf/1571754 opened by Robert Scheck

email module does not complay with RFC 2046: CRLF issue (2006-10-05)
    http://python.org/sf/1571841 opened by Andy Leszczynski

.eml attachments in email (2006-10-06)
    http://python.org/sf/1572084 opened by rainwolf8472

parser stack overflow (2006-10-06)
    http://python.org/sf/1572320 opened by jürgen urner

csv dialect = 'excel-tab' to use excel_tab (2006-10-06)
    http://python.org/sf/1572471 opened by Dan Goldner

Bugs Closed
___________

Test for uintptr_t seems to be incorrect (2006-10-01)
    http://python.org/sf/1568842 closed by loewis

http redirect does not pass 'post' data (2006-10-01)
    http://python.org/sf/1568897 closed by loewis

Spurious Tabnanny error (2006-09-21)
    http://python.org/sf/1562716 closed by kbk

Spurious Tab/space error (2006-09-21)
    http://python.org/sf/1562719 closed by kbk

plistlib should be moved out of plat-mac (2003-07-29)
Re: [Python-Dev] PATCH submitted: Speed up + for string concatenation, now as fast as .join(x) idiom
Nicko van Someren wrote:

> On 6 Oct 2006, at 12:37, Ron Adam wrote:
>
>>> I've never liked the "".join([]) idiom for string concatenation; in my opinion it violates the principles "Beautiful is better than ugly." and "There should be one-- and preferably only one --obvious way to do it.".
>> ...
>> Well I always like things to run faster, but I disagree that this idiom is broken.
>>
>> I like using lists to store substrings and I think it's just a matter of changing your frame of reference in how you think about them.
>
> I think that you've hit on exactly the reason why this patch is a good idea. You happen to like to store strings in lists, and in many situations this is a fine thing to do, but if one is forced to change one's frame of reference in order to get decent performance then, as well as violating the maxims Larry originally cited, you're also hitting both "readability counts" and "Correctness and clarity before speed."

The statement ".. if one is forced to change .." is a bit overstated I think. The situation is more a matter of increasing awareness so the frame of reference comes to mind more naturally and doesn't seem forced. And the suggestion of how to do that is by adding additional functions and methods that can use lists-of-strings instead of having to join or concatenate them first. Added examples and documentation can do that as well.

The two ideas are non-competing. They are related because they realize their benefits by reducing redundant underlying operations in a similar way.

Cheers,
 Ron