[Steve Holden]
> The collective brainpower that's been exercised on this one enhancement
> already must be phenomenal, but the proposal still isn't perfect.

Sure it is :-)  It was never intended to replace all string parsing
functions, existing or contemplated. We still have str.index() so people
can do low-level index tracking, optimization, or whatnot. Likewise,
str.split() and regexps remain better choices for some apps.

The discussion has centered around the performance cost of returning three
strings when two or fewer are actually used.

From that discussion, we can immediately eliminate the center string case,
as it is essentially cost-free (it is either an empty string or a reference
to, not a copy of, the separator argument).

Another case that is no cause for concern is when one of the substrings is
often, but not always, empty. Consider comment stripping, for example:

    # XXX a real parser would need to skip over # in string literals
    for line in open('demo.py'):
        line, sep, comment = line.partition('#')
        print(line)

On most lines, the comment string is empty, so no time is lost copying a
long substring that won't be used. On the lines with a comment, I like
having it because it makes the code easier to debug/maintain (making it
trivial to print, log, or store the comment string).

Similar logic applies to other cases where the presence of a substring is
an all-or-nothing proposition, such as cgi scripts extracting the command
string when present:

    line, cmdfound, command = line.rpartition('?')

If not found, you've wasted nothing (the command string is empty). If
found, you've gotten what you were going to slice out anyway.

Also, there are plenty of use cases that only involve short strings
(parsing URLs, file paths, splitting name/value pairs, etc). The cost of
ignoring a component for these short inputs is small.

That leaves the case where the strings are long and parsing is repeated
with the same separator. The answer here is to simply NOT use partition().
Don't write:

    while s:
        line, _, s = s.partition(sep)
        . . .

Instead, you almost always do better with

    for line in s.split(sep):
        . . .

or with re.finditer() if memory consumption is an issue.

Remember, adding partition() doesn't take away anything else you have now
(even if str.find() disappears, you still have str.index()). Also, its
inclusion does not preclude more specialized methods like str.before(sep)
or str.after(sep) if someone is able to prove their worth.

What str.partition() does do well is simplify code by encapsulating several
variations of a common multi-step, low-level programming pattern. It should
be accepted on that basis rather than being rejected because it doesn't
also replace re.finditer() or str.split().

Because there are so many places where partition() is a clear improvement,
I'm not bothered when someone concocts a case where it isn't the tool of
choice. Accept it for what it is, not what it is not.


Raymond
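
For readers who want to try the cases discussed in this message, here is a minimal sketch written against the str.partition() and str.rpartition() methods as they were eventually released. The helper names (strip_comment, query_string, lines_by_partition) and the example.com URLs in the usage note are hypothetical and not part of the original post. One assumption worth flagging: in released Python, rpartition() returns ('', '', s) when the separator is absent, so the sketch checks the middle element rather than relying on the last element being empty.

```python
def strip_comment(line):
    # Naive comment stripping: like the loop in the message, this does not
    # skip over '#' characters inside string literals.
    code, sep, comment = line.partition('#')
    return code, comment


def query_string(url):
    # Hypothetical helper: extract the part after the last '?'.
    # In released Python, a missing separator yields ('', '', url),
    # so test the middle element before trusting the tail.
    head, found, command = url.rpartition('?')
    return command if found else ''


def lines_by_partition(s, sep):
    # The repeated-partition loop the message advises against for long
    # inputs: every pass builds a new string holding the remaining tail.
    out = []
    while s:
        line, _, s = s.partition(sep)
        out.append(line)
    return out
```

As a usage note, strip_comment('x = 1') leaves the comment empty, query_string('http://example.com/run') returns an empty string, and for an input such as 'a,b,c' the partition loop produces the same pieces as 'a,b,c'.split(','). The two approaches can differ on edge cases such as an empty string or a trailing separator, so they are not drop-in replacements for each other.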