[issue7008] str.title() misbehaves with apostrophes

2009-09-28 Thread Raymond Hettinger

Raymond Hettinger rhettin...@users.sourceforge.net added the comment:

I agree with the OP that str.title should be made smarter.  As it
stands, it is a likely bug factory that would pass unittests, then
generate unpleasant results with real user inputs.

Extending on Thomas's comment, I think string.capwords() needs to be
deprecated and eliminated.  It is an egregious hack that has unfortunate
effects such as collapsing runs of repeated spaces and incorrectly
handling strings in quotes.

As it stands, we have two methods, neither of which quite does what we
would really want in a title-casing method (correct handling of
apostrophes and quotation marks, keeping the string length unchanged, and
only changing the desired letters from lower to uppercase with no other
side effects).
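For illustration, both behaviours described above can be reproduced with the standard library alone. This is only a minimal demonstration (the sample strings are made up), not a proposed fix:

```python
import string

# str.title() treats the apostrophe as a word boundary, so the letter
# after it is capitalized as if it started a new word.
titled = "they're bill's friends".title()
print(titled)  # They'Re Bill'S Friends

# string.capwords() capitalizes each whitespace-separated chunk, so a
# chunk that begins with a quotation mark keeps its first letter lowercase.
quoted = string.capwords('"hello world"')
print(quoted)  # "hello World"
```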

--
nosy: +rhettinger
versions: +Python 2.7, Python 3.2 -Python 2.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7008
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue7012] Tabs is better than spaces for identation

2009-09-28 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

Wow. You (rohdef) really do sound like you are a time capsule from the 
eighties. Tabs would save keystrokes and bandwidth, and are not 
confusing? The keystrokes argument is wrong for most editors; the 
bandwidth argument doesn't matter due to disk size, network speed, and 
compression; and the confusion is absolutely real. Pretty much every 
time I volunteer to help out a group of Python newbies there is at least 
one baffling problem due to tabs/spaces. There are hundreds of different 
text editors that people use on a regular basis to edit Python source 
code. They all display spaces the same way; not so for tabs. Most of 
them have configurable behavior for tabs, and most of the time the users 
are not aware of even the existence of those settings, let alone what 
setting is currently being used.

--
nosy: +gvanrossum

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7012
___



[issue7008] str.title() misbehaves with apostrophes

2009-09-28 Thread Thomas W. Barr

Thomas W. Barr t...@rice.edu added the comment:

If correct handling of apostrophes and quotation marks, keeping the
string length unchanged, and only changing desired letters from lower to
uppercase with no other side effects are the criteria we want, then
what I suggested (toupper() the first character, and any character that
follows a space or punctuation character) should work. (Unless I'm
missing something.) Do we want to tolower() all other characters, as
the interpreter does now?

I can make a test and patch for this if this is what we decide.
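A literal reading of that rule might be sketched as follows. The helper name `simple_title` and the choice of ASCII `string.punctuation` as the punctuation set are assumptions made for illustration, and the sketch leaves all other characters untouched rather than lowercasing them:

```python
import string

# Assumption: "space or punctuation" means ASCII whitespace and punctuation.
BOUNDARIES = set(string.whitespace + string.punctuation)

def simple_title(text):
    """Uppercase the first character and any character after a boundary."""
    out = []
    at_boundary = True  # the first character counts as following a boundary
    for ch in text:
        out.append(ch.upper() if at_boundary else ch)
        at_boundary = ch in BOUNDARIES
    return "".join(out)

print(simple_title("o'brien"))       # O'Brien
print(simple_title("they're here"))  # They'Re Here
print(simple_title("RSVP now"))      # RSVP Now
```

Because the apostrophe is itself punctuation, this reading still turns "they're" into "They'Re", while whitespace runs and existing capitals are preserved. The length is unchanged for ASCII input; a few non-ASCII characters expand when uppercased.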

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7008
___



[issue7008] str.title() misbehaves with apostrophes

2009-09-28 Thread Raymond Hettinger

Raymond Hettinger rhettin...@users.sourceforge.net added the comment:

I'm still researching what other languages do.  MS-Excel matches what
Python currently does.  Django uses the Python version and then fixes up
apostrophe errors:

    title = lambda value: re.sub("([a-z])'([A-Z])", lambda m:
        m.group(0).lower(), value.title())
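Written out as a runnable function, the Django-style fix-up quoted above might look like the following. The quoting of the regular expression is reconstructed here and the function name is hypothetical:

```python
import re

def title_fix_apostrophes(value):
    # Title-case with str.title(), then lowercase any x'Y sequence in which a
    # lowercase letter precedes the apostrophe (e.g. "They'Re" -> "They're").
    return re.sub("([a-z])'([A-Z])", lambda m: m.group(0).lower(), value.title())

print(title_fix_apostrophes("they're bill's friends"))  # They're Bill's Friends
print(title_fix_apostrophes("o'brien"))                 # O'Brien
```

The pattern only fires when the character before the apostrophe is lowercase, which is why "O'Brien" keeps its capital B.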

It would also be nice to handle hyphenates like xray --> X-ray.

Am thinking that it would be nice if the user could pass in an optional
argument to list all desired characters to prevent transitions (such as
apostrophes and hyphens).
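One way such an optional argument could behave is sketched below. The function name `title_with_joiners` and the `joiners` parameter are hypothetical and are not an existing str.title() option:

```python
def title_with_joiners(text, joiners="'-"):
    # Hypothetical API: characters in `joiners` count as part of the current
    # word, so they do not trigger capitalization of the following letter.
    out = []
    in_word = False
    for ch in text:
        if ch.isalpha():
            out.append(ch.lower() if in_word else ch.upper())
            in_word = True
        else:
            out.append(ch)
            # A joiner keeps us inside a word only if we were already in one.
            in_word = in_word and ch in joiners
    return "".join(out)

print(title_with_joiners("they're bill's friends"))  # They're Bill's Friends
print(title_with_joiners("x-ray vision"))            # X-ray Vision
print(title_with_joiners("o'brien"))                 # O'brien
```

With an empty `joiners` string the sketch capitalizes after every non-letter, giving "X-Ray". Note that names such as O'Brien come out as "O'brien" once the apostrophe is treated as a joiner.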

A broader solution would be to replace string.capwords() with a more
sophisticated set of rules that generally match what people are really
trying to accomplish with title casing:  

   http://aitech.ac.jp/~ckelly/midi/help/caps.html

   http://search.cpan.org/dist/Text-Capitalize/Capitalize.pm

   Headline Style in the Chicago Manual of Style or
   Associated Press Stylebook:

   http://grammar.about.com/b/2008/04/11/rules-for-capitalizing-the-words-in-a-title.htm

Any such attempt at a broad solution needs to provide ways for users to
modify the list of exception words and options for quoted text.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7008
___



[issue7008] str.title() misbehaves with apostrophes

2009-09-28 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

Raymond, please refrain from emotional terms like "bug factory".

I have nothing to say about whether string.capwords() should be removed,
but I want to note that it does a split on whitespace and then rejoins
using a single space, so that string.capwords('A  B\tC\r\nD') returns
'A B C D'.
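The split-and-rejoin behaviour can be checked directly. The second call, which passes an explicit separator, is added here only for contrast:

```python
import string

s = 'A  B\tC\r\nD'

# Default: split on any run of whitespace, rejoin with single spaces.
collapsed = string.capwords(s)
print(repr(collapsed))           # 'A B C D'
print(len(s), len(collapsed))    # 9 7

# With an explicit separator, capwords splits and rejoins on that separator
# only, so a run of spaces survives.
preserved = string.capwords('a  b', ' ')
print(repr(preserved))           # 'A  B'
```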

The title() method exists primarily because the Unicode standard has a 
definition of title case.  I wouldn't want to change its default 
behavior because there is no reasonable behavior that isn't locale-
dependent, and Unicode methods shouldn't depend on locale; and even then 
it won't be perfect, as the O'Brien example shows.

Also note that .title() matches .istitle() in the sense that 
x.title().istitle() is supposed to be true (except in end cases like a 
string containing no letters).
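A quick check of that relationship, using made-up sample strings, also shows why a "smarter" result would stop satisfying istitle():

```python
samples = ["they're here", "o'brien", "hello world"]

# For strings containing letters, title() output passes istitle().
round_trip = [s.title().istitle() for s in samples]
print(round_trip)  # [True, True, True]

# A string with no letters is never considered title-cased.
no_letters = "123 !?".title().istitle()
print(no_letters)  # False

# The conventionally "correct" spelling fails istitle(), because a lowercase
# letter directly follows the (uncased) apostrophe.
natural = "They're Here".istitle()
print(natural)  # False
```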

I worry that providing an API that adds a way to specify a set of 
characters to be treated as letters (for the purpose of deciding where 
words start) will just make the bugs in apps harder to find because the 
examples are rarer (like l'Aperitif or O'Brien -- or RSVP for that 
matter).  With the current behavior at least app authors will easily
notice the problem, decide whether it matters to them, and implement
their own algorithm if it does.  And they are free to be as elaborate or
simplistic as they care to be.

What's a realistic use case for .title() anyway?

(Proposal: close as won't fix.)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7008
___



[issue6790] httplib and array do not play together well

2009-09-28 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Why do you need to give a non-empty body to the FakeSocket? 

Other than that, looks fine.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6790
___



[issue6972] zipfile.ZipFile overwrites files outside destination path

2009-09-28 Thread Thomas W. Barr

Changes by Thomas W. Barr t...@rice.edu:


--
nosy: +twb

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6972
___


