On Sun, 2007-07-22 at 17:44 +0200, Peter Kleiweg wrote:
> > It's a feature. See help(str.split): "If sep is not specified or is
> > None, any whitespace string is a separator."
>
> Define "any whitespace".

Any string for which isspace returns True.

> Why is it different in <type 'str'> and <type 'unicode'>?

>>> '\xa0'.isspace()
False
>>> u'\xa0'.isspace()
True

For byte strings, Python doesn't know whether 0xA0 is a whitespace
because it depends on the encoding whether the number 160 corresponds to
a whitespace character. For unicode strings, code point 160 is
unquestionably a whitespace, because it is a no-break SPACE.

> Why does split() split when it says NO-BREAK?

Precisely. It says NO-BREAK. It doesn't say NO-SPLIT.

--
Carsten Haese
http://informixdb.sourceforge.net
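
The byte-string versus unicode distinction described above can be sketched
as follows. This is only an illustration, and it assumes Python 3, where
bytes plays the role of the old <type 'str'> and str the role of the old
<type 'unicode'>. The sample data and variable names are made up for the
example.

```python
# Assumption: Python 3 (bytes ~ old str, str ~ old unicode).
raw = b"a\xa0b c"    # bytes: Python cannot know what 0xA0 means
text = "a\xa0b c"    # text: U+00A0 is NO-BREAK SPACE

# bytes.isspace() only recognises ASCII whitespace, so 0xA0 is not whitespace.
bytes_is_space = b"\xa0".isspace()
# str.isspace() consults the Unicode database, where U+00A0 is a space.
text_is_space = "\xa0".isspace()

# split() with no separator follows the same definitions.
bytes_parts = raw.split()
text_parts = text.split()

# The meaning of byte 0xA0 depends on the encoding chosen when decoding:
latin1_parts = raw.decode("latin-1").split()   # 0xA0 -> NO-BREAK SPACE
cp437_parts = raw.decode("cp437").split()      # 0xA0 -> a-acute, a letter

# To keep a no-break space inside a field, split on an explicit separator.
explicit_parts = text.split(" ")

print(bytes_is_space, text_is_space)
print(bytes_parts, text_parts)
print(latin1_parts, cp437_parts)
print(explicit_parts)
```

Passing an explicit separator such as " " is one way to keep fields that
contain a no-break space intact, since only that exact character is then
treated as a delimiter.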