On Oct 20, 2008, at 09:48, John Machin wrote:
> Based on my experience extracting data from innumerable csv files
> (and infinite varieties thereof), spreadsheet files, and database
> tables, in 99.99% of cases one should automatically apply the
> following transformations to each text field:
>  * strip leading whitespace
>  * strip trailing whitespace
>  * replace embedded runs of whitespace by a single space
> and one needs to ensure that the definition of whitespace includes
> the no-break space (NBSP) character.
>
> As this "space normalisation" is needed for all input sources, the
> csv module is IMHO the wrong place to put it. A string method would
> be a better idea.

Hm. It seems quite familiar, somehow...
You could certainly do the following (for each field)...
" ".join(field.split())
... but I seem to recall running across something that did this?
(Maybe I'm confusing it with some other issue, with the
string.capwords function versus str.title :)
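
For what it's worth, here is a minimal sketch of the idea, assuming
Python 3, where str.split() with no arguments splits on runs of Unicode
whitespace (NBSP, U+00A0, included). The name normalise_space is just
a hypothetical helper for illustration, not an existing string method:

```python
import string


def normalise_space(field):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    # With no separator argument, str.split() splits on runs of whitespace
    # and discards empty strings at both ends, so one join does all three steps.
    return " ".join(field.split())


# NBSP (U+00A0) counts as whitespace for Python 3 str objects.
print(repr(normalise_space("  foo\u00a0\u00a0 bar\t\nbaz  ")))  # 'foo bar baz'

# string.capwords() does the same split/join, but also capitalises each word.
print(repr(string.capwords("  foo\u00a0 bar  ")))  # 'Foo Bar'

# str.title() only changes case; the whitespace is left untouched.
print(repr("  foo  bar  ".title()))  # '  Foo  Bar  '
```

So string.capwords normalises spaces as a side effect, while str.title
does not, which may be where the sense of familiarity comes from.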
--
Magnus Lie Hetland
http://hetland.org