Re: how to avoid leading white spaces

Chris Torek Fri, 03 Jun 2011 14:53:24 -0700

>On 2011-06-03, ru...@yahoo.com <ru...@yahoo.com> wrote:
[prefers]
>>     re.split ('[ ,]', source)


This is probably not what you want in dealing with
human-created text:

    >>> re.split('[ ,]', 'foo  bar, spam,maps')
    ['foo', '', 'bar', '', 'spam', 'maps']

Instead, you probably want "a comma followed by zero or
more spaces; or, one or more spaces":

    >>> re.split(r',\s*|\s+', 'foo bar, spam,maps')
    ['foo', 'bar', 'spam', 'maps']

or perhaps (depending on how you want to treat multiple
adjacent commas) even this:

    >>> re.split(r',+\s*|\s+', 'foo bar, spam,maps,, eggs')
    ['foo', 'bar', 'spam', 'maps', 'eggs']

although eventually you might want to just give in and use the
csv module. :-)  (Especially if you want to be able to quote
commas, for instance.)
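
As a rough sketch of what "giving in" might look like: the helper
name split_quoted below is made up for illustration, and the
skipinitialspace=True setting is my assumption about how you would
want a space after a comma treated.  It only handles comma-separated
input, not the whitespace-only separators the regular expressions
above accept.

```python
import csv

def split_quoted(line):
    """Split one comma-separated line, honoring double-quoted fields.

    Hypothetical helper, not part of the csv module itself.
    """
    # csv.reader accepts any iterable of strings, so wrap the single
    # line in a list.  skipinitialspace=True drops spaces that follow
    # a comma, so a quote after ", " still opens a quoted field.
    reader = csv.reader([line], skipinitialspace=True)
    return next(reader)

print(split_quoted('foo, "spam, maps", eggs'))
# prints ['foo', 'spam, maps', 'eggs']
```

The quoted comma survives inside its field, which the split-based
patterns cannot do without growing considerably.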

>> ...  With regexes the code is likely to be less brittle than a
>> dozen or more lines of mixed string functions, indexes, and
>> conditionals.

In article <94svm4fe7...@mid.individual.net>
Neil Cerutti  <ne...@norwich.edu> wrote:
[lots of snippage]
>That is the opposite of my experience, but YMMV.

I suspect it depends on how familiar the user is with regular
expressions, their abilities, and their limitations.

People relatively new to REs always seem to want to use them
to count (to balance parentheses, for instance).  People who
have gone through the compiler course know better. :-)
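
To make the counting point concrete: a classic regular expression
has no memory for arbitrary nesting depth, so balance checking is
usually done with an explicit counter (or a stack, if there are
several bracket types).  A minimal sketch, with the function name
parens_balanced invented for the example:

```python
def parens_balanced(text):
    """Return True if every '(' in text has a matching later ')'.

    Hypothetical helper; it tracks only round parentheses.
    """
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:
                # A ')' appeared before its matching '('.
                return False
    # Balanced only if every '(' that was opened has been closed.
    return depth == 0

print(parens_balanced('f(a, (b + c))'))   # prints True
print(parens_balanced('f(a, (b + c)'))    # prints False
print(parens_balanced(')('))              # prints False
```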
-- 
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html

