Re: [Tutor] Regular expressions question

Albert-Jan Roskam Thu, 06 Dec 2012 01:55:28 -0800

_______________________________
>From: eryksun <eryk...@gmail.com>
>To: Ed Owens <eowens0...@gmx.com> 
>Cc: "tutor@python.org" <tutor@python.org> 
>Sent: Thursday, December 6, 2012 3:08 AM
>Subject: Re: [Tutor] Regular expressions question
>
>On Wed, Dec 5, 2012 at 7:13 PM, Ed Owens <eowens0...@gmx.com> wrote:
>>>>> str(string)
>> '[<div class="wx-timestamp">\n<div class="wx-subtitle wx-timestamp">Updated:
>> Dec 5, 2012, 5:08pm EST</div>\n</div>]'
>>>>> m = re.search('":\b(\w+\s+\d+,\s+\d+,\s+\d+:\d+.m\s+\w+)<', str(string))
>>>>> print m
>> None
>
>You need a raw string for the boundary marker \b (i.e., the boundary
>between \w and \W), else it creates a backspace control character.
>Also, I don't see why you have ": at the start of the expression. This
>works:
>
>    >>> s = 'Updated: Dec 5, 2012, 5:08pm EST</div>'
>    >>> m = re.search(r'\b(\w+\s+\d+,\s+\d+,\s+\d+:\d+.m\s+\w+)<', s)
>    >>> m.group(1)
>    'Dec 5, 2012, 5:08pm EST'
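
To make the raw-string point concrete, here is a small sketch (my own
illustration, not from the original messages; the names plain and raw are
just placeholders). Without the r prefix, Python turns \b into a single
backspace character before the re module ever sees it:

```python
import re

s = 'Updated: Dec 5, 2012, 5:08pm EST</div>'

plain = '\b'    # ordinary string: one character, backspace (0x08)
raw = r'\b'     # raw string: two characters, backslash + 'b'

print(len(plain))   # 1
print(len(raw))     # 2

# The non-raw pattern looks for a literal backspace before "Dec",
# which does not occur in s, so there is no match.
no_match = re.search('\bDec', s)
print(no_match)     # None

# The raw pattern hands \b to the regex engine as a word boundary.
match = re.search(r'\bDec', s)
print(match.group())    # Dec
```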


Lately I've started using named groups (after failing to understand some of my
own regexes that I wrote several months earlier).
The downside is that the regexes easily get quite long, but the re.VERBOSE flag
can make them more readable.
>>> m = re.search(r'\b(?P<date>\w+\s+\d+,\s+\d+,\s+\d+:\d+.m\s+\w+)<', s)
>>> m.group("date")
'Dec 5, 2012, 5:08pm EST'
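
For what it's worth, a minimal sketch of the same pattern spread out with
re.VERBOSE might look like this. The layout and the comments are only one
possible way to split it up, and the name pattern is my own; the regex
itself is unchanged. In verbose mode, whitespace in the pattern is ignored
and # starts a comment:

```python
import re

s = 'Updated: Dec 5, 2012, 5:08pm EST</div>'

pattern = re.compile(r'''
    \b                  # word boundary before the month name
    (?P<date>
        \w+ \s+ \d+ ,   # month and day, e.g. "Dec 5,"
        \s+ \d+ ,       # year, e.g. "2012,"
        \s+ \d+ : \d+   # time, e.g. "5:08"
        .m              # any character followed by "m", as in the original
        \s+ \w+         # time zone, e.g. "EST"
    )
    <                   # the opening of the closing tag
''', re.VERBOSE)

m = pattern.search(s)
print(m.group('date'))   # Dec 5, 2012, 5:08pm EST
```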

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
