Re: [Tutor] Question regular expressions - the non-greedy pattern

Walter Prins Mon, 21 Jan 2013 08:25:58 -0800

Hi,



On 21 January 2013 14:45, Marcin Mleczko <marcin.mlec...@onet.eu> wrote:

> Did I get the concept of non-greedy wrong or is this really a bug?


Hugo's already explained the essence of your problem, but just to
add/reiterate:

a) match() will match at the beginning of the string (first character) or
not at all.  As specified, your regex does in fact match from the first
character, as shown, so the result is correct.  (As an aside, "<html>" in
"<<html>" does not in fact match *from the beginning of the string*, so it
is beside the point for the match() call.)

b) Changing your regexp so that the body of the tag *cannot* contain "<",
and then using search() instead, will fix your specific case for you:

import re

s = '<<html><head><title>Title</title>'
tag_regex = r'<[^<]*?>'  # the tag body may not contain "<"

# match() only tries index 0; the second "<" cannot be part of the tag
# body, so there is no match at the start of s and this prints None
matchobj = re.match(tag_regex, s)
print("re.match() result:", matchobj)

# search() scans forward and finds the regex at index 1 of the string
matchobj = re.search(tag_regex, s)
print("re.search() result:")
print("span:", matchobj.span())
print("group:", matchobj.group())
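
To make point a) concrete, here is a minimal sketch of what non-greedy
does and does not do.  It assumes your original pattern was '<.*?>'; that
pattern is not quoted in this message, so treat it as my guess.  Non-greedy
only means "shortest match from a given starting position"; it does not
make the engine prefer a later starting position.

```python
import re

s = '<<html><head><title>Title</title>'
lazy_regex = r'<.*?>'  # assumed original pattern; not quoted in this message

# Non-greedy takes the shortest match *starting at index 0*, so the
# extra "<" ends up inside the result.
lazy_match = re.match(lazy_regex, s)
print(lazy_match.group())    # <<html>

# The greedy version, from the same start, runs on to the last ">".
greedy_match = re.match(r'<.*>', s)
print(greedy_match.group())  # the whole string

# search() also tries index 0 first, where the lazy pattern already matches.
print(re.search(lazy_regex, s).span())  # (0, 7)
```

So with that pattern, switching to search() alone would not help; it is
excluding "<" from the tag body that moves the match to index 1.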


Walter

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
