Bugs item #1116571, was opened at 2005-02-05 01:12
Message generated for change (Settings changed) made by effbot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1116571&group_id=5470

Category: Regular Expressions
Group: Python 2.4
>Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: rengel (engel_re)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: Wrong match with regex, non-greedy problem

Initial Comment:
# This is executable.
import re

# My test string is rather long:
tst = "In this <c:noun:ns>Buch</c:noun>, used to designate <c:noun:np>Dinge der Wirklichkeit</c:noun> rather than <c:noun:fs>SW</c:noun> <c:noun:ns>Ent</c:noun>."

# I want to match the last part of the string:
# <c:noun:fs>SW</c:noun> <c:noun:ns>Ent</c:noun>

# So I define the following pattern and compile it:
pat = r"<c:noun:(.*?)>(.*?)</c:noun> <c:noun:(.*?)>(.*?)</c:noun>"
rex = re.compile(pat)

# Then I search the string to get a match group:
mat = rex.search(tst)

# If found, print the group
if mat:
    print(mat.group())

# Instead of
# <c:noun:fs>SW</c:noun> <c:noun:ns>Ent</c:noun>
# I get the whole string starting with
# <c:noun:ns>Buch</c:noun>...
# up to the very last </c:noun>

# Apparently the non-greedy operator doesn't work correctly.
# What's wrong?

----------------------------------------------------------------------

Comment By: Fredrik Lundh (effbot)
Date: 2005-02-08 09:27

Message:
Logged In: YES
user_id=38376

Search returns the first (left-most) location where the pattern
matches, if any. The non-greedy operator only guarantees that
you get the shortest possible match at that location.

----------------------------------------------------------------------
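
The left-most rule described in the comment above can be checked with a short script. This is only a sketch: the names lazy and strict are made up for illustration, and the second pattern, which uses negated character classes so that a group cannot run across a tag boundary, is one possible workaround and not something proposed in this report.

```python
import re

tst = ("In this <c:noun:ns>Buch</c:noun>, used to designate "
       "<c:noun:np>Dinge der Wirklichkeit</c:noun> rather than "
       "<c:noun:fs>SW</c:noun> <c:noun:ns>Ent</c:noun>.")

# Pattern from the report: ".*?" is non-greedy, but it may still
# consume "<" and ">" when that is the only way to match from the
# left-most starting position.
lazy = re.compile(
    r"<c:noun:(.*?)>(.*?)</c:noun> <c:noun:(.*?)>(.*?)</c:noun>")

# Hypothetical alternative: [^>]* and [^<]* cannot cross a tag
# boundary, so the earlier starting positions fail and the search
# moves on to the adjacent pair at the end of the string.
strict = re.compile(
    r"<c:noun:([^>]*)>([^<]*)</c:noun> <c:noun:([^>]*)>([^<]*)</c:noun>")

lazy_match = lazy.search(tst)
strict_match = strict.search(tst)

# The lazy match starts at the first <c:noun:...> tag in the string.
print(lazy_match.start(), strict_match.start())
print(strict_match.group())
```

With the stricter pattern the four groups are the two tag suffixes and the two tag contents of the final pair. Under the original pattern the second group instead swallows everything from "Buch" up to "SW", because the search never gives up the first starting position once a match can be completed from there.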