I'm still shaky on some of sre's syntax. Here's the task: I've got strings (never longer than about a dozen characters) that are guaranteed to be made only of characters 'x' and '/'. In each string I want to find the longest continuous stretch of pairs whose first character is 'x' and the second is either mark. So if marks = '/xx/xxx///', the "winning" stretch begins at position 2 and is 6 characters long ('x/xxx/'), which requires finding a second match that overlaps the first match (which is just 'xx' in position 1). (When there are multiple "winning" stretches, I might want to adjudicate among them, but that's a separate problem.) I hope this is clear enough.
Here's the best I've come up with so far:
pat = sre.compile('(x[x/])+')
(longest, startlongest) = max([(fnd.end()-fnd.start(), fnd.start()) for i in range(len(marks))
for fnd in pat.finditer(marks,i)])
It's pretty simple to put re.search() into a loop where subsequent searches start from the character after where the previous one matched. Here is a solution that uses a general-purpose longest match function:
import re
# RE solution
def longestMatch(rx, s):
''' Find the longest match for rx in s.
Returns (start, length) for the match or (None, None) if no match found.
'''start = length = current = 0
while True:
m = rx.search(s, current)
if not m:
break mStart, mEnd = m.span()
current = mStart + 1 if (mEnd - mStart) > length:
start = mStart
length = mEnd - mStart if length:
return start, lengthreturn None, None
pairsRe = re.compile(r'(x[x/])+')
for s in [ '/xx/xxx///', '//////xx//' ]:
print s, longestMatch(pairsRe, s)
--
http://mail.python.org/mailman/listinfo/python-list
