"Tim Chase" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> >> >>> r = re.compile(r'(?:\([^\)]*\)|\[[^\]]*\]|\S)+')
> >> >>> r.findall(s)
> >>['(a c)b(c d)', 'e']
> >
> > Ah, it's exactly what I want!  I thought the left and right
> > sides of "|" are equal, but it is not true.
>
> In theory, they *should* be equal. I was baffled by the nonparity
> of the situation.  You *should* be able to swap the two sides of
> the "|" and have it treated the same.  Yet, when I tried it with
> the above regexp, putting the \S first, it seemed to choke and
> give different results.  I'd love to know why.
>
Does the re module try the alternatives left to right?  If so, then with \S
first it will eat the opening parens/brackets one character at a time, and
the engine never gets to the other alternatives.  \S matches any
non-whitespace character, so anywhere one of the other alternatives could
start, \S can match too; if the first alternative that succeeds wins, then
\S will always be the one matched.  My guess is that if you put \S first,
you will only get the whitespace-delimited runs of characters, regardless
of ()'s and []'s.  The expression might as well just be \S+.

Or I could be completely wrong...
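
A quick way to check the guess.  This is only a sketch: the original string
s isn't shown in this message, so the value below is my assumption,
reconstructed from the output quoted above, and the variable names are made
up for illustration.

```python
import re

# Assumed input: reconstructed from the quoted output, not the poster's actual s.
s = '(a c)b(c d) e'

# Original order: bracketed groups are tried before the catch-all \S.
grouped_first = re.compile(r'(?:\([^\)]*\)|\[[^\]]*\]|\S)+')

# Swapped order: \S is tried first at every position.
nonspace_first = re.compile(r'(?:\S|\([^\)]*\)|\[[^\]]*\])+')

print(grouped_first.findall(s))   # ['(a c)b(c d)', 'e']
print(nonspace_first.findall(s))  # ['(a', 'c)b(c', 'd)', 'e']
print(re.findall(r'\S+', s))      # ['(a', 'c)b(c', 'd)', 'e']
```

With \S first, the result on this input is the same as for plain \S+, which
is what you'd expect if the alternatives are tried in order and the first
one that succeeds is taken.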

-- Paul


-- 
http://mail.python.org/mailman/listinfo/python-list
