In article <[EMAIL PROTECTED]>, "Russell Blau" <[EMAIL PROTECTED]> wrote:
> "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> > David C. Ullrich schrieb:
> >> Say I want to replace 'disc' with 'disk', but only
> >> when 'disc' is a complete word (don't want to change
> >> 'discuss' to 'diskuss'.) The following seems almost
> >> right:
> >>
> >>   [^a-zA-Z])disc[^a-zA-Z]
> >>
> >> The problem is that that doesn't match if 'disc' is at
> >> the start or end of the string. Of course I could just
> >> combine a few re's with |, but it seems like there should
> >> (or might?) be a way to simply append a \A to the first
> >> [^a-zA-Z] and a \Z to the second.
> >
> > Why not
> >
> > ($|[\w])disc(^|[^\w])
> >
> > I hope \w is really the literal for whitespace - might be something
> > different, see the docs.
>
> No, \s is the literal for whitespace.
> http://www.python.org/doc/current/lib/re-syntax.html
>
> But how about:
>
> text = re.sub(r"\bdisc\b", "disk", text_to_be_changed)
>
> \b is the "word break" character,

Lovely - that's exactly right, thanks. I swear I looked at the
docs... I'm just blind or stupid. No wait, I'm blind _and_
stupid. No, blind and stupid and slow...

Doesn't precisely fit the _spec_ because of digits and underscores,
but it's close enough to solve the problem exactly. Thanks.

> it matches at the beginning or end of any
> "word" (where a word is any sequence of \w characters, and \w is any
> alphanumeric character or _).
>
> Note that this solution still doesn't catch "Disc" if it is capitalized.

Thanks. I didn't mention I wanted to catch both cases because I
already knew how to take care of that:

  r"\b[dD]isc\b"

> Russ

-- 
David C. Ullrich
-- 
http://mail.python.org/mailman/listinfo/python-list
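
For anyone who does need the letters-only spec (digits and
underscores treated as word boundaries), here is a minimal sketch
that sets the \b pattern from the thread beside a lookaround variant.
The lookaround pattern, the helper name fix_spelling, and the sample
string are illustrations only; nobody in the thread posted them. Both
patterns capture the first letter so that "Disc" becomes "Disk"
rather than "disk".

```python
import re

# \b version from the thread: digits and underscores count as word
# characters, so 'disc_1' and '2disc' are left alone.
WORD_BREAK = re.compile(r"\b([dD])isc\b")

# Hypothetical stricter variant (not from the thread): only ASCII letters
# count as part of a word, which is what the [^a-zA-Z] attempt was after.
# Lookarounds are zero-width, so they also succeed at the start and end
# of the string without needing \A or \Z.
LETTERS_ONLY = re.compile(r"(?<![a-zA-Z])([dD])isc(?![a-zA-Z])")


def fix_spelling(pattern, text):
    # \1 puts back whichever of 'd' or 'D' was matched, preserving case.
    return pattern.sub(r"\1isk", text)


sample = "disc Disc discuss disc_1 2disc"
by_word_break = fix_spelling(WORD_BREAK, sample)
by_letters_only = fix_spelling(LETTERS_ONLY, sample)
```

On the sample string the two patterns should differ only on 'disc_1'
and '2disc', which the lookaround version changes and the \b version
leaves untouched.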