Re: re

David C. Ullrich Wed, 04 Jun 2008 14:43:12 -0700

In article <[EMAIL PROTECTED]>,
 "Russell Blau" <[EMAIL PROTECTED]> wrote:


> "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
> > David C. Ullrich schrieb:
> >> Say I want to replace 'disc' with 'disk', but only
> >> when 'disc' is a complete word (don't want to change
> >> 'discuss' to 'diskuss'.) The following seems almost
> >> right:
> >>
> >>   [^a-zA-Z]disc[^a-zA-Z]
> >>
> >> The problem is that that doesn't match if 'disc' is at
> >> the start or end of the string. Of course I could just
> >> combine a few re's with |, but it seems like there should
> >> (or might?) be a way to simply append a \A to the first
> >> [^a-zA-Z] and a \Z to the second.
> >
> > Why not
> >
> > ($|[\w])disc(^|[^\w])
> >
> > I hope \w is really the literal for whitespace - might be something 
> > different, see the docs.
> 
> No, \s is the literal for whitespace. 
> http://www.python.org/doc/current/lib/re-syntax.html
> 
> But how about:
> 
> text = re.sub(r"\bdisc\b", "disk", text_to_be_changed)
> 
> \b is the "word break" character, 

Lovely - that's exactly right, thanks. I swear I looked at the
docs... I'm just blind or stupid. No wait, I'm blind _and_
stupid. No, blind and stupid and slow...

Doesn't precisely fit the _spec_ because of digits and underscores,
but it's close enough to solve the problem exactly. Thanks.
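
A minimal sketch of that difference, assuming the letters-only spec stated
at the top of the thread. The lookaround pattern and the sample strings are
illustrative assumptions, not something posted in the thread:

```python
import re

# \b counts digits and underscores as word characters, so "disc2" and
# "my_disc" are left untouched by this pattern.
word_boundary = re.compile(r"\bdisc\b")

# Letters-only variant (an assumption, not from the thread): the
# lookarounds assert that no ASCII letter sits on either side, and they
# also succeed at the very start and end of the string.
letters_only = re.compile(r"(?<![a-zA-Z])disc(?![a-zA-Z])")

samples = ["disc", "a disc here", "discuss", "disc2", "my_disc"]
for s in samples:
    # Show the input next to the result of each substitution.
    print(repr(s), word_boundary.sub("disk", s), letters_only.sub("disk", s))
```

The two patterns agree on ordinary text and differ only where 'disc' sits
next to a digit or an underscore.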

> it matches at the beginning or end of any "word" (where a word is any
> sequence of \w characters, and \w is any alphanumeric character or _).
> 
> Note that this solution still doesn't catch "Disc" if it is capitalized.

Thanks. I didn't mention I wanted to catch both cases because I
already knew how to take care of that:

r"\b[dD]isc\b"
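
One caveat with that pattern: replacing every match with a fixed 'disk'
would lowercase 'Disc'. A possible sketch that keeps the original
capitalization by capturing the first letter; the function name
disc_to_disk is hypothetical and the replacement string is an assumption,
since only the pattern was given above:

```python
import re

def disc_to_disk(text):
    # Capture the first letter and reuse it in the replacement, so that
    # "Disc" becomes "Disk" and "disc" becomes "disk".
    return re.sub(r"\b([dD])isc\b", r"\1isk", text)

print(disc_to_disk("Disc and disc, but discuss"))
```

All-caps 'DISC' is deliberately left alone here, matching the [dD]
character class rather than a full case-insensitive search.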

> Russ

-- 
David C. Ullrich
--
http://mail.python.org/mailman/listinfo/python-list
