In article <[EMAIL PROTECTED]>, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> David C. Ullrich wrote:
> > Actually using regular expressions for the first
> > time. Is there something that allows you to take the
> > union of two character sets, or append a character to
> > a character set?
> >
> > Say I want to replace 'disc' with 'disk', but only
> > when 'disc' is a complete word (don't want to change
> > 'discuss' to 'diskuss'.) The following seems almost
> > right:
> >
> > [^a-zA-Z]disc[^a-zA-Z]
> >
> > The problem is that that doesn't match if 'disc' is at
> > the start or end of the string. Of course I could just
> > combine a few re's with |, but it seems like there should
> > (or might?) be a way to simply append a \A to the first
> > [^a-zA-Z] and a \Z to the second.
>
> Why not
>
> ($|[\w])disc(^|[^\w])
>
> I hope \w is really the literal for whitespace - might be something
> different, see the docs.

Thanks, but I don't follow that at all.

Whitespace is actually \s. But [\s]disc[whatever]
doesn't do the job - then it won't match "(disc)",
which counts as "disc appearing as a full word".

Also I think you have ^ and $ backwards, and there's
a ^ I don't understand. I _think_ that a correct version
of what you're suggesting would be

(^|[^a-zA-Z])disc($|[^a-zA-Z])

But as far as I can see that simply doesn't work.
I haven't been able to use | that way, combining
_parts_ of a re. That was the first thing I tried.
The original works right except for not matching at
the start or end of a string; the thing with the
| doesn't work at all:

>>> test = compile(r'(^|[^a-zA-Z])disc($|[^a-zA-Z])')
>>> test.findall('')
[]
>>> test.findall('disc')
[('', '')]
>>> test.findall(' disc ')
[(' ', ' ')]
>>> disc = compile(r'[^a-zA-Z]disc[^a-zA-Z]')
>>> disc.findall(' disc disc disc')
[' disc ']
>>> disc.findall(' disc disc disc')
[' disc ', ' disc ']
>>> test.findall(' disc disc disc')
[(' ', ' '), (' ', ' ')]
>>> disc.findall(' disc disc disc')
[' disc ', ' disc ']
>>> disc.findall(' disc disc disc ')
[' disc ', ' disc ', ' disc ']

> Diez

--
David C. Ullrich
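
A possible reading of the findall output in the transcript above: when a pattern contains capturing groups, findall returns the contents of the groups rather than the whole match, so [('', '')] for 'disc' would be a successful match with an empty string captured at each end. Below is a minimal sketch, assuming the only goal is the whole-word replacement described. The names edges, whole_word and disc_to_disk are made up for illustration and are not from the thread.

```python
import re

# Same alternation as in the transcript, but with non-capturing groups,
# so findall returns the matched text instead of a tuple of group contents.
edges = re.compile(r'(?:^|[^a-zA-Z])disc(?:$|[^a-zA-Z])')

# Lookarounds assert "no ASCII letter on either side" without consuming
# the neighbouring character, so back-to-back occurrences are all found
# and the start and end of the string need no special case.
whole_word = re.compile(r'(?<![a-zA-Z])disc(?![a-zA-Z])')


def disc_to_disk(text):
    """Replace 'disc' with 'disk' only where it stands alone as a word."""
    return whole_word.sub('disk', text)


print(edges.findall('disc'))
print(whole_word.findall('disc disc (disc) discuss'))
print(disc_to_disk('disc disc (disc) discuss'))
```

The lookaround version is used for the substitution because the neighbouring characters are not part of the match, so sub() leaves them alone and nothing has to be put back with backreferences.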