> -----Original Message-----
> From: Ben Siders [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 16, 2003 7:38 AM
> To: Perl
> Subject: Escaping Ampersands in XML
>
> I've got a real easy one here (in theory). I have some XML files that
> were generated by a program, but generated imperfectly. There's some
> naked ampersands that need to be converted to &amp;. I need a regexp
> that will detect them and change them. Sounds easy enough.
>
> The pattern I want to match is an ampersand that is NOT immediately
> followed by a few characters and then a semicolon. Any ideas?
>
> This is the best I've come up with so far. It should match an ampersand
> whose following characters, up to five, are not semicolons. I don't
> feel that this is a great solution. I'm hoping the community can think
> of a better one.
>
> $line =~ s/\&[^;]{,5}/\&amp;/g;

Try this one:

s/&(?!\w+;)/&amp;/g

> I'm hoping that'll match something like: "<tag>Blah data &</tag>", but
> NOT match "<tag>Blah &amp;</tag>".
>
> I'm not sure if I'm on the right track here. I also can't match other
> escaped characters such as: "<tag>Copyright &copy; 2003</tag>".
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
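
For what it's worth, here is a slightly fuller sketch of the same negative-lookahead idea, run against the three sample strings from the question. The sub name escape_bare_ampersands is made up for illustration, and the extra alternatives for numeric references such as &#169; and &#xA9; are my own assumption about what the files might contain; a plain \w+ would not skip those, because "#" is not a word character.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original thread): escape every '&'
# that does not already begin an entity or character reference.
sub escape_bare_ampersands {
    my ($text) = @_;

    # The negative lookahead leaves the '&' alone when it is followed by
    #   a named entity      such as  amp;  or  copy;
    #   a decimal reference such as  #169;
    #   a hex reference     such as  #xA9;
    # Anything else is treated as a naked ampersand and escaped.
    $text =~ s/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;

    return $text;
}

# Sample lines taken from the question, plus one assumed extra case.
for my $line (
    '<tag>Blah data &</tag>',
    '<tag>Blah &amp;</tag>',
    '<tag>Copyright &copy; 2003</tag>',
    '<tag>AT&T &#169; 2003</tag>',
) {
    print escape_bare_ampersands($line), "\n";
}
```

Because the lookahead skips anything that already looks like a reference, running the substitution a second time over the same text should change nothing. The flip side is that it cannot tell a real entity from text that merely looks like one: something like "R&D;" would be left untouched, so this is a cleanup heuristic rather than a substitute for running the result through an XML parser.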