On Wed, 18 Jun 2003 21:31:21 +0200, allan juul wrote:
>while (<FILE>) {
> s/<(word1|word2|etc|word12)>/<newword="$1">/g;
> s/<\/(word1|word2|etc|word12)>/<\/newword>/g;
> print NEWFILE;
>}
>
>
>this works fine but is very, very slow, I'm sure because of the alternation
>in the regexp.
>anyway, does anyone have a better/faster approach?

The CPAN module Regex::PreSuf, see
<http://search.cpan.org/author/JHI/Regex-PreSuf-1.15/PreSuf.pm>, can
build an equivalent, and hopefully faster, regex from your word list by
factoring out common prefixes and suffixes. It's worth a try. First you
generate a string containing the regex from the word list:

    use Regex::PreSuf;
    my $re = presuf(qw(word1 word2 etc word12));

and next you use it in your substitutions like this (the /o makes Perl
compile each pattern only once):

    s/<($re)>/<newword="$1">/og;
    s/<\/$re>/<\/newword>/og;
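
If you want to try the overall shape without installing anything, here
is a minimal sketch that runs on a stock perl. It does NOT use
Regex::PreSuf: a plain alternation compiled once with qr// stands in for
the presuf() result, and the helper name rewrite_line and the sample
input are made up for illustration.

```perl
use strict;
use warnings;

# Word list from the question; any list of literal tag names would do.
my @words = qw(word1 word2 etc word12);

# Stand-in for presuf(): a plain alternation of the quoted words.
my $alt = join '|', map { quotemeta($_) } @words;

# Compile both patterns once, outside the loop.
my $open  = qr/<($alt)>/;
my $close = qr/<\/(?:$alt)>/;

# Hypothetical helper: apply both substitutions to one line.
sub rewrite_line {
    my ($line) = @_;
    $line =~ s/$open/<newword="$1">/g;
    $line =~ s/$close/<\/newword>/g;
    return $line;
}

# Made-up sample input; tags not in the word list are left alone.
print rewrite_line("<word12>text</word12> <other>x</other>\n");
```

To try the module instead, replacing the join line with a call to
presuf(@words) should be the only change needed.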

You might get some extra gain by adding a lookahead in front of your
alternatives, so that the regex engine can reject most positions after
checking a single character:

    /(?=[fb])(foo|bar)/

might be faster than plain

    /(foo|bar)/

Maybe.
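
Since that is only a maybe, it is easy to measure with the core
Benchmark module. This is a rough sketch: the sample text and the helper
count_matches are invented, and the outcome will depend on your perl
version and on your real data, so treat the numbers as a hint only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample text; substitute a chunk of your real input.
my $text = join ' ', ('lorem ipsum foo dolor bar sit amet') x 1000;

my $plain     = qr/(foo|bar)/;
my $lookahead = qr/(?=[fb])(foo|bar)/;

# Hypothetical helper: count all matches of a pattern in $text.
sub count_matches {
    my ($re) = @_;
    my $n = 0;
    $n++ while $text =~ /$re/g;
    return $n;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    plain     => sub { count_matches($plain) },
    lookahead => sub { count_matches($lookahead) },
});
```

Both patterns must find the same matches; only the speed may differ.
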
HTH,
Bart.