On Mon, Jan 21, 2002 at 05:45:55PM +0100, [EMAIL PROTECTED] wrote:
> On Mon, Jan 21, 2002 at 03:48:58PM +0000, Robin Houston wrote:
> > On Mon, Jan 21, 2002 at 04:32:59PM +0100, [EMAIL PROTECTED] wrote:
> > > You are right, I had forgotten a case, [...]
> > > This results in (a sloooow program):
> >
> > There's something wrong here. Your regex matches "abcb", which isn't
> > shrinkable.
>
>
> Duh! I was writing (...)+ where I should have written (...)\1*.
>
> Here's a corrected version (which makes for a much faster regex):
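
To spell out the (...)+ versus (...)\1* mix-up quoted above, here is a
minimal sketch. The group (.b) is a toy chosen for illustration only; it
is not the group from the earlier regex. With + the group is re-matched
on every iteration and may pick up different text each time, whereas \1*
only lets the text captured the first time repeat.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical toy patterns, not the ones from the earlier message.
my $repeat_group = qr /^(.b)+$/;     # Group may match new text each time.
my $repeat_text  = qr /^(.b)\1*$/;   # \1* repeats the captured text itself.

foreach my $str (qw /abab abcb/) {
    printf "%s: (...)+ %s, (...)\\1* %s\n", $str,
        ($str =~ $repeat_group ? "matches" : "fails"),
        ($str =~ $repeat_text  ? "matches" : "fails");
}
```

Both patterns accept "abab", but only the (...)+ form accepts "abcb",
because its second iteration is free to match "cb" after the first one
matched "ab".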

On reflection, it appears my second case is a special case of the third.
This makes for a smaller regex:

#!/usr/bin/perl
use strict;
use warnings qw /all/;
my @strings;
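my $p;              # Number of capture groups emitted so far.
# Use a higher number for Unicode; code points above 0xFF would also
# need the \x{...} escape form instead of the two-digit \x%02x used below.
my $max = 255;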
foreach my $c (1 .. $max) {
    # Strings of the form: XbYXaZ, a lt b.
    push @strings => sprintf '(.*)\x%02x.*\%d[\x00-\x%02x].*' =>
                              $c, ++ $p, $c - 1;

    # Strings of the form: (XaY)+XbZXaY, a lt b.
    push @strings => sprintf '((.*)[\x00-\x%02x].*)\%d*\%d\x%02x.*\%d' =>
                              $c - 1, $p + 1, $p + 2, $c, $p + 1;

    $p += 2;
}
my $regex = join "|\n " => map {"(?:$_)"} @strings;
$regex = "^(?:$regex)\$";
print "/$regex/sx\n";
__END__
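
To try the result, the generated pattern can be compiled with qr instead
of being printed. The sketch below assumes a wrapper named build_regex (a
made-up name; its body is the loop from the script above) and a few sample
strings picked for illustration. "ba" has the form XbYXaZ with X, Y and Z
empty, "aba" has the form (XaY)+XbZXaY with X, Y and Z empty, and "abcb"
is the string Robin pointed out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# build_regex is a hypothetical helper; it repeats the generator loop
# and returns a compiled pattern rather than printing its source.
sub build_regex {
    my $max = shift;
    my @strings;
    my $p = 0;          # Number of capture groups emitted so far.
    foreach my $c (1 .. $max) {
        # Strings of the form: XbYXaZ, a lt b.
        push @strings => sprintf '(.*)\x%02x.*\%d[\x00-\x%02x].*' =>
                                  $c, ++ $p, $c - 1;
        # Strings of the form: (XaY)+XbZXaY, a lt b.
        push @strings => sprintf '((.*)[\x00-\x%02x].*)\%d*\%d\x%02x.*\%d' =>
                                  $c - 1, $p + 1, $p + 2, $c, $p + 1;
        $p += 2;
    }
    my $regex = join "|\n " => map {"(?:$_)"} @strings;
    return qr /^(?:$regex)$/sx;
}

my $re = build_regex (255);

# Sample inputs chosen for illustration only.
foreach my $str (qw /ba aba abcb abc/) {
    printf "%-4s %s\n", $str, ($str =~ $re ? "matches" : "no match");
}
```

With these inputs "ba" and "aba" match, while "abcb" and "abc" do not.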
Abigail