On Wed, 21 Jul 2004 19:11:01 +0200, Jonas Eckerman wrote:

Replying to myself...
 
>  Have you tried using "Regexp::List", "Regexp::Optimizer" or

Just found another one that might be worth checking out:
Perl-compatible regular expression optimizer
http://bisqwit.iki.fi/source/regexopt.html

It's a simple command-line program that optimizes regular expressions. A quick test:

/\bhomeg(?:ain\.com|ain\.biz|ain\.net|un\.com)\b/i
became
/bhomeg(?:(?:ai|u)n\.com|ain\.(?:biz|net))b/i

/\bhomel(?:oanace\.com|andunited\.com|anddefensejournal\.com|anddefenseradio\.com|andsecurityresearch\.com|ead\.net|essprelates\.com|essteens\.com)\b/i
became
/bhomel(?:ead\.net|(?:oanace|ess(?:prelate|teen)s|an(?:d(?:united|securityresearch)|ddefense(?:journal|radio)))\.com)b/i
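
A rewrite like this can be sanity-checked by running the original and the optimized expression over a handful of strings and looking for disagreements. Below is a minimal Python sketch of that check. Two assumptions: the output above shows a bare "b" where the input had \b, and the sketch assumes \b is what is meant; the sample strings are made up for illustration and are not taken from bigevil.

```python
import re

FLAGS = re.IGNORECASE

# (original, optimized) pairs from the test above. Assumption: the
# optimized side uses \b where the posted output shows a bare "b".
PAIRS = [
    (r"\bhomeg(?:ain\.com|ain\.biz|ain\.net|un\.com)\b",
     r"\bhomeg(?:(?:ai|u)n\.com|ain\.(?:biz|net))\b"),
    (r"\bhomel(?:oanace\.com|andunited\.com|anddefensejournal\.com"
     r"|anddefenseradio\.com|andsecurityresearch\.com|ead\.net"
     r"|essprelates\.com|essteens\.com)\b",
     r"\bhomel(?:ead\.net|(?:oanace|ess(?:prelate|teen)s"
     r"|an(?:d(?:united|securityresearch)|ddefense(?:journal|radio)))\.com)\b"),
]

# Hypothetical sample strings: every listed domain plus a few near misses.
SAMPLES = [
    "homegain.com", "homegain.biz", "homegain.net", "homegun.com",
    "homeloanace.com", "homelandunited.com", "homelanddefensejournal.com",
    "homelanddefenseradio.com", "homelandsecurityresearch.com",
    "homelead.net", "homelessprelates.com", "homelessteens.com",
    "HOMEGAIN.COM", "visit homegun.com now",
    "homegun.net", "homegain.org", "xhomegain.com", "homegain.comx",
    "homelead.com", "homeloanace.net", "homelandsecurity.com",
]

def disagreements(original, optimized, texts):
    """Return the texts on which the two patterns give different answers."""
    a = re.compile(original, FLAGS)
    b = re.compile(optimized, FLAGS)
    return [t for t in texts if bool(a.search(t)) != bool(b.search(t))]

for original, optimized in PAIRS:
    # An empty list means the two patterns agree on every sample.
    print(disagreements(original, optimized, SAMPLES))
```

An empty list only shows agreement on these samples; it is not a proof that the two expressions are equivalent.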

Maybe worth trying a script using this...

If I have more time, I might try feeding the expressions currently in bigevil 
through this one and "Regexp::Optimizer" to see what they might give me.
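
For comparison, here is a minimal Python sketch of the simplest form of this kind of optimization: factoring out common prefixes by building a trie from a plain list of domains. This is not how regexopt or "Regexp::Optimizer" work internally (I have not checked), and it does not merge common suffixes the way the output above does. The function names and the domain list are assumptions made for illustration.

```python
import re

def build_trie(words):
    """Build a nested-dict trie; the key "" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root

def trie_to_regex(node):
    """Turn a trie node into a regex fragment with shared prefixes factored."""
    if list(node) == [""]:
        return ""  # nothing follows the end of a word
    ends_here = "" in node
    alts = [re.escape(ch) + trie_to_regex(child)
            for ch, child in sorted(node.items()) if ch != ""]
    if len(alts) == 1 and not ends_here:
        return alts[0]  # single continuation, no group needed
    group = "(?:" + "|".join(alts) + ")"
    # If a word ends at this node, whatever follows is optional.
    return group + "?" if ends_here else group

# Hypothetical input: a plain list of domains, one per entry.
domains = ["homegain.com", "homegain.biz", "homegain.net", "homegun.com"]

fragment = trie_to_regex(build_trie(domains))
pattern = re.compile(r"\b" + fragment + r"\b", re.IGNORECASE)
print(fragment)
```

Whether a factored expression like this actually runs faster than the flat alternation depends on the regex engine, so it would need to be measured rather than assumed.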

What does the source data for bigevil look like? Is it just one long list of 
domains?

Regards
/Jonas

--
Jonas Eckerman, [EMAIL PROTECTED]
http://www.fsdb.org/
