Re: A possibly suspect idea

2010-03-12 Thread Martin Gregorie
On Fri, 2010-03-12 at 08:15 +0200, Henrik K wrote: Why don't you simply maintain your wordlists in some files and use a script to generate portmanteau.cf? You could use Regexp::Assemble module to optimize also. Who cares what the actual rules look like? The more words (simple alternations)
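
A minimal sketch of the generator idea quoted above, in Python rather than the Perl that Regexp::Assemble would imply. It is not anyone's actual script from this thread. The function name build_rules, the per_rule chunk size, the description text and the closing meta line are all assumptions made for illustration; only the describe/body layout and the __PM sub-rule naming come from the original post.

```python
import re

def build_rules(words, name="PORTMANTEAU", prefix="__PM", per_rule=50,
                description="Generated from wordlist"):
    """Return the lines of a SpamAssassin .cf fragment for a word list.

    Assumption: words are plain text without '/' characters, since '/'
    is used as the regex delimiter in the generated body rules.
    """
    # The rules are case-insensitive (/i), so deduplicate on lower case.
    unique = sorted({w.strip().lower() for w in words if w.strip()})
    lines = ["describe {} {}".format(name, description)]
    subrules = []
    # Split the list into chunks so no single regex grows too unwieldy.
    for i in range(0, len(unique), per_rule):
        chunk = unique[i:i + per_rule]
        sub = "{}{}".format(prefix, len(subrules) + 1)
        alternation = "|".join(re.escape(w) for w in chunk)
        lines.append("body {} /({})/i".format(sub, alternation))
        subrules.append(sub)
    # Hypothetical meta rule: fire if any sub-rule matched.
    if subrules:
        lines.append("meta {} ({})".format(name, " || ".join(subrules)))
    return lines
```

Writing "\n".join(build_rules(words)) to portmanteau.cf after reading the words from a file would give the workflow Henrik describes: the wordlist is the thing you maintain, and the rule file is throwaway output.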

Re: A possibly suspect idea

2010-03-12 Thread Bowie Bailey
Martin Gregorie wrote: On Fri, 2010-03-12 at 08:15 +0200, Henrik K wrote: Why don't you simply maintain your wordlists in some files and use a script to generate portmanteau.cf? You could use Regexp::Assemble module to optimize also. Who cares what the actual rules look like? The more

Re: A possibly suspect idea

2010-03-12 Thread Henrik K
On Fri, Mar 12, 2010 at 01:52:01PM +, Martin Gregorie wrote: On Fri, 2010-03-12 at 08:15 +0200, Henrik K wrote: Why don't you simply maintain your wordlists in some files and use a script to generate portmanteau.cf? You could use Regexp::Assemble module to optimize also. Who cares

Re: A possibly suspect idea

2010-03-12 Thread RW
On Thu, 11 Mar 2010 20:11:37 + Martin Gregorie mar...@gregorie.org wrote: - am I right about all regexes in a portmanteau rule being applied to every message? I would presume not and that meta-rules short-circuit the way that logical expressions do in perl. It shouldn't make much

Re: A possibly suspect idea

2010-03-12 Thread d . hill
Quoting Bowie Bailey bowie_bai...@buc.com: Martin Gregorie wrote: On Fri, 2010-03-12 at 08:15 +0200, Henrik K wrote: Why don't you simply maintain your wordlists in some files and use a script to generate portmanteau.cf? You could use Regexp::Assemble module to optimize also. Who cares what

Re: A possibly suspect idea

2010-03-12 Thread Martin Gregorie
On Fri, 2010-03-12 at 16:27 +0200, Henrik K wrote: If you have enough words to require multiple REs, then sorting doesn't hurt. So the start boundaries for a single RE to catch on are minimized. OK, so there are benefits if every alternate in a regex starts with the same letter? Almost
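
To make the sorting point concrete, here is a rough Python sketch (an illustration only, not code from the thread) that sorts a word list, groups it by first letter, and factors that letter out of each alternation, so each regex has exactly one possible starting character. The function name group_by_first_letter is hypothetical, and whether this grouping gives a measurable speed-up is not something the truncated snippets settle.

```python
import re
from itertools import groupby

def group_by_first_letter(words):
    """Map each first letter to one regex covering all words starting with it."""
    # Lower-case and deduplicate, assuming the rules are applied with /i.
    unique = sorted({w.strip().lower() for w in words if w.strip()})
    patterns = {}
    # groupby needs sorted input; sorting on the whole word also sorts
    # on the first letter, so each letter forms one contiguous group.
    for first, group in groupby(unique, key=lambda w: w[0]):
        tails = [re.escape(w[1:]) for w in group]
        # Factor out the shared first letter: v(?:iagra|icodin)
        patterns[first] = re.escape(first) + "(?:" + "|".join(tails) + ")"
    return patterns
```

Each value could then become the pattern of one body sub-rule, which is roughly the one-RE-per-starting-letter layout being discussed.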

Re: A possibly suspect idea

2010-03-12 Thread Bowie Bailey
Martin Gregorie wrote: On Fri, 2010-03-12 at 16:27 +0200, Henrik K wrote: If you have enough words to require multiple REs, then sorting doesn't hurt. So the start boundaries for a single RE to catch on are minimized. OK, so there are benefits if every alternate in a regex starts

A possibly suspect idea

2010-03-11 Thread Martin Gregorie
Earlier today I mentioned that I have a number of portmanteau rules that fire on misspelt words in body text, etc. These are all structured along the lines of:

  describe PORTMANTEAU Example of a somewhat unwieldy rule
  body __PM1 /(word1|worrd2|wooord3|)/i
  body __PM2

Re: A possibly suspect idea

2010-03-11 Thread Henrik K
On Thu, Mar 11, 2010 at 08:11:37PM +, Martin Gregorie wrote: Earlier today I mentioned that I have a number of portmanteau rules that fire on misspelt words in body text, etc. These are all structured along the lines of:

  describe PORTMANTEAU Example of a somewhat unwieldy rule
  body