From: "Keith C. Ivey" <[EMAIL PROTECTED]>

> Paul Barbeau <[EMAIL PROTECTED]> wrote:
>
> > rawbody     HN_WORDWORD10
> > /(?:\b(?!=:q(?:from|even|more|this|that|were|with)\b)[a-z]{4,12}
> > [.,:;'!?-]?\ s+){10}/ describe    HN_WORDWORD10  LOCAL: string
> > of 10+ random words score       HN_WORDWORD10  .5
>
> What's that ":q" doing in there?  Looks like something got
> garbled somewhere along the line.
>
> Also (and this affects the other rules you posted), the syntax
> for a negative lookahead is "(?!whatever)", not
> "(?!=whatever)".  The extraneous equals sign means you're
> really excluding those short words only if they've got an
> equals sign in front of them (or in this case a "=:q"), but
> that can't happen anyway because the main match requires a
> lowercase letter in that position, so the negative lookahead is
> doing nothing.
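
To make Keith's point concrete, here is a small sketch in Python rather than Perl (the lookahead syntax is the same in both for these patterns). The names STOP, typo and fixed are mine, purely for illustration, and the stop-word list is the one from the corrected rules in this message:

```python
import re

STOP = r"(?:from|that|have|this|were|with)"

# Typo form: "(?!=...)" forbids a literal "=" followed by a stop word.
# The next character must be [a-z], so this lookahead can never reject.
typo = re.compile(r"\b(?!=" + STOP + r"\b)[a-z]{4,12}\s+")

# Intended form: "(?!...)" rejects a stop word at this position.
fixed = re.compile(r"\b(?!" + STOP + r"\b)[a-z]{4,12}\s+")

sample = "from "
typo_hits = typo.match(sample) is not None    # stop word slips through
fixed_hits = fixed.match(sample) is not None  # stop word is rejected
print(typo_hits, fixed_hits)  # True False
```

With the extra equals sign the common word is counted like any other, so the exclusion list has no effect.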

FWIW, here are two working versions of that rule, each expanded into a set of
three rules with escalating scores:

# match Bayes-poison lists of lowercase words without articles or common
# prepositions

body  PT_WORDLIST_10  /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){10}/
describe PT_WORDLIST_10        string of 10+ random words
score  PT_WORDLIST_10  1.0

body  PT_WORDLIST_13  /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){13}/
describe PT_WORDLIST_13        string of 13+ random words
score  PT_WORDLIST_13  3.0

body  PT_WORDLIST_30  /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){30}/
describe PT_WORDLIST_30        string of 30+ random words
score  PT_WORDLIST_30  10.0

# match Bayes-poison lists of lowercase words without articles or common
# prepositions, ignoring punctuation.

body  XX_WORDLIST_10  /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z\.\,\-\;]{4,18}\s+){10}/
describe XX_WORDLIST_10 string of 10+ random words
score  XX_WORDLIST_10   1.0

body  XX_WORDLIST_13  /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z\.\,\-\;]{4,18}\s+){13}/
describe XX_WORDLIST_13 string of 13+ random words
score  XX_WORDLIST_13   3.0

body  XX_WORDLIST_30  /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z\.\,\-\;]{4,18}\s+){30}/
describe XX_WORDLIST_30 string of 30+ random words
score  XX_WORDLIST_30   10.0
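
If you want to see how the escalating thresholds behave before loading the rules, the following is a rough Python sketch of the three PT_ rules. It is only an approximation: it uses Python's re module rather than Perl, it does not model how SpamAssassin renders the body before matching, and the names WORD, RULES and matched are my own. It relies on the fact that SpamAssassin adds up the scores of every rule that hits.

```python
import re

# One "word" unit from the PT_ rules: a lowercase word of 4-12 letters
# that is not one of the excluded common words, followed by whitespace.
WORD = r"(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+)"

# (rule name, required run length, score) copied from the PT_ rules.
RULES = [
    ("PT_WORDLIST_10", 10, 1.0),
    ("PT_WORDLIST_13", 13, 3.0),
    ("PT_WORDLIST_30", 30, 10.0),
]

def matched(text):
    """Return (name, score) for every rule whose pattern is found in text."""
    return [(name, score) for name, n, score in RULES
            if re.search(WORD + "{%d}" % n, text)]

poison = "lorem " * 14                       # 14 filler words in a row
normal = "this is a short note from me with love "

poison_hits = matched(poison)
normal_hits = matched(normal)
poison_total = sum(score for _, score in poison_hits)

print(poison_hits)   # the 10- and 13-word rules hit, the 30-word rule does not
print(poison_total)  # 4.0
print(normal_hits)   # []
```

A run of 14 words trips both the 10 and 13 rules for a combined 4.0, and a run of 30 or more would trip all three for 14.0, so the scores stack rather than replace one another.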

Loren put them together.
{^_^}
