On 5/22/14 6:48 PM, Karsten Bräckelmann guent...@rudersport.de wrote:
On Thu, 2014-05-22 at 18:34 -0500, David B Funk wrote:
After doing some experimenting with that code I came up with something that
I'd argue is more semantically correct:
# if we've got a long series of blank lines,
On Thu, 2014-05-22 at 03:12 +0200, Karsten Bräckelmann wrote:
In either case, having a sample would speed up this ping-pong style
debugging. And I am curious. ;) Mind putting your sample up a pastebin?
Ian sent me the original message off-list. It indeed contains about 16
consecutive newlines,
On Thu, 22 May 2014, Karsten Bräckelmann wrote:
On Thu, 2014-05-22 at 03:12 +0200, Karsten Bräckelmann wrote:
[snip..]
The number of continuation lines equals the number of newlines in the
test-case.
Well, up until 12, that is. :-/
Any number up to 11 of consecutive newlines can be matched
On Thu, 22 May 2014, David B Funk wrote:
On Thu, 22 May 2014, Karsten Bräckelmann wrote:
On Thu, 2014-05-22 at 03:12 +0200, Karsten Bräckelmann wrote:
[snip..]
The number of continuation lines equals the number of newlines in the
test-case.
Well, up until 12, that is. :-/
Any number up to
On Thu, 2014-05-22 at 18:34 -0500, David B Funk wrote:
After doing some experimenting with that code I came up with something that
I'd argue is more semantically correct:
# if we've got a long series of blank lines, limit them
if (defined $start) {
my $max_blank_lines = 20;
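
The quoted Perl is cut off right after the variable declaration, so the actual patch is not visible here. As a rough illustration of the idea only (capping a long series of blank lines), here is a minimal Python sketch. The function name `limit_blank_lines` and the line-by-line approach are assumptions made for this example, not the code from the thread; only the limit of 20 comes from the snippet.

```python
MAX_BLANK_LINES = 20  # value taken from the quoted snippet


def limit_blank_lines(lines, max_blank=MAX_BLANK_LINES):
    """Return a copy of `lines` where no run of blank lines exceeds `max_blank`.

    Hypothetical helper for illustration; not the patch discussed in the thread.
    A line counts as blank if it is empty or contains only whitespace.
    """
    out = []
    run = 0  # length of the current run of blank lines
    for line in lines:
        if line.strip() == "":
            run += 1
            if run > max_blank:
                continue  # drop blank lines beyond the cap
        else:
            run = 0
        out.append(line)
    return out


if __name__ == "__main__":
    body = ["first paragraph"] + [""] * 30 + ["trailing text"]
    print(len(body), "->", len(limit_blank_lines(body)))
```

Runs at or below the cap pass through unchanged, so ordinary paragraph spacing is not affected.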
On Thu, 15 May 2014 12:18:25 -0800
Kevin Miller kevin_mil...@ci.juneau.ak.us wrote:
I implemented a rule that looks for multiple breaks for just that
reason. Can't remember where I stole it from - probably some folks
here helped me with it a few years ago. Can't remember who, but
On Wed, 2014-05-21 at 10:23 -0700, Ian Zimmerman wrote:
I am trying to do a variant of this for text/plain, as that is the type
I mostly face now. But I cannot get it to work.
header __LOCAL_PLAIN_ASCII Content-Type =~ /text\/plain; *charset=us-ascii/i
rawbody __LOCAL_MUCHO_BLANKS
On Wed, 21 May 2014 19:08:51 +0100
Martin Gregorie mar...@gregorie.org wrote:
rawbody __LOCAL_MUCHO_BLANKS /\n{10,}/m
Martin Looking for newlines rather than whitespace? Does /\s{10,}/m
Martin work any better?
Nope, it doesn't :-( Anyway, looking for newlines was my intention,
sorry for the
On Wed, 21 May 2014, Ian Zimmerman wrote:
On Wed, 21 May 2014 19:08:51 +0100
Martin Gregorie mar...@gregorie.org wrote:
rawbody __LOCAL_MUCHO_BLANKS /\n{10,}/m
Martin Looking for newlines rather than whitespace? Does /\s{10,}/m
Martin work any better?
Nope, it doesn't :-( Anyway, looking
On Wed, 2014-05-21 at 10:23 -0700, Ian Zimmerman wrote:
I am trying to do a variant of this for text/plain, as that is the type
I mostly face now. But I cannot get it to work.
rawbody __LOCAL_MUCHO_BLANKS /\n{10,}/m
You don't need the "or more" quantifier at the end of your RE. That just
On Wed, 21 May 2014 22:26:41 +0200
Karsten Bräckelmann guent...@rudersport.de wrote:
Karsten Seriously, the above rule, the shorter /\n{10}/, as well as the
Karsten variant posted by John without quantifier do exactly what you
Karsten asked for. They match 10 consecutive \n newline chars in the
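
The point that /\n{10,}/ and the shorter /\n{10}/ give the same yes/no answer can be checked on plain strings. The sketch below uses Python's `re` module as a stand-in for Perl (the two quantifiers behave the same way in both for this purpose). It tests the regexes only and does not model how SpamAssassin feeds rawbody text to a rule; the helper name `both_match` is made up for this example.

```python
import re

# Python's `re` is used here as a stand-in for the Perl regexes in the rules.
AT_LEAST_TEN = re.compile(r"\n{10,}")  # ten or more consecutive newlines
EXACTLY_TEN = re.compile(r"\n{10}")    # ten consecutive newlines


def both_match(text):
    """Return (matched by \\n{10,}, matched by \\n{10}) for `text`.

    Hypothetical helper for illustration only.
    """
    return (
        AT_LEAST_TEN.search(text) is not None,
        EXACTLY_TEN.search(text) is not None,
    )


if __name__ == "__main__":
    for count in (9, 10, 16):
        sample = "text" + "\n" * count + "more text"
        print(count, both_match(sample))
```

Any run of ten or more newlines necessarily contains a run of exactly ten, which is why the open-ended quantifier adds nothing to an unanchored rule that only needs to hit or miss.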
On Wed, 2014-05-21 at 17:32 -0700, Ian Zimmerman wrote:
The test message does not have that string. Maybe it uses DOS
flavor \r\n. Or what appears to be a bunch of linebreaks
actually has spaces mixed in.
Well, no. I looked at the message (the same data I fed to s.a. --debug)
with
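
The two hypotheses in the quoted text (DOS-style \r\n line endings, or spaces mixed into what look like empty lines) are easy to tell apart on a test string, even though the reply above disputes that either applies to this particular message. A minimal sketch, again using Python's `re` as a stand-in for Perl; the tolerant pattern and the name `classify` are assumptions for illustration, not rules proposed in the thread.

```python
import re

# Strict: ten bare "\n" characters in a row, as in the rule under discussion.
STRICT = re.compile(r"\n{10}")

# Tolerant (an assumption for this sketch): each line break may be preceded
# by spaces or tabs and may be a DOS-style "\r\n" pair.
TOLERANT = re.compile(r"(?:[ \t]*\r?\n){10}")


def classify(text):
    """Report which of the two patterns match `text`. Hypothetical helper."""
    return {
        "strict": STRICT.search(text) is not None,
        "tolerant": TOLERANT.search(text) is not None,
    }


if __name__ == "__main__":
    samples = {
        "unix": "x" + "\n" * 10,
        "dos": "x" + "\r\n" * 10,
        "spaces": "x" + " \n" * 10,
    }
    for name, sample in samples.items():
        print(name, classify(sample))
```

If the strict pattern fails where the tolerant one succeeds, the message contains something other than bare newlines between the apparent blank lines.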
On Fri, 16 May 2014 21:36:22 -0600
Bob Proulx wrote:
David Jones wrote:
James B. Byrne wrote:
If you keep Bayes well trained (assuming you have enough ham to
do so) Bayes poisoning is a myth.
I'm not sure I agree with the myth statement. I just had to
reset my Bayes DB after
On 5/14/2014 5:08 PM, James B. Byrne wrote:
Is there any way to limit Bayes content checking to only the first X
characters of the message body? I ask this because it is clear that the spam
messages getting through contain text meant to poison the tests but this
gibberish always trails the main
On Wed, 14 May 2014, James B. Byrne wrote:
Is there any way to limit Bayes content checking to only the first X
characters of the message body? I ask this because it is clear that the spam
messages getting through contain text meant to poison the tests but this
gibberish always trails the main
On Fri, 16 May 2014 07:22:56 -0400
David F. Skoll d...@roaringpenguin.com wrote:
James Is there any way to limit Bayes content checking to only the
James first X characters of the message body? I ask this because it is
James clear that the spam messages getting through contain text meant
James
I implemented a rule that looks for multiple breaks for just that reason.
Can't remember where I stole it from - probably some folks here helped me
with it a few years ago. Can't remember who, but appreciated the assistance.
On 5/16/2014 2:24 PM, Ian Zimmerman wrote:
On Fri, 16 May 2014 07:22:56 -0400
David F. Skoll d...@roaringpenguin.com wrote:
James Is there any way to limit Bayes content checking to only the
James first X characters of the message body? I ask this because it is
James clear that the spam
On Fri, 16 May 2014 11:24:29 -0700
Ian Zimmerman i...@buug.org wrote:
On close inspection, I see that the hash-busting garbage appended is
(faux) technical computing talk instead of the usual cookbooks or
classical literature :-p That is, scrambled Stack Overflow
discussions and the like.
On 05/14/2014 11:08 PM, James B. Byrne wrote:
Is there any way to limit Bayes content checking to only the first X
characters of the message body? I ask this because it is clear that the spam
messages getting through contain text meant to poison the tests but this
gibberish always trails the
On 05/14/2014 11:08 PM, James B. Byrne wrote:
Is there any way to limit Bayes content checking to only the first X
characters of the message body? I ask this because it is clear that the spam
messages getting through contain text meant to poison the tests but this
gibberish always trails
On Fri, 2014-05-16 at 11:24 -0700, Ian Zimmerman wrote:
In the last few (~10) days, I have seen a marked increase in FNs,
usually with Bayes values in the 50s and 60s.
That's a neutral bayes classification. Other rules should be able to
still identify the spam.
On close inspection, I see that
On Fri, 16 May 2014 16:20:21 -0400
Bowie Bailey bowie_bai...@buc.com wrote:
Keep in mind that BAYES_50 and BAYES_60 still contribute positive
scores by default. Though it is technically a neutral result, it
still adds a point or two to the score.
Rather than messing with Bayes, I would
David Jones wrote:
James B. Byrne wrote:
If you keep Bayes well trained (assuming you have enough ham to do so)
Bayes poisoning is a myth.
I'm not sure I agree with the myth statement. I just had to reset my Bayes
DB after years of it slowly drifting due to bad user input and such.
On Wed, 14 May 2014 17:08:26 -0400
James B. Byrne byrn...@harte-lyne.ca wrote:
Is there any way to limit Bayes content checking to only the first X
characters of the message body? I ask this because it is clear that
the spam messages getting through contain text meant to poison the
tests but