Re: [Mimedefang] Please review: new Spamc feature

2005-10-26 Thread David F. Skoll
Richard Laager wrote:

> Use the embedded Perl feature of MIMEDefang and use compile_now() from
> SpamAssassin. That way, the SpamAssassin initialization is done once.
> fork() on Linux (and Unix in general, I believe) is very lightweight.
> The SpamAssassin stuff in memory will be shared by all the threads.

That's the theory.  Unfortunately, reality is more brutal.

Perl uses reference-counting garbage collection.  Which means that if
a process even looks at a chunk of data, the reference count gets
incremented.  Which touches the memory page.  Which means no sharing.

Embedded Perl helps a bit.  But unfortunately, it doesn't help anywhere
near as much as it should help because of Perl's #*$*#$ reference-counting.
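
To make the reference-counting point concrete, here is a minimal sketch using the core B module. It only shows that taking a second reference to a scalar changes the count Perl stores alongside the value, even though the string itself is untouched; it does not measure page sharing. The helper name refcount_of and the sample string are made up for the illustration.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the interpreter's reference count for the
# scalar that $ref points at.
sub refcount_of {
    my ($ref) = @_;
    return B::svref_2object($ref)->REFCNT;
}

my $data   = "rule text loaded before fork";   # made-up sample value
my $first  = \$data;
my $before = refcount_of($first);

my $second = \$data;               # the string is not modified here...
my $after  = refcount_of($first);  # ...but the count kept beside it is

print "refcount before: $before, after: $after\n";
```

The count lives in the same memory as the rest of the scalar's bookkeeping, so a write to it is a write to that page.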

Regards,

David.
___
Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list
MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang


RE: [Mimedefang] Please review: new Spamc feature

2005-10-26 Thread Matthew.van.Eerde
Matthew van Eerde wrote:
> unshift (@msg, @sahdrs);

Hmm... I think I need to add a comment here...

# Note the new headers end up *before* the pre-existing headers.
# This is important for things like DomainKeys.
unshift (@msg, @sahdrs);

-- 
Matthew.van.Eerde (at) hbinc.com   805.964.4554 x902
Hispanic Business Inc./HireDiversity.com   Software Engineer



[Mimedefang] Please review: new Spamc feature

2005-10-25 Thread Matthew.van.Eerde
I've broken down and coded a mimedefang-filter that calls spamc instead of 
use'ing Mail::SpamAssassin.

I'd ideally like to post this on the wiki for those who might find it useful... 
but I'm interested in feedback first. Can you glance over the code and tell me 
what you think?

The idea is to trim down the MIMEDefang threads to be lightweight, so I can 
make a whole lot of them.  I do all sorts of things w/ MIMEDefang besides 
spam-scan, and while the MIMEDefang threads are doing all these things, that 
SpamAssassin module is sitting there idle, but taking up space.  I do have a 
performance hit from spawning the spamc process, but I thought I'd experiment 
to see if the tradeoff is a net benefit or loss.

This requires a working spamd setup and a custom spamd report template.

Modifications to /etc/mail/spamassassin/local.cf:
# This changes the default report template to one which is easier to parse
# but not necessarily easier to read
# only the first five report lines are used by MIMEDefang...
# the rest can contain custom flavor text
# note the characters after each colon must be a TAB (\t)
clear_report_template
report Score:	_SCORE()_
report Required:	_REQD_
report Tests:	_TESTS(,)_
report
report _SUMMARY_

Modifications to mimedefang-filter:

# Outside of any function, but
# before detect_and_load_perl_modules();

$Features{SpamAssassin} = 0; # false but defined
$Features{Spamc} = 1; # new feature

# this function creates a spamc-scannable version of the message
# this is basically INPUTMSG with some extra headers
# the headers imitate what sendmail will eventually add anyway
sub create_spamc_scan_file()
{
    # code liberally stolen from
    # mimedefang.pl's spam_assassin_mail
    open(IN, "<./INPUTMSG") or return undef;
    my @msg = <IN>;
    close(IN);

    # Synthesize a Return-Path and Received: header
    my @sahdrs;
    push (@sahdrs, "Return-Path: $Sender\n");
    push (@sahdrs, split(/^/m, synthesize_received_header()));
    push (@sahdrs, gen_msgid_header()) if ($MessageID eq "NOQUEUE");

    unshift (@msg, @sahdrs);

    open(FORSPAMC, ">./FORSPAMC") or return undef;
    print FORSPAMC @msg;
    close(FORSPAMC);

    return 1;
}

# in filter_end, instead of the
# if ($Features{SpamAssassin}) block:
if (
    $Features{Spamc} and # is there spamc?
    -s "./INPUTMSG" < 100 * 1024 and # don't scan messages over 100KB
    create_spamc_scan_file() # create a spamc-scannable file
)
{

    my $forcespamreport = 0;

    # this if() is another custom feature, unrelated to spamc
    # if this is to [EMAIL PROTECTED]
    # and ONLY to [EMAIL PROTECTED]
    # then force a spam report even if the message isn't spam
    if (
        @Recipients == 1 and
        $Recipients[0] =~ /^[EMAIL PROTECTED]?$/i
    )
    {
        $forcespamreport = 1;
    }

    # spamc options
    # -r shows a report only if it's spam
    # -R shows a report whether it's spam or not
    my $r = ($forcespamreport ? "-R" : "-r");

    my $report = `spamc $r < ./FORSPAMC`;
    unlink('./FORSPAMC');

    if ($report eq "")
    {
        # not spam! nothing to do.
    } elsif ( $report =~
        /^
        Score:  \t  ([\d\.]+?)  \n
        Required:   \t  ([\d\.]+?)  \n
        Tests:  \t  ([\w,]+?)   \n
        \n
        /x
    )
    {
        my $score = $1;
        my $required = $2;
        my $tests = $3;
        my $stars = "*" x ($score < 40 ? int($score) : 40);


        if ($forcespamreport)
        {
            action_add_part(
                $entity,
                "text/plain",
                "-suggest",
                $report . "\n",
                "SpamAssassinReport.txt", "inline"
            );
        }

        if ($score >= $required)
        {
            action_change_header(
                "X-Spam-Score",
                $stars . " (" . $score . ") " . $tests
            );

            action_delete_all_headers("Subject");
            action_add_header(
                "Subject",

Re: [Mimedefang] Please review: new Spamc feature

2005-10-25 Thread John Nemeth
On Mar 17,  5:37am, [EMAIL PROTECTED] wrote:
}
} I've broken down and coded a mimedefang-filter that calls spamc
} instead of use'ing Mail::SpamAssassin.
} 
} I'd ideally like to post this on the wiki for those who might find it
} useful... but I'm interested in feedback first. Can you glance over
} the code and tell me what you think?
} 
} The idea is to trim down the MIMEDefang threads to be lightweight, so
} I can make a whole lot of them.  I do all sorts of things w/
} MIMEDefang besides spam-scan, and while the MIMEDefang threads are
} doing all these things, that SpamAssassin module is sitting there
} idle, but taking up space.  I do have a performance hit from spawning
} the spamc process, but I thought I'd experiment to see if the
} tradeoff is a net benefit or loss.

 Why don't you create a function to call spamd directly, similar to
the way that MIMEDefang calls clamd?  That way, you won't have the
spamc process overhead?
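
A rough sketch of what talking to spamd directly might look like is below. It assumes spamd's plain-text protocol (a command line such as REPORT SPAMC/1.2, a Content-length header, a blank line, then the message) and a TCP listener; the function names, the host and port arguments, and the canned reply are all made up for illustration, and this is not how MIMEDefang itself does anything.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Build a spamd request for an already-assembled message.
# Content-length is counted in bytes, so $message should be a byte string.
sub build_spamd_request {
    my ($command, $message) = @_;
    return "$command SPAMC/1.2\r\n"
         . "Content-length: " . length($message) . "\r\n"
         . "\r\n"
         . $message;
}

# Pull the status line and the "Spam:" header out of a spamd reply.
sub parse_spamd_response {
    my ($response) = @_;
    return unless defined $response;
    my ($head, $body) = split /\r?\n\r?\n/, $response, 2;
    return unless defined $head;

    my ($code, $status) = $head =~ m{\ASPAMD/[\d.]+\s+(\d+)\s+(\S+)} or return;
    my %result = (code => $code, status => $status,
                  body => defined $body ? $body : '');

    if ($head =~ m{^Spam:\s*(\w+)\s*;\s*(-?[\d.]+)\s*/\s*(-?[\d.]+)}mi) {
        @result{qw(is_spam score required)} = (lc($1) eq 'true' ? 1 : 0, $2, $3);
    }
    return \%result;
}

# Hypothetical wrapper: send one message to spamd and parse the reply.
# Host and port are whatever the local spamd listens on.
sub ask_spamd {
    my ($message, $host, $port) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host, PeerPort => $port, Proto => 'tcp', Timeout => 10,
    ) or return;
    print {$sock} build_spamd_request('REPORT', $message);
    $sock->shutdown(1);            # finished writing; now read the reply
    my $response = do { local $/; <$sock> };
    close $sock;
    return parse_spamd_response($response);
}

# Offline demonstration with a made-up reply, so no spamd is needed to run it.
my $canned = "SPAMD/1.1 0 EX_OK\r\nContent-length: 12\r\n"
           . "Spam: True ; 7.3 / 5.0\r\n\r\nreport text\n";
my $reply = parse_spamd_response($canned);
print "spam=$reply->{is_spam} score=$reply->{score}\n" if $reply;
```

Keeping the request builder and the reply parser separate from the socket code means both can be checked without a running spamd.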

}-- End of excerpt from [EMAIL PROTECTED]


Re: [Mimedefang] Please review: new Spamc feature

2005-10-25 Thread Richard Laager
On Tue, 2005-10-25 at 11:01 -0700, [EMAIL PROTECTED] wrote:
> I do all sorts of things w/ MIMEDefang besides spam-scan,
> and while the MIMEDefang threads are doing all these things,
> that SpamAssassin module is sitting there idle, but taking
> up space.

Use the embedded Perl feature of MIMEDefang and use compile_now() from
SpamAssassin. That way, the SpamAssassin initialization is done once.
fork() on Linux (and Unix in general, I believe) is very lightweight.
The SpamAssassin stuff in memory will be shared by all the threads.

I do this, and ... unless I'm very confused ;) ... it saves TONS of
memory.
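
The general pattern, reduced to a sketch that runs without MIMEDefang or SpamAssassin: do the expensive setup once in the parent, then fork workers that inherit the result. Here expensive_init is a stand-in for something like compile_now(), and run_workers is a made-up name; how many pages actually stay shared afterwards is a separate question that this sketch does not measure.

```perl
use strict;
use warnings;
use POSIX ();

# Stand-in for expensive one-time setup; it only builds a lookup table
# so that the sketch has something to inherit.
sub expensive_init {
    my %rules = map { ("RULE_$_" => $_ * 0.1) } 1 .. 1000;
    return \%rules;
}

# Initialize once in the parent, then fork $count children that use the
# inherited data. Returns how many children reported a problem.
sub run_workers {
    my ($count) = @_;
    my $rules = expensive_init();
    my @pids;
    for my $n (1 .. $count) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: the table is already there; no re-initialization.
            my $exit = exists $rules->{"RULE_$n"} ? 0 : 1;
            POSIX::_exit($exit);   # skip END blocks and buffered output
        }
        push @pids, $pid;
    }
    my $failures = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $failures++ if $? != 0;
    }
    return $failures;
}

print "workers failed: ", run_workers(4), "\n";
```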

Richard

