http://bugzilla.spamassassin.org/show_bug.cgi?id=4469
------- Additional Comments From [EMAIL PROTECTED] 2005-07-08 10:40 -------
Subject: Re: Add a process/option to efficiently deal with very long mail
messages
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
> In the original SA3 code, BTW, everything was a temp file. That seemed
> overly complicated, since each part can have multiple versions, etc.,
> so it was converted to the "all in memory" version.
the all-tempfile approach is also slower. the qpsmtpd algorithm is nice,
both for speed and RAM. it goes like this:
    my $buffer;
    my $tmpfile_handle;     # closed and unset
    my $tmpfile_open = 0;
    while (reading) {
        if (size > some_limit) {
            if (!$tmpfile_open) {
                $tmpfile_open = 1;
                # generate tmpfile name
                # open tmpfile, if not already open
            }
            # write to $tmpfile_handle
        }
        else {
            # add to buffer
        }
    }
so the benefit is that the buffer contains the text part we're prepared to
scan, and the tmpfile is only ever opened (and disk I/O incurred) for
massive mails.
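
A minimal runnable sketch of that loop, for illustration only. It is not
the actual qpsmtpd or SpamAssassin code: the name read_message and the
$limit parameter are hypothetical, and it assumes line-oriented input and
that File::Temp is acceptable for creating the tmpfile.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not a qpsmtpd or SpamAssassin API): read a message
# from $in, keep up to $limit bytes in memory, spill the rest to a tmpfile.
sub read_message {
    my ($in, $limit) = @_;
    my $buffer = '';
    my ($tmp_fh, $tmp_name);    # both stay undef for small messages

    while (defined(my $line = <$in>)) {
        # Once we have spilled, everything else goes to the tmpfile too,
        # so the on-disk part stays contiguous and in order.
        if ($tmp_fh || length($buffer) + length($line) > $limit) {
            # Open the tmpfile lazily, on the first overflow only.
            ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1) unless $tmp_fh;
            print {$tmp_fh} $line;
        }
        else {
            $buffer .= $line;
        }
    }

    # Rewind so the caller can read the spilled part back.
    seek($tmp_fh, 0, 0) if $tmp_fh;
    return ($buffer, $tmp_fh);
}
```

For a message under the limit the second return value is undef and no disk
I/O happens at all. Otherwise the caller scans $buffer and only touches the
handle if it really needs the tail.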
> > This would allow us to scan even 100MB mails without breaking a sweat and
> > causing all those FAQs on the users list. ;)
>
> Well, yes and no. There's still the hit of storing the message in memory,
> at least once, when it's initially read in. We could store the pristine
> body in a temp file, but then any full rules or the rewrite at the end
> will cause that to come back in.
full rules: change the semantics to only match the first 250k of the
message data
rewrite: add a new iterator interface as well as the old all-in-RAM
interface
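
Both ideas could be sketched roughly as below. This is illustration only,
not SpamAssassin's API: make_body_iterator, full_rule_matches, $chunk_size
and $cap are hypothetical names, and the cap is passed in by the caller
rather than hard-coded to 250k.

```perl
use strict;
use warnings;

# Hypothetical iterator interface: returns a closure that yields the
# message in chunks of at most $chunk_size bytes, then undef at EOF.
sub make_body_iterator {
    my ($fh, $chunk_size) = @_;
    return sub {
        my $n = read($fh, my $chunk, $chunk_size);
        return $n ? $chunk : undef;    # undef on EOF or read error
    };
}

# Hypothetical "full" rule check that only ever looks at the first
# $cap bytes of the message, however large the message is.
sub full_rule_matches {
    my ($next, $regex, $cap) = @_;
    my $seen = '';
    while (length($seen) < $cap) {
        my $chunk = $next->();
        last unless defined $chunk;
        $seen .= $chunk;
    }
    return substr($seen, 0, $cap) =~ $regex ? 1 : 0;
}
```

The rewrite step could then pull chunks from the same kind of iterator and
write them straight out, while callers that want the old behaviour join all
the chunks into one string.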
- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS
iD8DBQFCzro2MJF5cimLx9ARAhv+AJ9KvZcVbkPlBKOGmo7wIRrFIzgWsACgmCXT
mEDzMudMpTcoZwDKkkrzjJc=
=Mf8Z
-----END PGP SIGNATURE-----