http://bugzilla.spamassassin.org/show_bug.cgi?id=4469





------- Additional Comments From [EMAIL PROTECTED]  2005-07-08 10:40 -------
Subject: Re: Add a process/option to efficiently deal with very long mail messages

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> In the original SA3 code, BTW, everything was a temp file.  That seemed
> overly complicated, since each part can have multiple versions, etc.,
> so it was converted to the "all in memory" version.

using temp files for everything is also slower. the qpsmtpd algorithm is
nice for both speed and RAM; it goes like this:

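  use File::Temp qw(tempfile);   # assumption: any tmpfile mechanism would do

  my $some_limit = 250 * 1024;   # illustrative threshold, in bytes
  my $buffer = '';               # in-memory part of the message
  my $size = 0;                  # bytes read so far
  my $tmpfile_handle;            # undef until the limit is exceeded
  while (defined(my $line = <STDIN>)) {   # STDIN stands in for the mail source
    $size += length $line;
    if ($size > $some_limit) {
      if (!$tmpfile_handle) {
        # first overflow: generate a tmpfile name and open it
        ($tmpfile_handle) = tempfile(UNLINK => 1);
      }
      # write the overflow to $tmpfile_handle
      print {$tmpfile_handle} $line;
    }
    else {
      # add to buffer
      $buffer .= $line;
    }
  }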

so the benefit is that the buffer contains the text part we're prepared to
scan, and the tmpfile is only ever opened (and disk I/O incurred) for
massive mails.

> > This would allow us to scan even 100MB mails without breaking a sweat and
> > causing all those FAQs on the users list. ;)
> 
> Well, yes and no.  There's still the hit of storing the message in memory,
> at least once, when it's initially read in.  We could store the pristine
> body in a temp file, but then any full rules or the rewrite at the end
> will cause that to come back in.

full rules: change the semantics to only match the first 250k of the
message data
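
a rough sketch of what that cap could look like. this is not actual
SpamAssassin code: the helper name full_rule_text() is made up for
illustration, and "250k" is assumed to mean 250 * 1024 bytes.

```perl
use strict;
use warnings;

# Hypothetical cap: "250k" taken as 250 * 1024 bytes (an assumption).
use constant FULL_RULE_LIMIT => 250 * 1024;

# Hypothetical helper: return at most the first FULL_RULE_LIMIT bytes
# of the message, so "full" rules never see more than that.
sub full_rule_text {
    my ($msg_ref) = @_;
    return substr($$msg_ref, 0, FULL_RULE_LIMIT);
}

# Usage: a pattern that only occurs past the cap does not match.
my $msg = ('x' x FULL_RULE_LIMIT) . "VIAGRA";
my $hit = full_rule_text(\$msg) =~ /VIAGRA/ ? 1 : 0;
print "hit=$hit\n";
```

the trade-off is that anything past the cap is invisible to full rules,
which is the semantic change being proposed.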

rewrite: add a new iterator interface alongside the old all-in-RAM
interface
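
one way the two interfaces could sit side by side, as a sketch only.
make_line_iterator() and slurp_message() are hypothetical names, not
existing SpamAssassin API, and the code assumes the message is held as an
in-memory buffer plus an optional spill tmpfile, as in the algorithm above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical iterator: yields the message line by line, first from the
# in-memory buffer, then from the spill tmpfile (if one was opened).
sub make_line_iterator {
    my ($buffer_ref, $tmp_fh) = @_;
    open(my $mem_fh, '<', $buffer_ref) or die "cannot read buffer: $!";
    seek($tmp_fh, 0, 0) if $tmp_fh;       # rewind the spill file
    my @sources = grep { defined } ($mem_fh, $tmp_fh);
    return sub {
        while (@sources) {
            my $line = readline($sources[0]);
            return $line if defined $line;
            shift @sources;                # source exhausted, try the next
        }
        return;                            # undef: end of message
    };
}

# Old-style all-in-RAM interface, built on top of the iterator.
sub slurp_message {
    my ($next) = @_;
    my $all = '';
    while (defined(my $line = $next->())) { $all .= $line }
    return $all;
}

# Usage: two lines in memory, one spilled to disk.
my $buffer = "line 1\nline 2\n";
my ($tmp_fh) = tempfile(UNLINK => 1);
print {$tmp_fh} "line 3\n";
my $next = make_line_iterator(\$buffer, $tmp_fh);
my $count = 0;
$count++ while defined $next->();
print "lines=$count\n";
```

callers that can work a line at a time (like the rewrite) would use the
iterator and never pull the whole message into RAM; callers that still
need everything at once keep the slurping interface.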

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCzro2MJF5cimLx9ARAhv+AJ9KvZcVbkPlBKOGmo7wIRrFIzgWsACgmCXT
mEDzMudMpTcoZwDKkkrzjJc=
=Mf8Z
-----END PGP SIGNATURE-----





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
