http://bugzilla.spamassassin.org/show_bug.cgi?id=3876

           Summary: Bayes tokenizer method creates space wasting hash
           Product: Spamassassin
           Version: 3.0.0
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P5
         Component: Learner
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]


The hash that the tokenizer method creates wastes a large amount of memory, and
the nested per-token hash isn't necessary.  Currently it creates entries that
look like this:
$tokens{$key} = { 'raw_token' => $token }
More keys may be added to the hash value later in the scan process, but there is
a much better way to handle that part.

I propose the hash be changed to:
$tokens{$key} = $token
and then expand the %pw hash used in scan.
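
To make the difference concrete, here is a minimal sketch of the two layouts
side by side.  The sample keys and tokens, and the field names inside %pw
(prob, spam_count, ham_count), are assumptions for illustration only and are
not taken from the actual Bayes code.

```perl
use strict;
use warnings;

# Hypothetical sample data: token key => raw token text.
my %sample = (
    'k1' => 'viagra',
    'k2' => 'meeting',
);

# Current layout: one anonymous hash allocated per token.
my %tokens_current;
$tokens_current{$_} = { 'raw_token' => $sample{$_} } for keys %sample;

# Proposed layout: the raw token is stored directly as the value.
my %tokens_proposed;
$tokens_proposed{$_} = $sample{$_} for keys %sample;

# Per-token scan data lives in a separate hash (%pw in this report),
# filled in only for tokens the scan actually looks up.
# Field names and values here are illustrative assumptions.
my %pw;
$pw{'k1'} = { prob => 0.99, spam_count => 12, ham_count => 0 };

for my $key (sort keys %tokens_proposed) {
    my $prob = exists $pw{$key} ? $pw{$key}{prob} : 'n/a';
    print "$key raw=$tokens_proposed{$key} prob=$prob\n";
}
```

With the flat layout, tokens that never get scan data attached cost one scalar
each instead of one scalar plus an anonymous hash.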

This also means that we need to slightly change what is passed into the plugin
hooks (bayes_scan, bayes_forget, bayes_learn) to match the new hash, and we
probably want to include the %pw hash in the bayes_scan call.
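
As a rough sketch of what a bayes_scan handler might receive after such a
change: the argument names toksref and probsref below are hypothetical, chosen
only for this example, and the handler is a standalone function rather than the
real plugin interface.

```perl
use strict;
use warnings;

# Hypothetical handler.  The option names toksref and probsref are
# assumptions for illustration, not the actual plugin API.
sub bayes_scan {
    my ($opts) = @_;
    my $toks = $opts->{toksref};     # key => raw token (flat layout)
    my $pw   = $opts->{probsref};    # key => per-token scan data

    # Report only the tokens that have scan data attached.
    my @seen;
    for my $key (sort keys %$pw) {
        push @seen, "$toks->{$key}=$pw->{$key}{prob}";
    }
    return join(',', @seen);
}

my %tokens = ( k1 => 'viagra', k2 => 'meeting' );
my %pw     = ( k1 => { prob => 0.99 } );

print bayes_scan({ toksref => \%tokens, probsref => \%pw }), "\n";
```

Passing both hashes by reference keeps the raw tokens and the scan data
separate while still letting a plugin join them by key.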



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
