http://bugzilla.spamassassin.org/show_bug.cgi?id=3876
Summary: Bayes tokenizer method creates space wasting hash
Product: Spamassassin
Version: 3.0.0
Platform: Other
OS/Version: other
Status: NEW
Severity: normal
Priority: P5
Component: Learner
AssignedTo: [email protected]
ReportedBy: [EMAIL PROTECTED]
The hash that the Bayes tokenizer method builds wastes a large amount of memory
for no benefit, because every entry allocates an anonymous hash that holds a
single key. Currently it creates hash entries that look like this:
$tokens{$key} = { 'raw_token' => $token }
More keys might be added to the hash value later in the scan process, but there
is a much better way to handle that part.
I propose the hash be changed to:
$tokens{$key} = $token
and that the %pw hash used in scan be expanded to carry the data that would
otherwise be added to each entry later.
This also means that what is passed into the Plugin hooks (bayes_scan,
bayes_forget, bayes_learn) needs to change slightly to match the new hash, and
we probably want to include the %pw hash in the bayes_scan call.
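As a rough illustration of the proposal, here is a minimal sketch that builds
both layouts side by side. The keys, the sample tokens, and the 'prob' field
stored in %pw are hypothetical placeholders; the real key derivation and the
real contents of %pw are not shown in this report.

```perl
use strict;
use warnings;

# Hypothetical sample tokens; the real tokenizer derives these from a message.
my @raw = ('viagra', 'hello', 'H*r:example');

# Current layout: one anonymous hash per token, holding a single key.
my %tokens_old;
$tokens_old{"k$_"} = { 'raw_token' => $raw[$_] } for 0 .. $#raw;

# Proposed layout: the raw token is stored directly as the hash value.
my %tokens_new;
$tokens_new{"k$_"} = $raw[$_] for 0 .. $#raw;

# Per-token scan data goes into a separate hash instead of being added to
# each token entry. The 'prob' field is an assumed shape for %pw.
my %pw;
for my $key (keys %tokens_new) {
    $pw{$key} = { prob => 0.5 };   # placeholder value, not a real Bayes result
}

# Both layouts give access to the same raw token for a given key.
for my $key (sort keys %tokens_new) {
    my $old = $tokens_old{$key}{raw_token};
    my $new = $tokens_new{$key};
    print "$key: $new\n" if $old eq $new;
}
```

With the flat layout, a plugin hook that receives the token hash would read
$tokens{$key} directly rather than $tokens{$key}{raw_token}, which is why the
hook arguments need to change.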