https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6667
Bug #: 6667
Summary: bayes_use_hapaxes and dubious claim about database size
Product: Spamassassin
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Documentation
AssignedTo: [email protected]
ReportedBy: [email protected]
Classification: Unclassified
The Mail::SpamAssassin::Conf documentation says:
"bayes_use_hapaxes (default: 1)
Should the Bayesian classifier use hapaxes (words/tokens that occur only once)
when classifying? This produces significantly better hit-rates, but increases
database size by about a factor of 8 to 10."
Unless someone can give a good reason to believe the claim about database size
is true, I suggest removing it.
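
One way to sanity-check a size claim like this is to measure what share of the
distinct tokens in a corpus are hapaxes. The sketch below is only an
illustration: it uses a made-up four-message corpus and a plain whitespace
tokenizer, not SpamAssassin's tokenizer or its Bayes database format, and the
function name hapax_size_factor is hypothetical. It says nothing about whether
SpamAssassin actually stores fewer tokens when bayes_use_hapaxes is 0.

```python
from collections import Counter


def hapax_size_factor(messages):
    """Return (distinct_tokens, hapax_tokens, factor).

    factor is the number of distinct tokens divided by the number of
    distinct tokens left after dropping hapaxes, i.e. how much larger a
    token table is when hapaxes are kept. Whitespace tokenization is an
    assumption made for this toy example.
    """
    counts = Counter(tok for msg in messages for tok in msg.split())
    total = len(counts)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    kept = total - hapaxes
    factor = total / kept if kept else float("inf")
    return total, hapaxes, factor


# Hypothetical toy corpus, not real mail.
corpus = [
    "cheap pills buy now",
    "meeting agenda for monday",
    "buy cheap watches now",
    "monday meeting moved",
]

total, hapaxes, factor = hapax_size_factor(corpus)
print(f"distinct={total} hapaxes={hapaxes} factor={factor:.1f}")
```

Running the same count over a real token dump would show whether a factor of
8 to 10 is plausible for a given mail stream.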