https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7674
Bug ID: 7674
Summary: sa-learn learns all messages as ham even if --spam is specified
Product: Spamassassin
Version: 3.4.2
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Learner
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: Undefined
When learning messages from a folder with "sa-learn --spam", the messages are
in fact learned as ham instead of spam.
Debug log:
Jan 1 22:42:12.185 [19522] dbg: bayes: learner_new
self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x56193c244a38),
bayes_store_module=Mail::SpamAssassin::BayesStore::SQL
Jan 1 22:42:12.204 [19522] dbg: bayes: using username: XXX
Jan 1 22:42:12.204 [19522] dbg: bayes: learner_new: got
store=Mail::SpamAssassin::BayesStore::SQL=HASH(0x56193cd9a998)
Jan 1 22:42:12.217 [19522] dbg: bayes: database connection established
Jan 1 22:42:12.218 [19522] dbg: bayes: found bayes db version 3
Jan 1 22:42:12.218 [19522] dbg: bayes: Using userid: 4
Jan 1 22:42:12.219 [19522] dbg: bayes: not available for scanning, only 0
spam(s) in bayes DB < 200
Jan 1 22:42:12.221 [19522] dbg: sa-learn: spamtest initialized
Jan 1 22:42:12.221 [19522] dbg: learn: initializing learner
Jan 1 22:42:12.221 [19522] dbg: bayes: bayes journal sync starting
Jan 1 22:42:12.221 [19522] dbg: bayes: bayes journal sync completed
Jan 1 22:42:12.221 [19522] dbg: bayes: expiry starting
Jan 1 22:42:12.222 [19522] dbg: bayes: database connection established
Jan 1 22:42:12.222 [19522] dbg: bayes: found bayes db version 3
Jan 1 22:42:12.223 [19522] dbg: bayes: Using userid: 4
Jan 1 22:42:12.234 [19522] dbg: bayes: DB expiry: tokens in DB: 430, Expiry
max size: 150000, Oldest atime: 1546240630, Newest atime: 1546309979, Last
expire: 0, Current time: 1546378932
Jan 1 22:42:12.236 [19522] dbg: bayes: expiry completed
Jan 1 22:42:12.238 [19522] dbg: learn: learning ham
Jan 1 22:42:12.258 [19522] dbg: bayes: tokenized body: 3 tokens
Jan 1 22:42:12.258 [19522] dbg: bayes: tokenized uri: 0 tokens
Jan 1 22:42:12.258 [19522] dbg: bayes: tokenized invisible: 0 tokens
Jan 1 22:42:12.261 [19522] dbg: bayes: tokenized header: 159 tokens
Jan 1 22:42:12.355 [19522] dbg: bayes: seen
(6fbb589c1d2d27cf8a150d8345ff08c53ec827fa@sa_generated) put
Jan 1 22:42:12.356 [19522] dbg: bayes: learned
'6fbb589c1d2d27cf8a150d8345ff08c53ec827fa@sa_generated', atime: 1546309979
Note the line "dbg: learn: learning ham" in the log above.
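To check a longer debug log for the same symptom, the "learn: learning ..." lines can be counted. The following is only a minimal sketch: the function name count_learn_modes is hypothetical, and the "learning spam" wording is an assumption by analogy with the "learning ham" line in the log above.

```python
import re

# Matches debug lines such as "... dbg: learn: learning ham".
# The "spam" alternative is assumed by analogy; only "ham" appears in this report.
LEARN_LINE = re.compile(r"dbg: learn: learning (ham|spam)\b")


def count_learn_modes(log_text):
    """Count how many messages the debug log reports as learned as ham or spam."""
    counts = {"ham": 0, "spam": 0}
    for line in log_text.splitlines():
        match = LEARN_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Excerpt taken from the debug log in this report.
sample = """\
Jan 1 22:42:12.236 [19522] dbg: bayes: expiry completed
Jan 1 22:42:12.238 [19522] dbg: learn: learning ham
Jan 1 22:42:12.258 [19522] dbg: bayes: tokenized body: 3 tokens
"""

print(count_learn_modes(sample))
```

If a run started with --spam reports a non-zero "ham" count here, it shows the same behavior as described in this bug.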
The nham/nspam numbers from "sa-learn --dump magic" confirm that the message is
in fact learned as ham and not as spam as intended.
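The nham/nspam comparison can be scripted roughly as follows. This is a sketch under assumptions: it assumes each relevant "sa-learn --dump magic" line ends with "non-token data: nspam" or "non-token data: nham" and carries the count in the third whitespace-separated column. The function name parse_magic_counts is hypothetical, and the sample values are made up for illustration, not taken from the reporter's system.

```python
def parse_magic_counts(dump_text):
    """Extract nspam and nham from text in the assumed '--dump magic' layout."""
    counts = {}
    for line in dump_text.splitlines():
        fields = line.split()
        for name in ("nspam", "nham"):
            # Assumed layout: count in the third column, label at the end of the line.
            if len(fields) >= 3 and line.rstrip().endswith("non-token data: " + name):
                counts[name] = int(fields[2])
    return counts


# Illustrative (made-up) dumps taken before and after one "sa-learn --spam" run.
before_text = """\
0.000          0          0          0  non-token data: nspam
0.000          0          3          0  non-token data: nham
"""
after_text = """\
0.000          0          0          0  non-token data: nspam
0.000          0          4          0  non-token data: nham
"""

before = parse_magic_counts(before_text)
after = parse_magic_counts(after_text)
delta = {name: after[name] - before[name] for name in ("nspam", "nham")}
print(delta)
```

With the bug present, a single --spam run would raise nham while nspam stays unchanged, as in the illustrative values above.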
Messages seem to be learned correctly if learned via autolearn instead of
the sa-learn script.
Bayes data is stored in a MySQL database backend, in case this is relevant.
System is Gentoo Linux x86_64 with the latest distribution SpamAssassin package
(spamassassin-3.4.2-r2).