[Bug 7674] New: sa-learn learns all messages as ham even if --spam is specified

bugzilla-daemon Tue, 01 Jan 2019 13:56:07 -0800

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7674


            Bug ID: 7674
           Summary: sa-learn learns all messages as ham even if --spam is
                    specified
           Product: Spamassassin
           Version: 3.4.2
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Learner
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: Undefined

When learning messages from a folder with "sa-learn --spam", the messages are
in fact learned as ham instead of spam.

Debug log:
Jan  1 22:42:12.185 [19522] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x56193c244a38), bayes_store_module=Mail::SpamAssassin::BayesStore::SQL
Jan  1 22:42:12.204 [19522] dbg: bayes: using username: XXX
Jan  1 22:42:12.204 [19522] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::SQL=HASH(0x56193cd9a998)
Jan  1 22:42:12.217 [19522] dbg: bayes: database connection established
Jan  1 22:42:12.218 [19522] dbg: bayes: found bayes db version 3
Jan  1 22:42:12.218 [19522] dbg: bayes: Using userid: 4
Jan  1 22:42:12.219 [19522] dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 200
Jan  1 22:42:12.221 [19522] dbg: sa-learn: spamtest initialized
Jan  1 22:42:12.221 [19522] dbg: learn: initializing learner
Jan  1 22:42:12.221 [19522] dbg: bayes: bayes journal sync starting
Jan  1 22:42:12.221 [19522] dbg: bayes: bayes journal sync completed
Jan  1 22:42:12.221 [19522] dbg: bayes: expiry starting
Jan  1 22:42:12.222 [19522] dbg: bayes: database connection established
Jan  1 22:42:12.222 [19522] dbg: bayes: found bayes db version 3
Jan  1 22:42:12.223 [19522] dbg: bayes: Using userid: 4
Jan  1 22:42:12.234 [19522] dbg: bayes: DB expiry: tokens in DB: 430, Expiry max size: 150000, Oldest atime: 1546240630, Newest atime: 1546309979, Last expire: 0, Current time: 1546378932
Jan  1 22:42:12.236 [19522] dbg: bayes: expiry completed
Jan  1 22:42:12.238 [19522] dbg: learn: learning ham
Jan  1 22:42:12.258 [19522] dbg: bayes: tokenized body: 3 tokens
Jan  1 22:42:12.258 [19522] dbg: bayes: tokenized uri: 0 tokens
Jan  1 22:42:12.258 [19522] dbg: bayes: tokenized invisible: 0 tokens
Jan  1 22:42:12.261 [19522] dbg: bayes: tokenized header: 159 tokens
Jan  1 22:42:12.355 [19522] dbg: bayes: seen (6fbb589c1d2d27cf8a150d8345ff08c53ec827fa@sa_generated) put
Jan  1 22:42:12.356 [19522] dbg: bayes: learned '6fbb589c1d2d27cf8a150d8345ff08c53ec827fa@sa_generated', atime: 1546309979

Note the line "dbg: learn: learning ham", logged even though --spam was given.
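
A minimal sketch of how this check could be automated over captured debug
output. The helper names (learned_modes, mismatches) are hypothetical and not
part of SpamAssassin; the sample lines are copied from the log above.

```python
import re

# Matches the learner's mode announcement in the debug output,
# e.g. "... dbg: learn: learning ham".
LEARN_LINE = re.compile(r"dbg: learn: learning (ham|spam)\b")


def learned_modes(debug_output):
    """Return every mode (ham or spam) announced in the debug output."""
    return LEARN_LINE.findall(debug_output)


def mismatches(debug_output, expected):
    """Return the announced modes that differ from the requested one."""
    return [mode for mode in learned_modes(debug_output) if mode != expected]


# Three lines taken from the debug log in this report.
sample = (
    "Jan  1 22:42:12.236 [19522] dbg: bayes: expiry completed\n"
    "Jan  1 22:42:12.238 [19522] dbg: learn: learning ham\n"
    "Jan  1 22:42:12.258 [19522] dbg: bayes: tokenized body: 3 tokens\n"
)

# --spam was requested, so any "ham" announcement is a mismatch.
print(mismatches(sample, "spam"))
```

A non-empty result means the learner announced a different mode than the one
requested on the command line.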

The nham/nspam counts from "sa-learn --dump magic" confirm that the message is
in fact learned as ham and not as spam as intended.
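
The before/after comparison of the counters can be sketched as follows. This
assumes the usual "--dump magic" layout, where the value sits in the third
column and the line ends in "non-token data: <name>". The function name
magic_counts and the nham values in the sample strings are hypothetical, not
taken from the affected system.

```python
def magic_counts(dump_output):
    """Extract the nspam and nham counters from "sa-learn --dump magic" text.

    Assumed line layout: four numeric columns, then "non-token data: <name>",
    with the counter value in the third column.
    """
    counts = {}
    for line in dump_output.splitlines():
        head, sep, name = line.partition("non-token data: ")
        if not sep:
            continue  # not a magic line
        fields = head.split()
        name = name.strip()
        if name in ("nspam", "nham") and len(fields) >= 3:
            counts[name] = int(fields[2])
    return counts


# Hypothetical dumps taken before and after one "sa-learn --spam" run.
before = magic_counts(
    "0.000          0          0          0  non-token data: nspam\n"
    "0.000          0          5          0  non-token data: nham\n"
)
after = magic_counts(
    "0.000          0          0          0  non-token data: nspam\n"
    "0.000          0          6          0  non-token data: nham\n"
)

# With the behaviour described here, nham grows and nspam stays unchanged.
delta = {name: after[name] - before[name] for name in after}
print(delta)
```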

Messages seem to be learned correctly when learned via autolearn instead of
the sa-learn script.

Bayes data is stored in a MySQL database backend, in case this is relevant.

System is Gentoo Linux x86_64 with the latest distribution SpamAssassin package
(spamassassin-3.4.2-r2).

-- 
You are receiving this mail because:
You are the assignee for the bug.
