On Tue, 30 Sep 2014, terrygalant.li...@fastest.cc wrote:

On Tue, Sep 30, 2014, at 10:57 AM, John Hardin wrote:
Did you run the above as user zimbra?

This time it is,

su - zimbra
$ /opt/zimbra/libexec/sa-learn --dbpath /opt/zimbra/data/amavisd/.spamassassin --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0       1150          0  non-token data: nspam
0.000          0      19454          0  non-token data: nham
0.000          0     120765          0  non-token data: ntokens

You might want to verify the "nspam" and "nham" numbers change as you feed
in messages for training.

Hm.  I resubmitted a bunch that'd been trained, and saw no change

You wouldn't for messages already learned, unless you changed their classification (e.g. a message initially learned as ham and then relearned as spam would move its count from nham to nspam).
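
One way to check this is to pull just the two counters out of the dump before and after a training run and compare them. A minimal sketch, assuming the dump layout shown above (the count is the third column); bayes_counts is a made-up helper name, not part of SpamAssassin or Zimbra:

```shell
# Hypothetical helper: read `sa-learn --dump magic` output on stdin and
# print the nspam/nham message counts (third whitespace-separated column).
bayes_counts() {
  awk '/non-token data: nspam/ { s = $3 }
       /non-token data: nham/  { h = $3 }
       END { print "nspam=" s, "nham=" h }'
}

# Usage (as user zimbra), once before and once after training:
#   /opt/zimbra/libexec/sa-learn --dbpath /opt/zimbra/data/amavisd/.spamassassin --dump magic | bayes_counts
```

If the two runs print the same line, nothing new was learned in between.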

Doing it manually, outside of Zimbra, I exported to text, then re-submitted 15 messages

By "re-submitted" you mean messages that were in the Zimbra training corpora folder?

/opt/zimbra/libexec/sa-learn --dbpath /opt/zimbra/data/amavisd/.spamassassin --dump magic

/opt/zimbra/libexec/sa-learn --dbpath /opt/zimbra/data/amavisd/.spamassassin --no-sync --spam /opt/zimbra/TEMP/spam/*
        Learned tokens from 15 message(s) (15 message(s) examined)

"Learned tokens" with a non-zero number suggests they *haven't* been learned before. If you saw "learned tokens from 0 messages (15 examined)" that'd indicate bayes had already seen the messages.

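To make that distinction easy to spot in a script, the summary line can be reduced to a single "already known" number. This is only a sketch: it assumes the summary wording matches the line quoted above, and skipped_count is a hypothetical helper name.

```shell
# Hypothetical helper: read sa-learn's summary line on stdin and print how
# many examined messages were NOT learned (i.e. bayes had already seen them).
skipped_count() {
  sed -n 's/.*Learned tokens from \([0-9][0-9]*\) message(s) (\([0-9][0-9]*\) message(s) examined).*/\1 \2/p' |
    awk '{ print $2 - $1 }'
}

# Usage:
#   /opt/zimbra/libexec/sa-learn --dbpath /opt/zimbra/data/amavisd/.spamassassin --no-sync --spam /opt/zimbra/TEMP/spam/* | skipped_count
```

A result equal to the number examined would mean every message was already in the database.
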
/opt/zimbra/libexec/sa-learn --dbpath /opt/zimbra/data/amavisd/.spamassassin --sync
        bayes: synced databases from journal in 1 seconds: 2189 unique entries (2189 total entries)

Now appears to have changed

/opt/zimbra/libexec/sa-learn --dbpath /opt/zimbra/data/amavisd/.spamassassin --dump magic
        0.000          0          3          0  non-token data: bayes db version
        0.000          0       1165          0  non-token data: nspam
        0.000          0      19454          0  non-token data: nham

That suggests Zimbra *isn't* actually learning the messages being submitted to the training mailbox.

Try submitting some new ham to the training mailbox and see if the numbers change. (Note: I don't know *when* Zimbra does that learning, or whether it can be initiated manually if the learning is on a schedule.)

Are there any logs indicating that Zimbra has attempted to learn messages, any indicating it succeeded?

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  All I could think about was this bear is so close to me I can
  see its teeth. I could have kissed it. I wished I had a gun.
                                             -- Alyson Jones-Robinson
-----------------------------------------------------------------------
 4 days until the 10th anniversary of SpaceshipOne winning the X-prize
