>surely, it makes no sense to blow up the database with samples that are
>already 100% classified - you don't even do that unconditionally with a
>hand-trained database (at least not forever; at the beginning it makes
>sense to get additional tokens)

I think you misunderstood my question.  As I look through messages to decide 
whether they should have been learned, I wanted to know whether one that 
Bayes rates as 90-100% likely spam is probably being skipped because it is 
already recognized.  I am not asking why it isn't learning one like that.  
Or maybe I misunderstood your answer.

>but training every single message which is already classified as expected
>would lead to a lot of useless load. it blows up the database and makes
>bayes-poisoning, and the need to purge the whole database and start from
>scratch (with no available corpus, thanks to autotraining), more likely
>than autolearning on its own does

Agree.  And I understand that is not how it is designed.  

>the question of bayes-poisoning is not "if", it's "when and how often"
>and hence after 10 years of experience i stopped that nonsense and keep a
>corpus of currently 120000 messages as eml-files (HAM AND SPAM)

Not arguing the pros and cons of IF one should use it.

I only want to make it work, or better said, verify that it IS working.  Then 
I can decide if I want to keep using it.  Right now, I've never seen it work, 
hence my strong suspicion that it is not working.  One thing is for sure: it 
hasn't found a single spam or ham to auto-learn yet, which seems unlikely if 
it were functioning properly.

The output of "unavailable" is too ambiguous for me to devise a way to 
troubleshoot.  But I'm not an expert with SA.  Thus the plea for assistance in 
seeing if it is working.  If auto-learn isn't working, my expectation is that 
auto-anything-else isn't working either.  Journal maint, etc.
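
To get a less ambiguous picture than one "unavailable" at a time, one option 
is to tally the autolearn= values over a batch of scanned messages.  Below is 
a minimal sketch in Python, assuming the default X-Spam-Status header layout 
(with an autolearn=... field) is in use.  The function name count_autolearn 
and the sample headers are made up for illustration; they are not part of SA.

```python
import re
from collections import Counter

# Assumption: the stock X-Spam-Status header template is in use, so each
# scanned message carries an "autolearn=<value>" field on that header.
AUTOLEARN_RE = re.compile(r"\bautolearn=([a-z_]+)")


def count_autolearn(raw_headers: str) -> Counter:
    """Tally autolearn= values found on X-Spam-Status headers (hypothetical helper)."""
    counts = Counter()
    # Unfold continuation lines so a folded header is read as a single line.
    unfolded = re.sub(r"\r?\n[ \t]+", " ", raw_headers)
    for line in unfolded.splitlines():
        if line.lower().startswith("x-spam-status:"):
            match = AUTOLEARN_RE.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample headers, only to show the call.
    sample = (
        "X-Spam-Status: No, score=1.2 required=5.0 tests=BAYES_50\n"
        "\tautolearn=unavailable\n"
        "Subject: hello\n"
        "X-Spam-Status: Yes, score=9.1 required=5.0 autolearn=spam\n"
    )
    print(count_autolearn(sample))
```

If every message in a reasonably large batch comes back as "unavailable", 
with no "ham", "spam" or "no" at all, that would at least support the 
suspicion that the problem is systematic rather than per-message.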
