http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5431

           Summary: Method to test whether a piece of mail has already been
                    learned.
           Product: Spamassassin
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: spamassassin
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]


I would like a method to test whether a piece of mail has already been learned 
(without just re-learning the mail).  Ideally this method would:

- be available as an option on sa-learn as well as a function for programmatic 
use (see below)
- be able to indicate if the already-learned message was learned as spam or ham

this is from a post I made to the users list, and describes a potential use 
case:

>>>
> Try to learn it, if it comes back with something to the effect of:
> "learned from 0 messages, processed 1.." then it's already been  
> learned.

this seems to be the common suggestion.

it has a couple drawbacks, as i see it:

1.  it's relatively cpu-intensive if i want to do it all the time  
(e.g. scan my spam folder to learn only the messages which haven't  
already been learned)

2.  which way do i learn it (as spam or as ham)?
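
for reference, the suggestion quoted above could be scripted roughly as below.  this is only a sketch: the summary wording in the regex is an assumption about what sa-learn prints and may differ between versions, and the function names parse_summary and already_learned are made up for illustration.  it only interprets output that has already been captured, so it does nothing about either drawback: the message still gets learned as a side effect, and you still have to pick spam or ham up front.

```python
import re

# Assumed wording of sa-learn's summary line; verify against your version.
SUMMARY_RE = re.compile(
    r"Learned tokens from (\d+) message\(s\) \((\d+) message\(s\) examined\)"
)


def parse_summary(output):
    """Return (learned, examined) from the summary line, or None if absent."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def already_learned(output):
    """True if messages were examined but none of them were newly learned."""
    counts = parse_summary(output)
    if counts is None:
        raise ValueError("no sa-learn summary line found")
    learned, examined = counts
    return examined > 0 and learned == 0
```

the output string would come from running sa-learn on a single message and capturing what it prints.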

to step back a bit, my final goal is to be able to figure out which  
messages in a folder haven't been learned, and learn only those.  in  
the ideal situation i can also figure out (ahead of time), whether a  
learned message was learned as ham or spam.

this may be semi-impossible.

on the other hand, what can i learn from the headers?

e.g. it looks like autolearn=[something] will tell me about the  
autolearner, but is there anything for manual learns?

where i'm going with all this:

i can run a cron job to learn the contents of different mailboxes on  
a regular basis.  what i do now is have a TrainSpam and TrainHam  
mailbox, and when something gets misfiled (in Spam or any ham folder)  
i just move it in there.  every 5 minutes a cron job goes through and  
scans things appropriately.
<http://www.faisal.com/software/sa-harvest/quicktrain.html>

first, i'd like to be able to do that within the mailboxes rather  
than using special mailboxes.

second, i'd like to be able to key off junk mail flags set by the  
client (thunderbird, apple mail).  i'm using dovecot, so it's a  
fairly simple matter of parsing Maildir filenames, but to do it right  
i need to combine the knowledge with what spamassassin thinks.
<<<
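
To illustrate the Maildir filename parsing mentioned at the end of the post, here is a minimal sketch.  It assumes the usual Maildir convention that flags follow a ":2," marker in the filename, and that lowercase letters (a, b, ...) are keyword slots whose names are listed in order in a separate per-mailbox file, which I believe is how dovecot stores client keywords.  The function names and the keyword names in the usage line are hypothetical; the actual junk keyword depends on the mail client.

```python
def maildir_flags(filename):
    """Return the flag characters after the ':2,' marker, or '' if absent."""
    _, sep, info = filename.rpartition(":2,")
    return info if sep else ""


def keywords_for(filename, keyword_names):
    """Map lowercase flag letters to keyword names by position (a=0, b=1, ...).

    keyword_names is the ordered list of keyword names for the mailbox;
    how that list is obtained is left to the caller.
    """
    result = []
    for ch in maildir_flags(filename):
        if "a" <= ch <= "z":
            index = ord(ch) - ord("a")
            if index < len(keyword_names):
                result.append(keyword_names[index])
    return result
```

For example, keywords_for("1171234567.M1P2.host:2,Sa", ["$Junk", "$NotJunk"]) yields ["$Junk"].  This gives only the client's opinion of the message; whether SpamAssassin has already learned it is exactly the part that the requested method would have to supply.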



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
