Re: [Dspam-user] Dspam learning and limited factors

Julien Vehent Fri, 24 Sep 2010 10:10:28 -0700

On Fri, 24 Sep 2010 12:24:16 +0200, Julien Vehent <[email protected]> wrote:
> Hello Dspam list,
> 
> I'm currently testing a setup of dspam 3.9.1 and, after a few weeks of 
> running it, it seems that learning doesn't happen.
> I have the same X-DSPAM-Confidence: 0.6589 and X-DSPAM-Probability: 0.3411 
> over and over again...
> 
> I first assumed that, considering my low ratio of spam (around 1.5%, 
> greylisting is working well already), it was normal for Dspam to take some 
> time to populate the dictionary. But now, after testing the same email 
> several times and not seeing any change in the probability, I tend to doubt 
> my configuration.
> 
> Here are my statistics:
> 
> ----
> # dspam_stats [email protected]
> [email protected]  TP:     0 TN:  4175 FP:     0 FN:    47 SC:     0 NC:     0
> 
> # dspam_dump [email protected] |wc -l
> 2226565
> 
> # dspam_dump [email protected] |head
> 16562406131549251947 S: 00000  I: 00001  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 6865589383638280368  S: 00000  I: 00001  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 9170080104461015804  S: 00000  I: 00003  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 5926792591516926582  S: 00000  I: 00003  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 14071509996661893476 S: 00000  I: 00001  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 12628252971455951009 S: 00000  I: 00022  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 8466811669071571900  S: 00000  I: 00003  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 5005764671929148649  S: 00000  I: 00001  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 15031205318376739124 S: 00000  I: 00001  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> 7946915292529287299  S: 00000  I: 00001  P: 0.5000 LH: Fri Sep 24 12:00:31 2010
> terminated.
> ----
> 
> If I send myself a spam email containing the token 'viagra+spam', I receive 
> the email normally, with confidence 0.6589 and probability 0.3411.
> 
> The token exists and is marked as innocent:
> 
> ----
> # dspam_dump [email protected] viagra+spam
> 3709864854363566654  S: 00000  I: 00001  P: 0.5000
> ----
> 
> Now, I relearn this email as spam using the web interface. The token is now 
> marked as spam:
> 
> ----
> # dspam_dump [email protected] viagra+spam
> 3709864854363566654  S: 00001  I: 00000  P: 0.5000
> ----
> 
> Why hasn't the probability changed?
> I resend the exact same email one more time. The token counts are updated, 
> but marking it as spam a second time still leaves P at 0.5000...
> 
> ----
> # dspam_dump [email protected] viagra+spam
> 3709864854363566654  S: 00001  I: 00001  P: 0.5000
> 
> === mark as spam ===
> # dspam_dump [email protected] viagra+spam
> 3709864854363566654  S: 00002  I: 00000  P: 0.5000
> 
> ----
> 
> The configuration is here: http://jve.linuxwall.info/dump/dspam.conf.html (or 
> without the .html for the text file).
> 
> 
> Did I do something wrong? Is it due to using 'markov' for PValue? Should I 
> switch back to bcr?
> 
> 
> Thanks,
> Julien
> 
>


Alright, I switched PValue from markov to bcr and I now receive messages marked 
as spam, as expected.
So, what is it about markov? Did I miss something? Is the support broken?
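
For anyone who wants to check the same symptom on their own dictionary, here is 
a minimal Python sketch that counts how many dumped tokens still sit at 
P: 0.5000. It is not part of dspam; the function name summarize and the regular 
expression are my own, and it assumes the dspam_dump line layout shown in the 
quoted output above (token id, S:, I:, P:, optional LH:).

```python
import re

# Assumed layout, taken from the dump output quoted above:
#   <token id>  S: <spam hits>  I: <innocent hits>  P: <probability> [LH: <date>]
TOKEN_RE = re.compile(r"^(\d+)\s+S:\s*(\d+)\s+I:\s*(\d+)\s+P:\s*([0-9.]+)")


def summarize(lines):
    """Count parsed tokens, tokens at P == 0.5, and tokens with spam hits."""
    total = neutral = with_spam_hits = 0
    for line in lines:
        match = TOKEN_RE.match(line.strip())
        if not match:
            continue  # skips 'terminated.' and anything else that does not parse
        total += 1
        if int(match.group(2)) > 0:
            with_spam_hits += 1
        if abs(float(match.group(4)) - 0.5) < 1e-9:
            neutral += 1
    return {"total": total, "neutral": neutral, "with_spam_hits": with_spam_hits}


# Two lines copied from the dumps above, plus the trailing 'terminated.' line.
sample = [
    "3709864854363566654  S: 00002  I: 00000  P: 0.5000",
    "12628252971455951009 S: 00000  I: 00022  P: 0.5000 LH: Fri Sep 24 12:00:31 2010",
    "terminated.",
]
print(summarize(sample))
```

To run it against a real dictionary, one could pass sys.stdin to summarize() 
and pipe the dspam_dump output into the script. If "neutral" stays equal to 
"total" even for tokens that have spam hits, the stored P column is not being 
updated, which is what I see with markov.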

Thanks,
Julien

