On Mon, 2 Aug 2010 12:12:08 +0200 (CEST), "Imposit.com - webmaster"
<webmas...@imposit.com> wrote:
> | LOL. I am not going to publish that. I have made that script for me
> | mainly because I wanted to use TONE (Train On error or Near Error) and
> | the script is done in such a way that I can handle it. Publishing it
> | would expose me to more (unnecessary) questions from people trying to
> | use the script while not understanding anything about the mathematical
> | topic the script is dealing with. This is something that I want to
> | avoid.
> | 
> 
> understandable... too bad :-)
> 
It's nothing personal against you. Not at all. It is just that I don't
want to burn my fingers again with such a delicate topic. Most users of
DSPAM out there are happy with the way DSPAM works and they are happy
with the training script.

I could now go on and publish my script, but the chance that they shoot
themselves in the foot is almost 100%. And I have neither the nerves nor
the time to explain the complex topic of machine learning and artificial
intelligence. So I just leave it. It works for me. It does what I need. And
the currently available training script just works and does what
users/admins want: it trains their DSPAM.

So why risk confusing them with something new, with which they could
very likely shoot themselves in the foot?



> | The other issue is that SPAM at its core is easy to find. A bunch of
> | SPAM messages are enough for something like OSB. But HAM on the other
> | hand can be very complex and diverse. In order to have a good catching
> | rate you should
> 
> 
> Don't worry, I'm aware of that.
> The plan for ham was (with user approval) to take a copy of one week of
> mail from the top 50 users and feed it in as ham.
>
Don't just take inbound mail. If you have the possibility then take
outbound too.


> I wanted to do the same with the quarantine: same users, one week,
> then train them to the globaluser.
> 
> We also have a very, very low filter sensitivity. It's always better
> getting spam than losing a mail.
> 
DSPAM is not going to be the reason for lost mail, since DSPAM does not
block but only filters/tags.


> BTW, I have used the merged group / globaluser setup myself since the
> beginning, but my database is far, far bigger than 400 MB :-) Must be the
> chain thing.
> 
Do you clean that beast? You should!
How have you trained that merged group? With TEFT? TOE? What have you
trained?
I have trained from the beginning in TONE mode and I have used 'boosting'
(don't ask me now what this is. Use Google and friends to find out) while
training. And in each round I have dropped older data from two or three
rounds before. That has helped me to amplify my tokens and purge the
low-signal tokens, thereby keeping my dataset small and compact.
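
For readers who have not met the idea, the "train on error or near error"
rule mentioned above can be sketched roughly as follows. This is only an
illustration and NOT the script discussed in this thread: ToyFilter,
train_tone, the margin of 0.2 and the sample messages are all made up for
the example, the scoring is a toy token-count ratio rather than DSPAM's
algorithm, and neither boosting nor the dropping of older rounds is shown.

```python
import math
from collections import Counter


class ToyFilter:
    """Hypothetical stand-in classifier; not DSPAM's actual algorithm."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}

    def spam_probability(self, text):
        # Each token votes with a smoothed log ratio of its spam/ham counts.
        # Unseen tokens contribute 0, so an untrained filter returns 0.5.
        score = 0.0
        for tok in text.lower().split():
            s = self.counts["spam"][tok]
            h = self.counts["ham"][tok]
            score += math.log((s + 1) / (h + 1))
        score = max(-50.0, min(50.0, score))  # keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-score))

    def learn(self, text, label):
        self.counts[label].update(text.lower().split())


def train_tone(filt, corpus, margin=0.2, rounds=3):
    """Train only on errors or near errors; return trainings per round."""
    trained_per_round = []
    for _ in range(rounds):
        trained = 0
        for text, label in corpus:
            p = filt.spam_probability(text)
            predicted = "spam" if p >= 0.5 else "ham"
            near_error = abs(p - 0.5) < margin  # right, but not confident
            if predicted != label or near_error:
                filt.learn(text, label)
                trained += 1
        trained_per_round.append(trained)
    return trained_per_round


# Made-up sample corpus, for illustration only.
corpus = [
    ("cheap pills buy now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("buy cheap watches now", "spam"),
    ("lunch on monday", "ham"),
]
filt = ToyFilter()
per_round = train_tone(filt, corpus)
print(per_round)
```

The point of the rule is that messages the filter already classifies
confidently add no new tokens, so later rounds train on fewer and fewer
messages and the dataset stays small.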


-- 
Kind Regards from Switzerland,

Stevan Bajić

_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel