On 01.05.2012 02:34, Chad M Stewart wrote:
> On Apr 30, 2012, at 7:18 PM, Stevan Bajić wrote:
>
>> On 30.04.2012 21:29, Chad M Stewart wrote:
>>> [...]
>>> In my model I have zero access to end user's mailboxes.  Thus why I was 
>>> thinking that training via web interface or via email would work.  You're 
>>> right a single button is easier.  To get that in my scenario means I'll 
>>> have to hire someone to write a plugin for Outlook.  Then the Exchange user 
>>> can click one button and have the message automatically forwarded off to 
>>> the single address for fixing.  I could have plugins written for other 
>>> clients as well.  Though that could be a support nightmare.
>>>
>> I consider supporting the Outlook client to be a support nightmare. Every
>> version has its own way of accessing data.
> AGREED!!!  Scares the hell out of me too, but if I had to do it, I'd do it.  
> Though I hope I can come up with a much more elegant solution.
Don't be scared. Your life does not depend on it. It is just a piece of 
code. If it generates income for you or makes your professional life 
easier, then you should not be afraid of it.


>> btw: Why reinvent the wheel? I know that there is open source .NET
>> code that does exactly what you need. One such tool is SpamGrabber
>> ->  http://www.spamgrabber.org/
>>
>> It's not finished, but a good coder should need only a handful of
>> hours to fix it and get it working. Probably a weekend job.
>>
>>
>>> [...]
>>>
>>>
>>> I agree and am using restriction classes within postfix.  I have 100+ 
>>> addresses that are inoculation caliber addresses.  Those addresses have 
>>> minimal smtp level checks enforced, so they get the maximum amount of spam. 
>>>  They are opted out of dspam.  Mail to them is saved off to a directory 
>>> where I'll have a script pickup and pass to dspam as an inoculation message 
>>> for the global user I've defined, who is then in a merged group.
>> You can't teach a kid only what is bad and then expect the kid to
>> know what is good. You need to show both sides of the coin: the good
>> and the bad. So just feeding DSPAM with mostly Spam mail will ruin
>> your setup. You need Ham mail too. If you follow my training advice
>> from before, then you will need way more Ham than Spam, because all
>> the nice features like whitelisting and such will not be there to
>> cover your back. Ham is usually far more diverse than Spam, which is
>> why you need more of it.
> Right.  In my email account I've got 250K msgs, which is probably 95% (or 
> greater) ham.   I don't always clean out the spams that come in from lists 
> like this.
You could use something called boosting to clean up those 250K messages. 
First, copy the 250K messages to another place. Then run a script that 
has DSPAM classify each message as Spam or Ham. Next, go through every 
message classified as Spam and verify that it really is Spam. If it is 
actually Ham (a false positive), train or inoculate that message as Ham. 
If it really is Spam, move it to another folder. Once you have verified 
all the messages classified as Spam in that run, run the classification 
again over the remaining messages. Repeat until DSPAM tells you that 
there are no Spam messages left.

This should clean most of the Spam out of the 250K messages.
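
The boosting loop described above could be sketched roughly like this. 
It is only an illustration, not how DSPAM itself does anything: the 
names boost, classify, is_really_spam and retrain_as_ham are made up 
for this example. classify would wrap whatever call you use to ask 
DSPAM for a verdict, is_really_spam stands for your manual check, and 
retrain_as_ham stands for training or inoculating the false positive.

```python
from typing import Callable, List, Tuple


def boost(
    messages: List[str],
    classify: Callable[[str], str],          # hypothetical: returns "spam" or "ham"
    is_really_spam: Callable[[str], bool],   # hypothetical: your manual verification
    retrain_as_ham: Callable[[str], None],   # hypothetical: train/inoculate a false positive
    max_rounds: int = 20,
) -> Tuple[List[str], List[str]]:
    """Repeat classify -> verify -> retrain until nothing is flagged as spam.

    Returns (remaining_messages, confirmed_spam).
    """
    remaining = list(messages)  # work on a copy, never on the original mailbox
    confirmed_spam: List[str] = []

    for _ in range(max_rounds):
        # First classify everything that is still in the working copy.
        labels = [classify(msg) for msg in remaining]
        if "spam" not in labels:
            break  # the filter reports no spam left, so we are done

        keep: List[str] = []
        for msg, label in zip(remaining, labels):
            if label != "spam":
                keep.append(msg)
            elif is_really_spam(msg):
                confirmed_spam.append(msg)   # real spam: move it out
            else:
                retrain_as_ham(msg)          # false positive: correct the filter
                keep.append(msg)
        remaining = keep

    return remaining, confirmed_spam
```

The max_rounds limit is my own addition so that the loop also stops if 
a false positive keeps being classified as Spam after retraining.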

If you want to clean it even further, you could run SpamAssassin, 
CRM114 or any other anti-spam filter over the remaining Ham messages 
(which DSPAM should already have cleaned pretty well) and check whether 
it finds more Spam.

When you do that, you should disable RBLs and similar online checks 
(URIBL, Razor, Pyzor, DCC, etc.; basically every online check), because 
they will produce false positives/negatives if your data is not very 
fresh.
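
For SpamAssassin, one way to run such a second pass without online 
checks is its local-only mode. A minimal sketch, assuming the 
spamassassin command is installed and on the PATH; the function names 
are made up for this example, and the run parameter only exists so the 
call can be swapped out when testing:

```python
import subprocess
from typing import Callable, List


def offline_check_command() -> List[str]:
    # --local:     local tests only, no network tests
    # --exit-code: exit non-zero if the message was judged to be spam
    return ["spamassassin", "--local", "--exit-code"]


def is_spam_offline(raw_message: bytes, run: Callable = subprocess.run) -> bool:
    """Return True if SpamAssassin flags the message using local tests only.

    Caveat: any non-zero exit status is treated as "spam" here, so a
    failing spamassassin run would also be counted as spam.
    """
    result = run(
        offline_check_command(),
        input=raw_message,            # the message is passed on stdin
        stdout=subprocess.DEVNULL,    # discard the rewritten message
    )
    return result.returncode != 0
```

You would call is_spam_offline() with the raw bytes of each remaining 
Ham message and review by hand whatever it flags.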

If you want to use the result for training, then you should also remove 
newsletters and other corner-case messages.

> -Chad
>
>
>
>


-- 
Kind Regards from Switzerland,

Stevan Bajić


_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user
