Hi,

Encouraged by one of the developers, I have decided to post this question on 
this forum to see what you think about the idea.

Since DSPAM can categorise email as spam/nonspam, it should also be able to 
categorise HTML content in the same way.

The idea is to use a proxy like Squid, pass each web response to an ICAP server 
like c-icap, and have c-icap pass the actual HTML content to DSPAM using a 
native C++ function call.

We would be looking for DSpam to categorise the content into a few major 
categories such as Adult/Shopping/Music etc.

This is a major deviation from email scanning but I believe the actual process 
should be very similar.

Some code changes will be required, as DSPAM expects content in the form of an 
email message with headers etc., and an HTML page is structured differently.
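
To make that concrete, here is a rough sketch (in Python, purely for 
illustration) of wrapping a fetched page in a minimal email-style envelope so 
that a filter expecting a mail message could still parse it. The helper name 
html_to_message, the placeholder addresses and the X-Source-URL header are all 
made up for this example; none of this is DSPAM's API.

```python
from email.message import EmailMessage


def html_to_message(url: str, html: str) -> EmailMessage:
    """Wrap an HTML page in a minimal email-style message (hypothetical helper)."""
    msg = EmailMessage()
    # Synthetic headers: a mail filter normally expects at least these.
    msg["From"] = "proxy@localhost"
    msg["To"] = "classifier@localhost"
    msg["Subject"] = url
    msg["X-Source-URL"] = url  # assumed custom header, for illustration only
    # Store the page as a text/html body, as an HTML email would carry it.
    msg.set_content(html, subtype="html")
    return msg


if __name__ == "__main__":
    page = "<html><body><h1>Cheap guitars</h1></body></html>"
    print(html_to_message("http://example.com/shop", page).as_string())
```

Whether synthetic headers like these would help or just add noise to the 
token database is an open question; stripping them again before tokenising 
might turn out to be the better option.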

I suspect the two major challenges are:

a) HTML requires multiple categories, whereas mail only needs spam/nonspam.
b) Real-time HTML processing requires the classification to be done within 
milliseconds (50-60 ms at most), whereas mail is less latency-sensitive.
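
To illustrate both points, below is a small sketch of a multi-category 
classifier with a timing check around it. It is a generic multinomial naive 
Bayes written in Python as a stand-in, not DSPAM's actual algorithm or API, 
and the class and function names are invented for this example. The 50 ms 
default budget is simply the figure mentioned above.

```python
import math
import re
import time
from collections import Counter, defaultdict


class MultiCategoryClassifier:
    """Tiny multinomial naive Bayes over word tokens (illustrative only)."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # category -> token -> count
        self.doc_counts = Counter()               # category -> trained documents
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Strip tags crudely, then split into lowercase words.
        return re.findall(r"[a-z]+", re.sub(r"<[^>]*>", " ", text.lower()))

    def train(self, category, text):
        tokens = self.tokenize(text)
        self.token_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocabulary.update(tokens)

    def classify(self, text):
        tokens = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, counts in self.token_counts.items():
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(self.doc_counts[category] / total_docs)
            denom = sum(counts.values()) + len(self.vocabulary)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


def classify_with_budget(classifier, html, budget_ms=50.0):
    """Return (category, elapsed_ms, within_budget) for one page."""
    start = time.perf_counter()
    category = classifier.classify(html)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return category, elapsed_ms, elapsed_ms <= budget_ms


if __name__ == "__main__":
    clf = MultiCategoryClassifier()
    clf.train("Shopping", "<p>buy cheap shoes cart checkout price</p>")
    clf.train("Music", "<p>album guitar concert band song</p>")
    print(classify_with_budget(clf, "<h1>new album and concert dates</h1>"))
```

With more than two categories the classifier just picks the highest-scoring 
one, so point (a) is mostly a question of how the training data is stored. 
Point (b) would need measuring against real pages and a real token database.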

Am I crazy for trying this?

Thanks

Daniel



_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user
