Feature Requests item #1206807, was opened at 2005-05-23 00:33
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=498106&aid=1206807&group_id=61702
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
Status: Open
Priority: 5
Submitted By: Matt (matthew_levine)
Assigned to: Nobody/Anonymous (nobody)
Summary: "Trojan text"

Initial Comment:
Some spam includes long sections of text from random sources, such as
excerpts of classic novels or books of quotes, so that lots of normal,
i.e. hammy, words get the spam past filters. The actual spam content
consists of URLs and possibly images.

An obvious solution would be to search the URLs for spam clues, and you
already have this as an experimental feature. However, that feature only
works for emails that are below a certain threshold of tokens, and the
phony text could easily put a message over that threshold. So I suggest
that either the feature should be able to check URLs in all messages, or
it should also kick in when conditions are fulfilled that indicate the
likely presence of "Trojan text," such as a high number of ham words
along with linked images.

Additionally, I suggest that when this feature causes a message to be
registered as spam, SpamBayes should not be spam-trained on the "Trojan
text." That text was inserted specifically to throw off spam filters, so
the filter should work better if it is ignored.

----------------------------------------------------------------------

_______________________________________________
Spambayes-bugs mailing list
http://mail.python.org/mailman/listinfo/spambayes-bugs
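
The heuristic suggested in the initial comment could be sketched roughly as
follows. This is only an illustration under stated assumptions and does not
use the SpamBayes API: the function names, the thresholds (HAM_CUTOFF,
MIN_HAM_WORDS) and the plain dict of per-word spam probabilities are all
hypothetical, since the request gives no numbers or implementation details.

```python
import re

# Hypothetical thresholds; the request does not specify any values.
HAM_CUTOFF = 0.2      # a word at or below this spam probability counts as "hammy"
MIN_HAM_WORDS = 50    # stand-in for "a high number of ham words"

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
IMG_RE = re.compile(r"<img\b[^>]*\bsrc\s*=", re.IGNORECASE)
WORD_RE = re.compile(r"[A-Za-z']+")


def plain_words(body):
    """Lowercased words of the body, with URLs removed first."""
    return WORD_RE.findall(URL_RE.sub(" ", body).lower())


def count_ham_words(body, word_spamprob):
    """Count words whose (assumed) spam probability marks them as hammy.

    Unknown words default to 0.5, so they are not counted.
    """
    return sum(1 for w in plain_words(body)
               if word_spamprob.get(w, 0.5) <= HAM_CUTOFF)


def looks_like_trojan_text(body, word_spamprob, min_ham_words=MIN_HAM_WORDS):
    """True when the body has linked images and many hammy words."""
    has_linked_image = IMG_RE.search(body) is not None
    return has_linked_image and count_ham_words(body, word_spamprob) >= min_ham_words


def training_tokens(body, word_spamprob):
    """Tokens to spam-train on.

    When Trojan text is suspected, keep only the URLs and ignore the
    filler words; otherwise keep the words and the URLs.
    """
    urls = URL_RE.findall(body)
    if looks_like_trojan_text(body, word_spamprob):
        return urls
    return plain_words(body) + urls
```

For example, a body made of many hammy words followed by a linked image
would be flagged, and only its URL would be kept for spam training. A short
message with no image would be tokenized as usual. A real implementation
would also need to decide how the URL clues themselves are scored, which
this sketch leaves out.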
