I have an email account that, although filtered by SpamAssassin, still receives a significant amount of spam. From what I can tell, I could exclude almost all of it if I could filter on the primary language of each message. In my case, I could exclude anything that isn't English. Another cheap trick that might work for me is to detect UTF-8 in the message body and block the message if it's seen.
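To make that second idea concrete, here is a rough Python sketch of the kind of check I mean. It is not SpamAssassin code: the function names `non_ascii_ratio` and `looks_non_english` and the 0.3 threshold are placeholders I made up, and it treats "lots of non-ASCII characters" as a crude stand-in for "not English" rather than doing real language detection.

```python
from email import message_from_bytes, policy


def non_ascii_ratio(raw_message: bytes) -> float:
    """Fraction of non-whitespace body characters that are outside ASCII."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    # Prefer the plain-text part, fall back to HTML if that is all there is.
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return 0.0
    text = part.get_content()
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(ord(ch) > 127 for ch in chars) / len(chars)


def looks_non_english(raw_message: bytes, threshold: float = 0.3) -> bool:
    """Hypothetical helper: flag messages whose body is mostly non-ASCII."""
    # The threshold is an assumed value and would need tuning on real mail.
    return non_ascii_ratio(raw_message) > threshold
```

This would wrongly flag English mail that is heavy on emoji or typographic quotes, and it would miss spam in languages written mostly in ASCII, so at best it is a stopgap.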
I don't know whether this has been discussed before, but would it be sane to add language detection to SpamAssassin and have it affect the spam score? Personally, I'd be happy with a header line indicating the primary detected language. I've done a quick search, and the only approach I've found would be to pump all my message bodies through Google, which I'm not keen on doing. Has anyone implemented a solution like this? Is this something that might be a useful addition to SpamAssassin?

Thanks,
Todd
