Matt Kettler wrote:
Marc Perkel wrote:

I'm wondering if the language detection in TextCat can be improved. Here's the situation. Any chance someone might be interested in a radical redesign? I think language exclusion would be an extremely effective spam deterrent, as email in a language you don't speak is definitely spam.

Doesn't Linux come with spelling dictionaries of words for a lot of languages, hashed somehow for fast spell-checking lookups? Except for very short messages, I would think that if you spell-checked the message in several languages and found that 80% of it was spelled correctly in one of them, then you have a match. You wouldn't have to check every language: just start with some common ones, and if none of them match, go on to less common ones. Would something like this be doable?
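
The dictionary idea above could be sketched roughly as follows. This is only an illustration of the proposal, not how TextCat works. The tiny word lists, the `detect_language` function, and the `min_words` cutoff for "very short messages" are all hypothetical; a real setup would presumably load full spelling word lists for each language instead. The 80% threshold and the "common languages first" ordering come from the message.

```python
import re

# Hypothetical toy word lists standing in for real spelling dictionaries.
WORDLISTS = {
    "en": {"this", "is", "a", "short", "message", "in", "english", "the", "and"},
    "de": {"das", "ist", "eine", "kurze", "nachricht", "auf", "deutsch", "und"},
}

def detect_language(text, wordlists=WORDLISTS, order=("en", "de"),
                    threshold=0.8, min_words=5):
    """Return the first language in `order` whose word list covers at least
    `threshold` of the words in `text`, or None if nothing matches."""
    # Extract runs of letters (Unicode-aware), ignoring digits and punctuation.
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Very short messages carry too little evidence to classify.
    if len(words) < min_words:
        return None
    # Try common languages first and stop at the first match.
    for lang in order:
        known = sum(1 for w in words if w in wordlists[lang])
        if known / len(words) >= threshold:
            return lang
    return None

print(detect_language("This is a short message in English"))
print(detect_language("Das ist eine Nachricht auf Deutsch"))
```

Set membership tests are hash lookups, so the cost per language is roughly one lookup per word. A message that returns None for every language the recipient reads would then be a candidate for the language-exclusion rule.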
- Language detection in TextCat Marc Perkel
- Re: Language detection in TextCat Matt Kettler
- Re: Language detection in TextCat Henrik K
- Re: Language detection in TextCat Marc Perkel
- Re: Language detection in TextCat LuKreme
- Re: Language detection in TextCat Martin Gregorie
- Re: Language detection in TextCat Marc Perkel
- RE: Language detection in TextCat R-Elists
- Re: Language detection in TextCat Matus UHLAR - fantomas
- Re: Language detection in TextCat Matus UHLAR - fantomas
