> I need to write (PHP) code to detect the language of a given block of
> text.

Your proposed approach is very simplistic and probably won't be extensible,
if it works at all. The usual approach is statistical: compare the frequencies
of short groups of characters (n-grams) in the text against per-language
profiles.
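
To make the statistical idea concrete, here is a minimal sketch, in Python
for brevity (the logic should port to PHP without much trouble). It ranks
character trigrams by frequency and compares rankings with a rank-distance
("out-of-place") measure in the style of Cavnar and Trenkle. All function
names are hypothetical, and the one-sentence training samples are toy data I
made up for illustration; a usable detector needs far larger corpora per
language.

```python
from collections import Counter

def ngram_profile(text, n=3, top=300):
    """Return up to `top` character n-grams, most frequent first."""
    # Normalise case and whitespace, and pad so word edges form n-grams too.
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    return [gram for gram, _ in counts.most_common(top)]

def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences between two profiles (lower is closer)."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)  # charged for n-grams the language lacks
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )

def detect_language(text, profiles):
    """Return the key of the language profile closest to `text`."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))

# Toy training data (an assumption for illustration only).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and then the dog "
          "chases the fox through the woods",
    "de": "der schnelle braune fuchs springt über den faulen hund und dann "
          "jagt der hund den fuchs durch den wald",
    "fr": "le renard brun rapide saute par-dessus le chien paresseux et puis "
          "le chien poursuit le renard dans la forêt",
}
PROFILES = {lang: ngram_profile(text) for lang, text in SAMPLES.items()}
```

Calling detect_language(some_text, PROFILES) returns one of the keys "en",
"de" or "fr". With profiles this small the result on unseen text is not
reliable, which is a good reason to prefer an existing library over rolling
your own.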

In any case, ICU has this. See 

http://icu-project.org/userguide/charsetDetection.html

It has both charset and language detection. This is also available via Win32
and .NET APIs in case that helps at all.

If you roll your own, you might want to be aware that there are a lot of
patents in this area.

=Ed



-- 
PHP Unicode & I18N Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php