Quoting Darren Cook <[EMAIL PROTECTED]>:
I need to write (PHP) code to detect the language of a given block of text. (For my purposes I initially want to distinguish between English, Japanese, German, Simplified Mandarin, Traditional Mandarin, Arabic, Korean and French.) I want it to be reliable, so my plan is to keep a list of Unicode code points found only in each given language [1] and use that to return a high-confidence answer. If none are found, I would fall back to a list of high-frequency words for each language [2] and use that to return a lower-confidence answer.
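For illustration, the two-stage plan described above could be sketched roughly as follows. This is only a sketch under assumptions, not a tested detector: the function name detectLanguage(), the language codes, the confidence labels, and the tiny character and word lists are all made up for the example. It uses PCRE Unicode script properties (\p{Hiragana}, \p{Hangul}, ...) rather than hand-maintained code point lists.

```php
<?php
// Hypothetical helper: returns [language code or null, confidence label].
function detectLanguage(string $text): array
{
    // Stage 1: scripts that identify a language on their own.
    // Kana is checked first because Japanese text also contains Han characters.
    $scripts = [
        'ja' => '/[\p{Hiragana}\p{Katakana}]/u',
        'ko' => '/\p{Hangul}/u',
        'ar' => '/\p{Arabic}/u',
    ];
    foreach ($scripts as $lang => $pattern) {
        if (preg_match($pattern, $text) === 1) {
            return [$lang, 'high'];
        }
    }

    // Han characters without kana: compare counts of a few characters whose
    // simplified and traditional forms differ. These ten pairs are an
    // illustrative sample only; kanji-only Japanese would be misclassified.
    if (preg_match('/\p{Han}/u', $text) === 1) {
        $simplified  = preg_match_all('/[们这国说时会对个为来]/u', $text);
        $traditional = preg_match_all('/[們這國說時會對個為來]/u', $text);
        if ($simplified > $traditional) {
            return ['zh-Hans', 'high'];
        }
        if ($traditional > $simplified) {
            return ['zh-Hant', 'high'];
        }
        return ['zh', 'low'];
    }

    // Stage 2: high-frequency words for languages sharing the Latin script.
    // The word lists are short samples, assumed for this example.
    $words = [
        'en' => ['the', 'and', 'of', 'to', 'is', 'that'],
        'de' => ['der', 'die', 'und', 'das', 'ist', 'nicht'],
        'fr' => ['le', 'la', 'et', 'les', 'des', 'est'],
    ];
    // strtolower() is enough here because every listed word is plain ASCII.
    $tokens = preg_split('/[^\p{L}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $best = null;
    $bestScore = 0;
    foreach ($words as $lang => $list) {
        // Counts every token occurrence that appears in the word list.
        $score = count(array_intersect($tokens, $list));
        if ($score > $bestScore) {
            $best = $lang;
            $bestScore = $score;
        }
    }
    return $best === null ? [null, 'none'] : [$best, 'low'];
}
```

Calling detectLanguage() on a short text returns a pair such as a language code plus 'high' or 'low'. With real input the lists would need to be far larger, and short or mixed-language texts will still be ambiguous.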
http://pear.php.net/package/Text_LanguageDetect

Jan.

--
Do you need professional PHP or Horde consulting?
http://horde.org/consulting/

--
PHP Unicode & I18N Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php