> As of the current versions, there's no way to get a list of all the
> supported encodings, though I don't know why you want to know such.

I need to check whether the encodings I am interested in (BIG-5, gb2312) are
supported. According to the documentation, those encodings have been available
since PHP 4.3.0 but are not always enabled. So, for PHP versions 4.3.0-4.3.3, I
need a way to determine their availability.
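
One way such a check could be approximated, absent a function that lists the supported encodings, is to attempt a trivial conversion and see whether it fails. The following is only a minimal sketch: the helper name encoding_is_supported() is hypothetical, and it assumes the iconv extension is the one doing the conversion (as in the snippet further down), not mbstring.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether iconv can convert
// from the given encoding to UTF-8 on this installation.
function encoding_is_supported($encoding)
{
    // The iconv extension itself may not be compiled in.
    if (!function_exists('iconv')) {
        return false;
    }
    // iconv() returns false for an unknown character set; the @ operator
    // suppresses the notice/warning that accompanies the failure.
    return @iconv($encoding, 'UTF-8', 'a') !== false;
}

// Probe the two encodings mentioned above.
foreach (array('BIG-5', 'GB2312') as $enc) {
    echo $enc, ': ', encoding_is_supported($enc) ? 'available' : 'missing', "\n";
}
```

The probe only shows that the conversion can be opened; it says nothing about how complete the underlying conversion tables are.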

> What do you want to do exactly with this idea? I've never been in a 
> situation like that

This is an interesting situation: I am trying to make a search system capable
of supporting multibyte languages. Current (non-multibyte) systems work by
breaking the text into individual words to be indexed. Unfortunately, this
won't work for multibyte languages, where there is rarely a space between
'words'. The solution I am tinkering with involves indexing the text by
'characters', but to do that I need a good (fast & reliable) method of
breaking a text into individual multibyte characters.
So far my solution has been to do this:

preg_match_all('!(\W)!u', iconv("BIG-5", "UTF-8", $str), $words);
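
Note that \W only matches non-word characters, so ASCII letters and digits in the text are skipped by the pattern above. As an alternative sketch (not the approach used above), the converted string can be split at every character boundary with preg_split() and an empty pattern in UTF-8 mode. The helper name split_multibyte_chars() is hypothetical, and the code assumes iconv knows the source encoding.

```php
<?php
// Hypothetical helper: splits a string into an array of single characters.
// Returns false if the conversion to UTF-8 or the split fails.
function split_multibyte_chars($str, $fromEncoding = 'UTF-8')
{
    if ($fromEncoding !== 'UTF-8') {
        // Convert to UTF-8 first; @ hides the warning on an unknown encoding.
        $str = @iconv($fromEncoding, 'UTF-8', $str);
        if ($str === false) {
            return false;
        }
    }
    // With the u modifier, an empty pattern matches at every character
    // boundary; PREG_SPLIT_NO_EMPTY drops the empty leading/trailing pieces.
    return preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
}

// Two Chinese characters (given as UTF-8 bytes), a space, and two ASCII letters.
$chars = split_multibyte_chars("\xE4\xB8\xAD\xE6\x96\x87 ab");
print_r($chars);
```

Unlike the \W pattern, this keeps every character, including spaces and ASCII letters, so any filtering before indexing would have to happen afterwards.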

Ilia

P.S. Please CC me on your replies, I am not subscribed to the list.

-- 
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
