As of the current versions, there's no way to get a list of all the supported encodings, though I'm not sure why you would need one.
I need to see if the encodings I am interested in (BIG-5, gb2312) are
supported. According to the documentation those encodings are available since
PHP 4.3.0, but not always enabled. So, for PHP versions 4.3.0-4.3.3 I need a
way to determine their availability.
Well, so basically there's no apparent solution for now... But you can
check whether a certain encoding is supported by calling
mb_internal_encoding() or a similar function that takes an encoding name as
its argument. With those functions you just have to check whether the return
value is false.
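The check described above could be sketched roughly as follows. The helper name encoding_is_supported() is made up for illustration and is not part of mbstring. The sketch assumes the mbstring extension is loaded, and it also catches the ValueError that PHP 8 throws for an unknown encoding instead of returning false.

```php
<?php
// Hypothetical helper (not an mbstring function): reports whether mbstring
// accepts $name, by trying to set it as the internal encoding.
function encoding_is_supported($name)
{
    $previous = mb_internal_encoding();   // remember the current setting
    try {
        // Older PHP versions return false (with a warning) for unknown names.
        $ok = @mb_internal_encoding($name);
    } catch (\ValueError $e) {
        // PHP 8 throws instead of returning false.
        $ok = false;
    }
    mb_internal_encoding($previous);      // restore the original setting
    return $ok !== false;
}

// Print the result for the two encodings in question plus a bogus name.
foreach (array('BIG-5', 'gb2312', 'no-such-encoding') as $name) {
    echo $name, ': ', encoding_is_supported($name) ? 'yes' : 'no', "\n";
}
```

Restoring the previous internal encoding keeps the probe from changing the behaviour of the rest of the script.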
What exactly do you want to do with this? I've never been in a situation like that.
This is an interesting situation: I am trying to make a search system capable
of supporting multibyte languages. Current (non-multibyte) systems work by
breaking the text into individual words to be indexed. This unfortunately
won't work for multibyte languages, where there is rarely a space between
'words'. The solution I am tinkering with involves indexing the text by
'characters', but to do that I need a good (fast & reliable) method of
breaking a text into individual multibyte characters.
So far my solution has been to do this:
preg_match_all('!(\W)!u', iconv("BIG-5", "UTF-8", $str), $words);
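For comparison, here is a minimal sketch of another way to split a string that is already in UTF-8 into characters, using preg_split() with an empty pattern and the /u modifier. This is a swapped-in alternative, not the method above: unlike \W it also keeps ASCII letters and digits. The helper name utf8_chars() is hypothetical, and BIG-5 input is assumed to have gone through iconv() first, as in the line above.

```php
<?php
// Hypothetical helper: splits a UTF-8 string into an array of characters.
function utf8_chars($utf8)
{
    // With the /u modifier the empty pattern matches between code points,
    // so each element of the result holds exactly one character.
    $chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false when the input is not valid UTF-8.
    return $chars === false ? array() : $chars;
}

// "\xE4\xB8\xAD\xE6\x96\x87" is the UTF-8 byte sequence of two Chinese
// characters; the sample string therefore has four characters in total.
$chars = utf8_chars("a \xE4\xB8\xAD\xE6\x96\x87");
echo count($chars), "\n";
```

Whether whitespace and punctuation should be dropped before indexing is a separate decision that this sketch leaves open.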
Perhaps you can handle it with mb_split() when it comes to Japanese encodings,
though the mbregex functions cannot deal with Chinese encodings for now. So
I think the solution you proposed is the best possible workaround.
BTW, I suppose separating Chinese text into individual characters won't suffice, because many Chinese words are compounds of two or more characters. (The same thing applies to other multibyte languages.) If you really want to do this properly, you had better look at existing code for what is called a morphological analyser. Things are not that simple at all.
P.S. Please CC me on your replies, I am not subscribed to the list.
Hmm, I think I did... Maybe I'm not used to the Apple mail client yet :)
Moriyoshi
-- PHP Internationalization Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php