Hi all,

I'm stuck on this problem: I am trying to convert text containing arbitrary Unicode characters to its octet (numeric) and named entity equivalents.

For example:
Ë is &#203; as octet and &Euml; as entity
Đ is &#272; as octet and &Dstrok; as entity

My code works fine for some characters (Ë works fine, but Đ fails at entity encoding). Do you have a hint on how to solve this?

I get the octet encoding with mb_encode_numericentity, which works fine for everything:

$convmap = array(
0x22, 0x22, 0, 0xffff, # "
0x26, 0x27, 0, 0xffff, # &'
0x3c, 0x3c, 0, 0xffff, # <
0x3e, 0x3e, 0, 0xffff, # >
0x80, 0xffff, 0, 0xffff,
);
$oct_string = mb_encode_numericentity($test, $convmap, 'UTF-8');

I try to get all entity encodings with htmlentities:

$entity_string = htmlentities($test, ENT_QUOTES, 'UTF-8');

But that fails for some characters like Đ. Is there a better way to get all entity encodings?
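
For reference, here is a minimal self-contained sketch of the two calls, plus one variation that might be worth trying. The sample string in $test is an assumption (the real input is not shown above), and the last call assumes PHP 5.4 or later, where htmlentities accepts the ENT_HTML5 flag. The HTML 4.01 entity table that htmlentities uses by default has no named entity for Đ, which would explain the failure; whether the HTML5 table covers every character I need is not confirmed.

```php
<?php
// Hypothetical sample input; the real $test is not shown in this post.
$test = "Ë Đ";

// Conversion map: each group of four is [start, end, offset, mask].
$convmap = array(
    0x22, 0x22, 0, 0xffff,   // "
    0x26, 0x27, 0, 0xffff,   // & and '
    0x3c, 0x3c, 0, 0xffff,   // <
    0x3e, 0x3e, 0, 0xffff,   // >
    0x80, 0xffff, 0, 0xffff, // everything above ASCII in the BMP
);

// Numeric ("octet") form: every mapped code point becomes &#NNN;
$numeric = mb_encode_numericentity($test, $convmap, 'UTF-8');

// Named entities with the default HTML 4.01 table: Đ is left untouched.
$html401 = htmlentities($test, ENT_QUOTES, 'UTF-8');

// Assumption: PHP >= 5.4, where ENT_HTML5 selects the larger HTML5 table.
$html5 = htmlentities($test, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $numeric, "\n", $html401, "\n", $html5, "\n";
```

Characters that still have no named entity in the chosen table stay as raw UTF-8 in the htmlentities output, so the numeric form may remain the only complete encoding.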

Thanks,
Sebastian

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
