It seems that there is no parameter for the function h() (html_escape()) to indicate the character encoding being used?
PHP's htmlspecialchars() function accepts about a dozen possible encodings, such as UTF-8, Chinese Big5, Chinese GB, Russian, and Japanese.

I think, though, that h() will work for UTF-8, since h() only touches the 4 special characters < > & " and replaces them with &lt; etc. Those 4 characters are all in the 0x00 to 0x7F range, and h() leaves the other bytes intact (unchanged).

A character in UTF-8 can be 1 to 4 bytes long. Any ASCII character is represented as 1 byte, which is its own value in 0x00 to 0x7F. Characters from 0x80 to 0xFF, and all other Unicode characters, are 2 to 4 bytes long, with every one of those bytes in the 0x80 to 0xFF range (see UTF-8 http://en.wikipedia.org/wiki/Utf-8 ). So when h() replaces those 4 ASCII characters, it matches them only where they appear as 1-byte characters, and it passes over every byte of a multi-byte character, because those bytes are in the 0x80 to 0xFF range and can never be matched as one of the 4 special characters. The job of replacing those 4 characters is therefore done with no side effect whatsoever on the non-ASCII characters.
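To check the reasoning above, here is a minimal sketch in Ruby. It is not the actual implementation of h(); naive_escape is a hypothetical helper written only for this illustration. It walks the string byte by byte, with no knowledge of the encoding, and replaces just the 4 ASCII characters discussed. It assumes the input is valid UTF-8.

```ruby
# Hypothetical stand-in for h(), for illustration only: it replaces the
# four ASCII bytes discussed above and copies every other byte unchanged.
ESCAPES = {
  0x26 => "&amp;",   # &
  0x3C => "&lt;",    # <
  0x3E => "&gt;",    # >
  0x22 => "&quot;"   # "
}.freeze

def naive_escape(str)
  out = "".b                       # binary buffer, so raw bytes can be appended
  str.each_byte do |byte|
    replacement = ESCAPES[byte]
    if replacement
      out << replacement           # one of the 4 special ASCII characters
    else
      out << byte                  # any other byte, including 0x80..0xFF
    end
  end
  out.force_encoding(str.encoding) # relabel the bytes with the input's encoding
end

input = "<b>日本語 & café</b>"
escaped = naive_escape(input)

puts escaped                                   # &lt;b&gt;日本語 &amp; café&lt;/b&gt;
puts escaped.valid_encoding?                   # true
puts "日本語é".bytes.all? { |b| b >= 0x80 }    # true
```

The last line shows why the byte-wise approach is safe here: no byte of a multi-byte UTF-8 character falls in the ASCII range. This would not necessarily hold for other encodings, where a trailing byte of a multi-byte character can fall in the ASCII range.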

