Heddon's Gate Hotel wrote:
Thanks Jan, it's much clearer now. My knowledge about character encodings has multiplied 100-fold in the last 24 hours' research.

Would it be a good idea for the PHP Manual to address some of these issues, by explaining good practice in encoding arbitrary user input in forms (for example), for the benefit of those, like me, for whom character sets are a bit of a black art?

Also I still cannot persuade get_html_translation_table to list those non-Latin1 entities. This is not an important issue, since it appears to be only an information function, but it would be nice if it were consistent with htmlentities and html_entity_decode.

This is probably one of the reasons some of us think that getting a stable, Unicode-based PHP6 out of the door would be a lot more use to people than PHP5.3 ;)
Eliminate character sets and the black art goes away?

