The default encoding for the PHP manual is ISO-8859-1. Some other languages get special encodings (you can find these in configure.in, around line 655). This causes problems for some common languages. If you take a look at the French translation of the echo function [ http://php.net/manual/fr/function.echo.php ], you will see that the character entities (like &eacute;) in the code example aren't replaced by the correct character.
This is because in the XML source file, code examples appear inside <![CDATA[ ]]> sections, and entities aren't expanded there. If we could put the correct character there instead of the entity (and in the rest of the manual), these problems wouldn't occur.
I have noticed that using the literal characters gives problems if the compilation happens in the default ISO-8859-1 encoding, but not with UTF-8.
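To illustrate the CDATA behaviour described above, here is a minimal sketch in Python (not part of the manual's build system). The sample document, its element names, and the extract() helper are made up for the illustration; only the standard xml.etree.ElementTree module is used. It declares an eacute entity and uses it once in normal text and once inside a CDATA section.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the <para>, <text> and <code> element
# names are invented for this illustration, not taken from the manual.
SAMPLE = """<!DOCTYPE para [
  <!ENTITY eacute "&#233;">
]>
<para>
  <text>caf&eacute;</text>
  <code><![CDATA[echo "caf&eacute;";]]></code>
</para>"""


def extract(xml_source):
    """Return the parsed contents of <text> and <code> as a tuple."""
    root = ET.fromstring(xml_source)
    return root.findtext("text"), root.findtext("code")


if __name__ == "__main__":
    prose, code = extract(SAMPLE)
    # ascii() keeps the output printable on any terminal encoding.
    print(ascii(prose))  # entity expanded by the parser
    print(ascii(code))   # entity left as literal text inside CDATA
```

In the normal text the parser replaces the entity with the character, while inside the CDATA section the text &eacute; comes through untouched, which matches what shows up in the French echo example.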
What would be the implications of changing the default encoding from ISO-8859-1 to UTF-8? What would break? What changes in the source files would be needed?
I have searched the list archives for an earlier discussion of this subject, but couldn't find anything.
I don't see why there is a need to change the default encoding. Every translation is free to use its own encoding. We on the Hungarian team use the ISO-8859-2 encoding, which lets us write characters in their "native" single-character representation instead of using entities. There is no need to change the default encoding; multiple XML files with multiple encodings live quite well side by side.
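As a rough illustration of why a per-translation encoding matters, the following Python sketch checks whether two Hungarian letters can be stored in a given encoding. The encodable() helper and the sample string are hypothetical; the codec names are Python's standard ones.

```python
def encodable(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Two Hungarian letters (o and u with double acute), written as escapes
# so this file itself stays plain ASCII.
HUNGARIAN = "\u0151\u0171"

if __name__ == "__main__":
    for name in ("iso-8859-1", "iso-8859-2", "utf-8"):
        print(name, encodable(HUNGARIAN, name))
```

These letters are missing from ISO-8859-1, take one byte each in ISO-8859-2, and take two bytes each in UTF-8.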
There are also many more tools that work with ISO-8859-1 than with UTF-8, and the English docs should be the "most editable" ones (no need to get more tools to work with them)...
Goba
