On Thursday, 29 August 2013 at 17:38:43 UTC, Jonathan M Davis wrote:
Well, as I said, I couldn't remember exactly what the XML standard said about encodings, but if it can contain non-ASCII characters, then my first inclination is to say that it has to be UTF-8, UTF-16, or UTF-32 based on the fact that that's what we support in the language and in Phobos (as I understand it, std.encodings is a bit of a joke that needs to be rethought and replaced, but regardless, it's the only Phobos module supporting any non-Unicode encodings).
However, because all of the XML special symbols should be ASCII, you should still be able to avoid decoding characters for the most part. It's only when you have to actually look at the content that Unicode would potentially matter. So, the performance hit of decoding Unicode characters should mostly be able to be avoided.
- Jonathan M Davis
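To illustrate the point about skipping decoding, here is a rough sketch that assumes the input is already UTF-8. It walks the raw code units looking for '<' and never decodes a code point. The name countMarkupStarts is made up for this example and is not a Phobos function.

```d
import std.string : representation;

/// Counts '<' code units in UTF-8 text without decoding any code points.
/// This works because every byte of a multi-byte UTF-8 sequence is >= 0x80,
/// so it can never be mistaken for an ASCII character such as '<'.
/// (Hypothetical helper for illustration; it does not skip comments or CDATA.)
size_t countMarkupStarts(string xml)
{
    size_t count = 0;
    foreach (ubyte b; xml.representation) // iterate code units, not dchars
    {
        if (b == '<')
            ++count;
    }
    return count;
}

unittest
{
    // The non-ASCII content is stepped over byte by byte, never decoded.
    assert(countMarkupStarts("<a>é</a>") == 2);
    assert(countMarkupStarts("no markup here") == 0);
}
```

A real parser would only need to decode when it validates or hands back names and text content.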
You just specify the encoding in the XML declaration at the very top of the document, before the root element:
<?xml version="1.0" encoding="us-ascii"?>
<?xml version="1.0" encoding="windows-1252"?>
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="UTF-16"?>
UTF-8 is the default when there is no encoding declaration and no BOM saying otherwise.
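As a minimal sketch of that fallback, assuming only the UTF-8 and UTF-16 BOMs matter: the function below looks at the first bytes and returns UTF-8 when no BOM is found. guessEncodingFromBom is a hypothetical name, it ignores UTF-32, and it does not read the encoding="..." attribute, which a real parser would check next.

```d
/// Guesses the encoding of an XML document from its byte order mark.
/// Hypothetical helper: handles only UTF-8 and UTF-16 BOMs and falls back
/// to UTF-8, leaving the encoding declaration for the caller to inspect.
string guessEncodingFromBom(const(ubyte)[] data)
{
    if (data.length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8";
    if (data.length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16BE";
    if (data.length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16LE";
    return "UTF-8"; // no BOM: assume the default
}

unittest
{
    ubyte[] bigEndian = [0xFE, 0xFF, 0x00, 0x3C];
    ubyte[] littleEndian = [0xFF, 0xFE, 0x3C, 0x00];
    ubyte[] noBom = [0x3C, 0x3F, 0x78, 0x6D, 0x6C]; // "<?xml"

    assert(guessEncodingFromBom(bigEndian) == "UTF-16BE");
    assert(guessEncodingFromBom(littleEndian) == "UTF-16LE");
    assert(guessEncodingFromBom(noBom) == "UTF-8");
}
```

For the single-byte encodings in the list above (us-ascii, windows-1252, ISO-8859-1) there is no BOM, so the declaration itself is the only signal.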