On Thursday, 29 August 2013 at 17:38:43 UTC, Jonathan M Davis wrote:

Well, as I said, I couldn't remember exactly what the XML standard said about encodings, but if it can contain non-ASCII characters, then my first inclination is to say that it has to be UTF-8, UTF-16, or UTF-32, since that's what we support in the language and in Phobos (as I understand it, std.encoding is a bit of a joke that needs to be rethought and replaced, but regardless, it's the only Phobos module supporting any non-Unicode encodings).

However, because all of the XML special symbols should be ASCII, you should still be able to avoid decoding characters for the most part. It's only when you have to actually look at the content that Unicode would potentially matter. So, the performance hit of decoding Unicode characters can mostly be avoided.

- Jonathan M Davis
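
To illustrate the point about not decoding: in UTF-8 every byte of a multi-byte sequence is 0x80 or higher, so a byte equal to '<' can only be a real '<'. Below is a minimal sketch, assuming UTF-8 input, that scans for markup on raw code units. countMarkupStarts is a hypothetical helper made up for this example, not anything in Phobos.

```d
import std.string : representation;

// Hypothetical helper: counts '<' code units without decoding
// any UTF-8 sequence. Assumes the input is UTF-8.
size_t countMarkupStarts(string xml)
{
    size_t n = 0;
    // representation() views the string as immutable(ubyte)[],
    // so the loop walks raw bytes and never decodes to dchar.
    foreach (ubyte b; xml.representation)
    {
        // Bytes inside a multi-byte UTF-8 sequence are >= 0x80,
        // so they can never compare equal to '<' (0x3C).
        if (b == '<')
            ++n;
    }
    return n;
}

void main()
{
    // 'é' is two bytes in UTF-8; neither is mistaken for markup.
    assert(countMarkupStarts("<a>héllo</a>") == 2);
    assert(countMarkupStarts("no markup here") == 0);
}
```

Decoding would then only be needed for the text between tags, and only if the caller actually inspects it.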

You just specify the encoding in the XML declaration at the top of the document, using one of, for example:

<?xml version="1.0" encoding="us-ascii"?>
<?xml version="1.0" encoding="windows-1252"?>
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="UTF-16"?>

UTF-8 is the default when neither a BOM nor an encoding declaration says otherwise.
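
As a rough sketch of that lookup order (BOM first, then the declaration, then the UTF-8 default), something like the following could work. guessXmlEncoding and findBytes are hypothetical names invented for this example. The sketch assumes the declaration is written in an ASCII-compatible encoding and uses `encoding=` with no whitespace around the `=`, which is stricter than what XML itself allows.

```d
import std.string : representation;

// Hypothetical helper: index of the first occurrence of needle in hay,
// or -1. Plain byte comparison, so no UTF decoding is involved.
ptrdiff_t findBytes(const(ubyte)[] hay, string needle)
{
    auto n = needle.representation;
    if (n.length == 0 || hay.length < n.length)
        return -1;
    foreach (i; 0 .. hay.length - n.length + 1)
        if (hay[i .. i + n.length] == n)
            return cast(ptrdiff_t) i;
    return -1;
}

// Hypothetical helper: guesses the encoding name from the first bytes.
string guessXmlEncoding(const(ubyte)[] data)
{
    static immutable ubyte[] utf32le = [0xFF, 0xFE, 0x00, 0x00];
    static immutable ubyte[] utf32be = [0x00, 0x00, 0xFE, 0xFF];
    static immutable ubyte[] utf8    = [0xEF, 0xBB, 0xBF];
    static immutable ubyte[] utf16le = [0xFF, 0xFE];
    static immutable ubyte[] utf16be = [0xFE, 0xFF];

    bool has(const(ubyte)[] bom)
    {
        return data.length >= bom.length && data[0 .. bom.length] == bom;
    }

    // 1. A BOM wins. UTF-32LE is tested before UTF-16LE because
    //    its BOM starts with the same two bytes.
    if (has(utf32le)) return "UTF-32LE";
    if (has(utf32be)) return "UTF-32BE";
    if (has(utf8))    return "UTF-8";
    if (has(utf16le)) return "UTF-16LE";
    if (has(utf16be)) return "UTF-16BE";

    // 2. No BOM: look for encoding="..." inside the XML declaration.
    auto end = findBytes(data, "?>");
    if (data.length >= 5 && data[0 .. 5] == "<?xml".representation && end > 0)
    {
        auto decl = data[0 .. end];
        auto key = findBytes(decl, "encoding=");
        if (key >= 0)
        {
            auto rest = decl[key + "encoding=".length .. $];
            if (rest.length > 1 && (rest[0] == '"' || rest[0] == '\''))
            {
                auto quote = rest[0];
                foreach (i; 1 .. rest.length)
                    if (rest[i] == quote)
                        return cast(string) rest[1 .. i].idup;
            }
        }
    }

    // 3. Neither a BOM nor a declaration: fall back to UTF-8.
    return "UTF-8";
}

void main()
{
    auto latin1 = `<?xml version="1.0" encoding="ISO-8859-1"?><r/>`;
    assert(guessXmlEncoding(latin1.representation) == "ISO-8859-1");
    assert(guessXmlEncoding("<r/>".representation) == "UTF-8");

    ubyte[] bom16 = [0xFF, 0xFE, 0x3C, 0x00];
    assert(guessXmlEncoding(bom16) == "UTF-16LE");
}
```

The returned name is only a label; actually transcoding something like windows-1252 would still need std.encoding or an external library.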
