Once loaded into a .NET string, the text is Unicode (UTF-16), so the encoding stated in the XML declaration (UTF-8) is then wrong. If you pass a stream to the XmlDocument instead (directly, or via XmlTextReader(stream)), the Encoding associated with the stream (which is sniffed from the BOM) is passed along as well. If they match, much happiness.
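To make the difference concrete, here is a minimal sketch of both routes. The names BomDemo, SampleBuffer and LoadFromBytes, and the <Root /> element, are made up for illustration (the real buffer is cut off in the original post), and it uses XDocument.Load(Stream), which assumes .NET 4.0 or later.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class BomDemo
{
    // Illustrative data only: a UTF-8 BOM (EF BB BF) followed by a tiny document.
    public static byte[] SampleBuffer()
    {
        byte[] body = Encoding.UTF8.GetBytes(
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n<Root />");
        return Encoding.UTF8.GetPreamble().Concat(body).ToArray();
    }

    // Hypothetical helper: hand the raw bytes to the XML reader as a stream,
    // so the reader detects the encoding and consumes the BOM itself.
    public static XDocument LoadFromBytes(byte[] buffer)
    {
        using (var stream = new MemoryStream(buffer))
        {
            return XDocument.Load(stream);
        }
    }

    public static void Main()
    {
        byte[] buffer = SampleBuffer();

        // String route: GetString decodes the BOM into U+FEFF and leaves it
        // at index 0, ahead of the XML declaration.
        string xml = Encoding.UTF8.GetString(buffer);
        Console.WriteLine("Starts with U+FEFF: " + (xml[0] == '\uFEFF'));

        try
        {
            XDocument.Parse(xml);
        }
        catch (XmlException ex)
        {
            Console.WriteLine("Parse failed: " + ex.Message);
        }

        // Stream route: no intermediate string, so no stray U+FEFF.
        XDocument doc = LoadFromBytes(buffer);
        Console.WriteLine("Root element: " + doc.Root.Name);
    }
}
```

If a string really is unavoidable, trimming a leading '\uFEFF' before calling Parse should also get past the error, but the stream route avoids the problem altogether.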
A related point (and what Jano said): when the XML is passed via a Stream, surely the BOM is consumed to detect the encoding and, as a result, is not passed on to the reader? Either way, the moral is to avoid loading XML from System.Strings.

From: Greg Keogh
Sent: Friday, January 31, 2014 5:09 AM
To: ozDotNet

Folks, I can't convert a byte buffer to an XML document. The input buffer contains a serialized string with a UTF-8 BOM, which starts like this:

EFBBBF3C3F786D6C2076657273696F6E3D22312E302220656E636F64696E673D227574662D38223F3E0D0A3C5573 [cut]
. . . < ? x m l v e r s i o n = " 1 . 0 " e n c o d i n g = " u t f - 8 " ? > . . < U s

So I do this:

    string xml = Encoding.UTF8.GetString(buffer);
    var doc = XDocument.Parse(xml);

However, the Parse dies with "System.Xml.XmlException : Data at the root level is invalid. Line 1, position 1."

What am I missing here?

Greg K
