Once loaded into a .NET string, it is Unicode (UTF-16), so the encoding stated 
in the XML declaration (UTF-8) is then wrong. By passing a stream to the 
XmlDocument (directly, or via XmlTextReader(stream)), the Encoding associated 
with the stream (which is sniffed from the BOM) is passed as well. If they 
match, much happiness.
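
A minimal sketch of the stream approach, assuming the buffer looks like Greg's 
(a UTF-8 BOM followed by the XML declaration). The class name, the helper 
names and the <Sample /> payload are made up for illustration; 
XDocument.Load(Stream) is used here, but XmlDocument.Load(Stream) should 
behave the same way as far as the BOM goes.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class StreamLoadSketch
{
    // Hypothetical stand-in for the real buffer: UTF-8 BOM + declaration + root element.
    public static byte[] MakeSampleBuffer()
    {
        byte[] bom = Encoding.UTF8.GetPreamble(); // EF BB BF
        byte[] body = Encoding.UTF8.GetBytes(
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n<Sample />");
        byte[] buffer = new byte[bom.Length + body.Length];
        Buffer.BlockCopy(bom, 0, buffer, 0, bom.Length);
        Buffer.BlockCopy(body, 0, buffer, bom.Length, body.Length);
        return buffer;
    }

    // Hand the raw bytes to the XML parser as a stream, so the parser
    // deals with the BOM and the declared encoding itself.
    public static XDocument LoadFromBuffer(byte[] buffer)
    {
        using (var stream = new MemoryStream(buffer))
        {
            return XDocument.Load(stream);
        }
    }

    public static void Main()
    {
        XDocument doc = LoadFromBuffer(MakeSampleBuffer());
        Console.WriteLine(doc.Root.Name);
    }
}
```

The point of going this way is that no intermediate string is ever created, so 
there is nothing to disagree with the declared encoding.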




A related point (and what Jano said): when passing via a Stream, surely the 
BOM is consumed to detect the encoding and, as a result, is not passed to the 
reader?
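
One rough way to check this, assuming a buffer that starts with the UTF-8 BOM: 
decode the same bytes once with Encoding.UTF8.GetString and once through a 
StreamReader with BOM detection switched on, then look at the first character 
of each result. The class and method names below are invented for the sketch.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomSketch
{
    // Decodes the bytes as-is; GetString does not strip a leading BOM,
    // so the result begins with U+FEFF when the buffer has one.
    public static string DecodeWithGetString(byte[] buffer)
    {
        return Encoding.UTF8.GetString(buffer);
    }

    // Decodes through a StreamReader with detectEncodingFromByteOrderMarks = true,
    // which consumes the BOM while working out the encoding.
    public static string DecodeWithReader(byte[] buffer)
    {
        using (var reader = new StreamReader(new MemoryStream(buffer), Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Hypothetical payload: BOM bytes followed by a tiny element.
        byte[] buffer = { 0xEF, 0xBB, 0xBF, (byte)'<', (byte)'a', (byte)'/', (byte)'>' };

        string viaGetString = DecodeWithGetString(buffer);
        string viaReader = DecodeWithReader(buffer);

        Console.WriteLine(viaGetString[0] == '\uFEFF'); // BOM character kept?
        Console.WriteLine(viaReader[0] == '<');         // BOM consumed?
    }
}
```

If the first string does start with U+FEFF, that alone would explain a parse 
failure at line 1, position 1 when the string is handed to XDocument.Parse.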




Either way, the moral is to avoid loading XML from a System.String.





From: Greg Keogh
Sent: Friday, January 31, 2014 5:09 AM
To: ozDotNet






Folks, I can't convert a byte buffer to an XML document. The input buffer 
contains a serialized string with a UTF-8 BOM, which starts like this:




EFBBBF3C3F786D6C2076657273696F6E3D22312E302220656E636F64696E673D227574662D38223F3E0D0A3C5573
 [cut]
. . . < ? x m l   v e r s i o n = " 1 . 0 "   e n c o d i n g = " u t f - 8 " ? > . . < U s





So I do this:




string xml = Encoding.UTF8.GetString(buffer);

var doc = XDocument.Parse(xml);




However the Parse dies with "System.Xml.XmlException : Data at the root level 
is invalid. Line 1, position 1."




What am I missing here?




Greg K
