On 05.05.2012 20:24, James Turner wrote:
>>> Looking at imatt.xml, this line 65 is inside a comment <!-- and
>>> around column 31 is a French char, e acute (0xe8) but unsure why
>>> the parser should be bothered by such a hi-bit char in a comment
>>> field...
> I updated the XML parser to fix some crashes related to (valid) XML
> BOM markers - and I've other crashes previously due to unrecognised
> encodings. Depending on the declared encoding of this file (UTF-8?),
> it's possible this really needs an update, but I haven't investigated
> yet.

That's indeed the case. And the parser is right to complain: immat.xml does not specify any character encoding, so the default applies, which is UTF-8. However, 0xe8 followed by 't' is not a valid UTF-8 sequence: 0xe8 starts a three-byte sequence and must be followed by two continuation bytes in the range 0x80-0xbf, and 't' (0x74) is not one of them. So parsing fails.

A single 0xe8 byte for an accented "e" means the Latin1 character set is used (ISO 8859-1, where 0xe8 is strictly "e grave"), so switching the parser to this encoding fixes the issue:

    <?xml version="1.0" encoding="ISO-8859-1" ?>

Alternatively, the accented "e" needs to be removed - or be encoded as a valid UTF-8 character (which would be "0xc3 0xa8" ;-) ).

We probably should double check the existing XML files in fgdata. Maybe we can run a script which reads all fgdata XML files - so we can tell which files are affected and need to be fixed...

cheers,
Thorsten

_______________________________________________
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel
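
A minimal sketch of the kind of check script suggested above, assuming Python 3 is available. The function names (declared_encoding, find_invalid_utf8, scan_tree) are made up for illustration and are not part of FlightGear or SimGear. It only looks at files that are UTF-8 by declaration or by default, and reports the first byte that does not decode; files declaring another encoding are skipped rather than validated.

```python
import re
from pathlib import Path

# Matches an XML declaration at the very start of a file and captures
# the value of its encoding="..." attribute, if there is one.
ENCODING_DECL = re.compile(rb'^<\?xml[^>]*\bencoding\s*=\s*["\']([^"\']+)["\']')


def declared_encoding(data: bytes) -> str:
    """Return the declared encoding in lower case, or "utf-8" if none is declared."""
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 BOM
        data = data[3:]
    match = ENCODING_DECL.match(data)
    if match is None:
        return "utf-8"  # XML default when nothing is declared
    return match.group(1).decode("ascii", "replace").lower()


def find_invalid_utf8(data: bytes):
    """Return (line, column, byte) of the first invalid UTF-8 byte, else None.

    Line and column are 1-based; the column counts bytes, not characters.
    Files that declare a non-UTF-8 encoding are not checked and yield None.
    """
    if declared_encoding(data) not in ("utf-8", "utf8"):
        return None
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        line = data.count(b"\n", 0, err.start) + 1
        line_start = data.rfind(b"\n", 0, err.start) + 1
        return line, err.start - line_start + 1, data[err.start]
    return None


def scan_tree(root):
    """Yield (path, line, column, byte) for every affected *.xml file below root."""
    for path in sorted(Path(root).rglob("*.xml")):
        hit = find_invalid_utf8(path.read_bytes())
        if hit is not None:
            yield (path, *hit)
```

To use it, loop over scan_tree() with the path of a local fgdata checkout and print each tuple. Every reported file then needs either an explicit encoding declaration or a conversion of the offending bytes to UTF-8.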