On 05.05.2012 20:24, James Turner wrote:
>>> Looking at imatt.xml, this line 65 is inside a comment<!-- and
>>> around column 31 is a French char, e acute (0xe8) but unsure why
>>> the parser should be bothered by such a hi-bit char in a comment
>>> field...
> I updated the XML parser to fix some crashes related to (valid) XML
> BOM markers - and I've other crashes previously due to unrecognised
> encodings. Depending on the declared encoding of this file (UTF-8?),
> it's possible this really needs an update, but I haven't investigated
> yet.

That's indeed the case. And the parser is right to complain:

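immat.xml does not specify any character encoding, so the default
applies, which is UTF-8. However, 0xe8 followed by 't' is not a valid
UTF-8 sequence: 0xe8 is the lead byte of a three-byte sequence and must
be followed by continuation bytes in the range 0x80-0xbf, which 't'
(0x74) is not - so parsing fails.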
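
To see the failure outside of the parser, here is a minimal Python
sketch. The two-byte sample and the helper name is_valid_utf8 are made
up for illustration, they are not taken from the file or from the
parser:

```python
# Hypothetical sample: the byte 0xe8 followed by ASCII "t", as reported.
data = b"\xe8t"

def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(data))    # False: 't' is not a continuation byte
print(is_valid_utf8(b"et"))   # True: plain ASCII is valid UTF-8
```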

In Latin1 (ISO 8859-1), 0xe8 is actually "e grave" ("e acute" would be
0xe9). Either way, a bare 0xe8 byte means the file is Latin1-encoded,
so switching the parser to this encoding fixes the issue:

<?xml version="1.0" encoding="ISO-8859-1" ?>

Alternatively, the accented character needs to be removed - or be
encoded as a valid UTF-8 sequence (which would be "0xc3 0xa8" ;-) ).
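
The byte values can be double-checked with a few lines of Python. This
is only a quick sketch, assuming the offending byte really is a lone
0xe8 as reported:

```python
import unicodedata

raw = b"\xe8"                      # the byte found in the comment
char = raw.decode("iso-8859-1")    # Latin1 maps each byte to one code point
utf8_bytes = char.encode("utf-8")  # the same character as UTF-8

print(unicodedata.name(char))      # LATIN SMALL LETTER E WITH GRAVE
print(utf8_bytes.hex(" "))         # c3 a8
```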

We probably should double check existing XML files in fgdata. Maybe we 
can run a script which reads all fgdata XML files - so we can tell which 
files are affected and need to be fixed...
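
Such a check could look roughly like the following Python sketch. It
is not an existing fgdata tool, and the names (DECL_RE,
declared_encoding, check_file, scan) are made up here. It only
verifies that each file decodes in its declared encoding (UTF-8 if
none is declared); it does not check well-formedness and does not
detect UTF-16 files that lack a declaration:

```python
import re
from pathlib import Path

# Matches an encoding="..." attribute in a leading XML declaration.
DECL_RE = re.compile(
    rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)
UTF8_BOM = b"\xef\xbb\xbf"

def declared_encoding(raw: bytes) -> str:
    """Return the declared encoding, or the XML default 'utf-8'."""
    if raw.startswith(UTF8_BOM):
        raw = raw[len(UTF8_BOM):]
    match = DECL_RE.match(raw[:200])
    return match.group(1).decode("ascii") if match else "utf-8"

def check_file(path: Path):
    """Return None if the file decodes cleanly, else a message."""
    raw = path.read_bytes()
    encoding = declared_encoding(raw)
    try:
        raw.decode(encoding)
    except (UnicodeDecodeError, LookupError) as exc:
        return f"{path}: {encoding}: {exc}"
    return None

def scan(root: Path):
    """Check every *.xml file below root; return the messages."""
    problems = []
    for path in sorted(root.rglob("*.xml")):
        message = check_file(path)
        if message is not None:
            problems.append(message)
    return problems
```

Calling scan(Path("fgdata")) (the path is just a placeholder) and
printing the returned lines would list the files that need either an
encoding declaration or a conversion to UTF-8.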

cheers,
Thorsten

_______________________________________________
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel
