>
Linedata Services (UK) Ltd
Registered Office: Bishopsgate Court, 4-12 Norton Folgate, London, E1 6DB
Registered in England and Wales No 3027851
VAT Reg No 778499447

-----Original Message-----
> From: John Lilley [mailto:[email protected]]
> Sent: 14 August 2009 12:42
> To: [email protected]
> Subject: RE: Invalid byte 1 (£) of a 1-byte sequence
>
> I will also quite likely say something stupid, but here goes :)
>
> I suggest that there are two possibilities:
>
> 1) Xerces on AIX is ignoring your request to use UTF-8, and
> is instead using the default 8859-1
> 2) Xerces, or the underlying transcoder it uses, is
> translating UTF-8, but is too lenient when it encounters the
> invalid escape sequence, and makes some ad-hoc (or buggy)
> attempt to convert the code anyway.
>
> I would suggest this experiment: feed the parser a document
> containing the valid sequence (C2 A3) and see if it is parsed
> correctly. If so, then the answer is most likely (2) else
> (1). Armed with that information you can seek the
> appropriate corrective action.

It won't be easy, but I'll give it a go and report back.

In any case, that would be a bug in Xerces, wouldn't it? So the "appropriate corrective action" would be to change my application to work around this bug, wouldn't it?

Giulio
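
For reference, the experiment John describes could be sketched roughly as below. This is only an illustration of the test documents, not a statement about how Xerces behaves: it uses Python's expat-based parser as a stand-in, and the helper name `parse_amount` and the element name `amount` are made up for the example. The same three documents would need to be fed to Xerces on AIX to learn anything about the actual problem.

```python
import xml.etree.ElementTree as ET


def parse_amount(payload, encoding="UTF-8"):
    """Parse a one-element document whose text is the raw bytes in `payload`.

    Returns the decoded element text, or None if the parser rejects the
    document as not well-formed. (Hypothetical helper for this sketch.)
    """
    doc = (
        b'<?xml version="1.0" encoding="'
        + encoding.encode("ascii")
        + b'"?><amount>'
        + payload
        + b"</amount>"
    )
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


# The pound sign is U+00A3: C2 A3 in UTF-8, the single byte A3 in ISO-8859-1.
cases = [
    ("valid UTF-8 (C2 A3), declared UTF-8", b"\xc2\xa3", "UTF-8"),
    ("lone A3, declared UTF-8", b"\xa3", "UTF-8"),
    ("lone A3, declared ISO-8859-1", b"\xa3", "ISO-8859-1"),
    ("C2 A3 read as ISO-8859-1", b"\xc2\xa3", "ISO-8859-1"),
]

for label, payload, encoding in cases:
    result = parse_amount(payload, encoding)
    # ascii() keeps the output readable on terminals without UTF-8 support.
    print(label, "->", "rejected" if result is None else ascii(result))
```

Reading the results against the two possibilities: if the parser under test returns a single U+00A3 for the C2 A3 document, it is decoding UTF-8, which points to (2). If it returns two characters (U+00C2 followed by U+00A3), it is treating the input as ISO-8859-1, which points to (1).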
