I know what the problem is and will have it fixed shortly. Thanks for the report.
> Sent: Monday, October 09, 2017 at 9:03 AM
> From: "Peter Kenny" <pe...@pbkresearch.co.uk>
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] Problem with input to XML Parser - 'Invalid UTF8 encoding'
>
> Correction - I am misrepresenting Sven. What he said was that Zinc would not
> look inside the HTML <head> node to find out about coding. It would of
> course use information in the HTTP headers, if any.
>
> Peter Kenny wrote
> > Henry
> >
> > Thanks for the explanations. It's a bit clearer now. I'm still not sure
> > about how ZnUrl>>retrieveContents manages to decode correctly in this case;
> > I'm sure I recall Sven saying it didn't (and in his view shouldn't) look at
> > the HTTP declarations in the header. There is also the mystery of how the
> > string reader in the XML-Parser package (XMLURI>>get) does the same trick,
> > when it is presumably what XMLHTMLParser>>parseURL: uses and fails.
> >
> > However, all these are second order problems. It all begins because the
> > Corriere web site does strange things with encoding, including using a UTF8
> > character in a page coded with 8859-1, as Paul pointed out. In any case,
> > reading the page as a string and then parsing it solves my problem, so I
> > shall stick to that as a standard procedure. Most importantly, I don't think
> > there is any indication of a problem in the XML package for Monty to worry
> > about.
> >
> > Thanks again
> >
> > Peter
> >
> > --
> > Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html