Hi,

This is a recurring issue. The problem is that the server serves a resource, in
this case text/html, without specifying its encoding. Today, when no encoding
is specified, we default to UTF-8. In this case the server silently serves a
resource that is ISO-8859-1 encoded.

The error is triggered by accessing the following URL:

ZnClient new get: 'http://squeaksource.com/ical/?C=M;O%3DD'; yourself.

If you inspect the response object inside the HTTP client, you will see that
the Content-Type is plain text/html, with no charset parameter. So Zn decodes
the incoming text as UTF-8, which fails (Zn encoders are strict by default).
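
To see the failure in isolation, here is a minimal sketch you can try in a
Playground. The byte array is made up for illustration (it is not taken from
the squeaksource page), and I am assuming that #decodeBytes: and the
ZnInvalidUTF8 exception behave in your image as they do in current Zinc:

```smalltalk
| bytes |
"Illustrative input: 'café!' as ISO-8859-1 bytes. 233 (16rE9) is é, 33 is !."
bytes := #[99 97 102 233 33].

"Decoding with the ISO-8859-1 encoder, the same one used elsewhere in this message."
ZnCharacterEncoder iso88591 decodeBytes: bytes.

"Decoding the same bytes as UTF-8: 233 announces a multi-byte sequence,
 but 33 is not a continuation byte, so the strict decoder signals an error."
[ ZnUTF8Encoder new decodeBytes: bytes ]
	on: ZnInvalidUTF8
	do: [ :error | error messageText ].
```

The second expression should hit the same kind of error that Gofer and
Monticello report for this repository.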

Here is how to change the default during a call:

ZnDefaultCharacterEncoder
  value: ZnCharacterEncoder iso88591
  during: [ ZnClient new get: 'http://squeaksource.com/ical/?C=M;O%3DD'; yourself ].

The proper solution is for the server to add the correct charset specification.

Consider the default in Pharo:

ZnMimeType textHtml => text/html;charset=utf-8

The server should serve this resource using the following Content-Type:

text/html;charset=iso-8859-1

This is the server's responsibility. The page in question is the MC index page, 
which would normally be dynamically generated. Somewhere the server decides on 
the encoding. That encoding does not have to change, but it should be properly 
indicated in the HTTP response headers.

HTH,

Sven

> On 15 Mar 2017, at 17:42, David T. Lewis <[email protected]> wrote:
> 
> squeaksource.com is still running on a quite old image, and I know that it
> has problems with multibyte characters. If you are seeing problems related
> to this, it's not the fault of Zinc.
> 
> If you can confirm that this is what is happening, then I guess it is time
> to update that trusty old squeaksource.com image :-)
> 
> Dave
> 
>> On Wed, Mar 15, 2017 at 8:19 PM, Patrick R. <[email protected]> wrote:
>>> 
>>> Hi everyone,
>>> 
>>> I have been working on bringing http://squeaksource.com/ical/ up to speed
>>> for Squeak and wanted to make sure that it also works for Pharo. Therefore,
>>> I have created a travis build job for Squeak and Pharo
>>> (https://travis-ci.org/codeZeilen/ical-smalltalk/jobs/211298950) which pulls
>>> the source from squeaksource.com.
>>> 
>>> Now the issue is that loading the package in Pharo fails with a
>>> GoferException wrapping a ZnInvalidUTF8 Exception. We figured that this
>>> might be the result of the squeaksource page delivering the page as
>>> iso-8859-1 as it contains special characters. Any ideas on how to get this
>>> to work? I do not have access to the ical repository description and I would
>>> like to avoid mirroring the whole repository on GitHub.
>> 
>> 
>> In a fresh 60437 image, in Playground evaluating...
>> 
>>  Metacello new
>>       configuration: 'ICal';
>>       repository: 'github://codeZeilen/ical-smalltalk:master/repository';
>>       onConflict: [:ex | ex allow];
>>       load.
>>  ==> Could not resolve: ICal-Core [ICal-Core-PaulDeBruicker.5] in
>> /home/ben/.local/share/Pharo/images/60437-01/pharo-local/package-cache
>> http://squeaksource.com/ical ERROR: 'GoferRepositoryError: Could not access
>> http://squeaksource.com/ical: ZnInvalidUTF8: Illegal continuation byte for
>> utf-8 encoding'
>> 
>> 
>> In a new fresh 60437 Image (i.e. empty package-cache)
>>  World menu > Monticello > +Repository > squeaksource.com...
>>     MCSqueaksourceRepository
>>        location: 'http://squeaksource.com/ical'
>>        user: ''
>>        password: ''
>>   ==> open repository then errors "MCRepositoryError: Could not access
>> http://squeaksource.com/ical: ZnInvalidUTF8: Illegal continuation byte for
>> utf-8 encoding"
>> 
>> 
>> In Chrome, opening http://www.squeaksource.com/ical
>> then clicking <Versions>
>> and the browser's View Page Source,
>> I see...
>>   <?xml version="1.0" encoding="iso-8859-1"?>
>> 
>> Googling: zinc iso-8859-1
>> finds...
>> http://forum.world.st/Problem-using-Zinc-in-Pharo-4-Moose-5-1-td4825329.html
>> but "ZnByteEncoder iso88591"
>> errors with "KeyNotFound: key 'iso88591' not found in Dictionary"
>> and inspecting "ZnByteEncoder byteTextConverters keys sorted"
>> confirms this key is missing (@Sven, I'm curious why this was removed?)
>> 
>> 
>> Now https://en.wikipedia.org/wiki/ISO/IEC_8859-1
>> indicates IBM819 is an alias
>> and " ZnByteEncoder newForEncoding: 'ibm819' "
>> works okay
>> 
>> So in MCHttpRepository>>#loadAllFileNames
>> changing...
>>         queryAt: 'C' put: 'M;O=D' ;
>>         get.
>> to...
>>         queryAt: 'C' put: 'M;O=D' .
>>         ZnDefaultCharacterEncoder
>>              value: (ZnByteEncoder newForEncoding: 'ibm819')
>>              during: [client get].
>> 
>> Then from Monticello opening the previously defined
>> http://squeaksource.com/ical
>> works!!
>> 
>> 
>> Now I was hoping that reverting #loadAllFileNames
>> and in Playground doing...
>>    converters := ZnByteEncoder byteTextConverters.
>>    converters at: 'iso-8859-1' put: (converters at: 'ibm819').
>> might alleviate the problem, but no luck.
>> 
>> 
>> Anyone know a better way to deal with this than hardcoding the encoding
>> into #loadAllFileNames?
>> 
>> cheers -ben
>> 
> 
> 
> 

