On Thu, Mar 16, 2017 at 1:25 AM, Sven Van Caekenberghe <[email protected]> wrote:
>
> Hi,
>
> This is a recurring issue.


It would be cool if some magic(TM) could raise a dialog with an
explanation and pull-down list to select an encoding - but maybe that
is too much hand holding.


>
> The problem is that the server serves a resource, in this case text/html, 
> without specifying its encoding.

I just bumped into [1] while browsing around to learn more, but I
don't know fully how to interpret it.
What do you make of it saying "An XHTML5 document is served as XML and
has XML syntax. XML parsers do not recognise the encoding declarations
in meta elements. They only recognise the XML declaration. Here is an
example:"
    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html ....

compared to the page having...
    <?xml version="1.0" encoding="iso-8859-1"?>

cheers -ben

[1]    https://www.w3.org/International/questions/qa-html-encoding-declarations
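
To make the quoted passage concrete, here is a rough Python sketch (not
Zinc code) of honouring an XML declaration: peek at the start of the body
and, if it carries an encoding pseudo-attribute, use that instead of the
default. The name sniff_xml_encoding, the 200-byte window and the UTF-8
fallback are assumptions made for illustration only, and the pattern only
works for ASCII-compatible encodings.

```python
import re

# Matches an XML declaration at the very start of the body and captures
# the value of its encoding pseudo-attribute.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def sniff_xml_encoding(body: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, else `default`.

    Hypothetical helper: only the first 200 bytes are inspected, which is
    an arbitrary window chosen for this sketch.
    """
    match = _XML_DECL.match(body[:200])
    if match is None:
        return default
    return match.group(1).decode("ascii").lower()


# Sample bodies made up for the sketch: the declared encoding wins, and a
# body without a declaration falls back to the default.
declared = b'<?xml version="1.0" encoding="iso-8859-1"?>\n<html>K\xe4se</html>'
declared_encoding = sniff_xml_encoding(declared)
undeclared_encoding = sniff_xml_encoding(b"<html><body>plain</body></html>")
```

Whether an HTTP client should do this at all for a text/html response is a
separate question; the sketch only shows that the declaration is cheap to
read once the bytes are in hand.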


>
> Today, when no encoding is specified, we default to UTF-8. In this case the 
> server silently serves a resource which is ISO-8859-1 encoded.
>
> The error is triggered by accessing the following URL:
>
> ZnClient new get: 'http://squeaksource.com/ical/?C=M;O%3DD'; yourself.
>
> If you inspect the response object inside the http client, you will see that 
> the content-type is text/html. So Zn parses the incoming text using UTF-8 
> which fails (Zn encoders are strict by default).
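
The failure mode described above is not specific to Zinc. A small Python
sketch (an illustration, not the Zn implementation) shows the same thing:
bytes produced by an ISO-8859-1 encoder are rejected by a strict UTF-8
decoder. The sample text and the helper name decode_strict are made up for
the example.

```python
# Bytes as an ISO-8859-1 server would send them; the sample text is made up.
body = "K\u00e4se".encode("iso-8859-1")  # the a-umlaut becomes the single byte 0xE4


def decode_strict(data: bytes, encoding: str) -> str:
    """Decode without tolerating malformed input."""
    return data.decode(encoding, errors="strict")


try:
    decode_strict(body, "utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # 0xE4 announces a multi-byte UTF-8 sequence, but the bytes that follow
    # are plain ASCII, so a strict decoder has to reject the input.
    utf8_ok = False

# The same bytes decode cleanly once the right encoding is used.
latin1_text = decode_strict(body, "iso-8859-1")
```
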
>
> Here is how to change the default during a call:
>
> ZnDefaultCharacterEncoder
>   value: ZnCharacterEncoder iso88591
>   during: [ ZnClient new get: 'http://squeaksource.com/ical/?C=M;O%3DD'; yourself ].
>
> The solution would be that the server adds the proper charset specification.
>
> Consider the default in Pharo:
>
> ZnMimeType textHtml => text/html;charset=utf-8
>
> The server should serve this resource using the following Content-Type:
>
> text/html;charset=iso-8859-1
>
> This is the server's responsibility. The page in question is the MC index 
> page, which would normally be dynamically generated. Somewhere the server 
> decides on the encoding. That encoding does not have to change, but it should 
> be properly indicated in the HTTP response headers.
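
As a sketch of the client side of that contract, the following Python
(again an illustration, not how Zinc is implemented) takes the charset from
the Content-Type header and falls back to UTF-8 only when none is given.
The helper names charset_from_content_type and decode_body are
hypothetical; the header parsing uses the standard library's
email.message.Message.

```python
from email.message import Message


def charset_from_content_type(content_type: str, default: str = "utf-8") -> str:
    """Return the charset parameter of a Content-Type value, else `default`."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset lower-cases the value and returns the fallback
    # when the header carries no charset parameter.
    return msg.get_content_charset(default)


def decode_body(body: bytes, content_type: str) -> str:
    """Decode a response body using the charset the server declared."""
    return body.decode(charset_from_content_type(content_type))


# With the header the server should send, the ISO-8859-1 bytes decode fine.
declared = decode_body(b"K\xe4se", "text/html;charset=iso-8859-1")

# With a bare text/html the client has nothing to go on but its default.
fallback = charset_from_content_type("text/html")
```

With the bare text/html that squeaksource.com sends, the second case
applies, which is why the choice of default matters so much here.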
>
> HTH,
>
> Sven
>
> > On 15 Mar 2017, at 17:42, David T. Lewis <[email protected]> wrote:
> >
> > squeaksource.com is still running on a quite old image, and I know that it
> > has problems with multibyte characters. If you are seeing problems related
> > to this, it's not the fault of Zinc.
> >
> > If you can confirm that this is what is happening, then I guess it is time
> > to update that trusty old squeaksource.com image :-)
> >
> > Dave
> >
> >> On Wed, Mar 15, 2017 at 8:19 PM, Patrick R. <[email protected]> wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> I have been working on bringing http://squeaksource.com/ical/ up to speed
> >>> for Squeak and wanted to make sure that it also works for Pharo. Therefore,
> >>> I have created a travis build job for Squeak and Pharo
> >>> (https://travis-ci.org/codeZeilen/ical-smalltalk/jobs/211298950) which pulls
> >>> the source from squeaksource.com.
> >>>
> >>> Now the issue is that loading the package in Pharo fails with a
> >>> GoferException wrapping a ZnInvalidUTF8 Exception. We figured that this
> >>> might be the result of the squeaksource page delivering the page as
> >>> iso-8859-1 as it contains special characters. Any ideas on how to get this
> >>> to work? I do not have access to the ical repository description and I would
> >>> like to avoid mirroring the whole repository on GitHub.
> >>
> >>
> >> In a fresh 60437 image, in Playground evaluating...
> >>
> >>  Metacello new
> >>       configuration: 'ICal';
> >>       repository: 'github://codeZeilen/ical-smalltalk:master/repository';
> >>       onConflict: [:ex | ex allow];
> >>       load.
> >>  ==> Could not resolve: ICal-Core [ICal-Core-PaulDeBruicker.5] in
> >> /home/ben/.local/share/Pharo/images/60437-01/pharo-local/package-cache
> >> http://squeaksource.com/ical ERROR: 'GoferRepositoryError: Could not access
> >> http://squeaksource.com/ical: ZnInvalidUTF8: Illegal continuation byte for
> >> utf-8 encoding'
> >>
> >>
> >> In a new fresh 60437 Image (i.e. empty package-cache)
> >>  World menu > Monticello > +Repository > squeaksource.com...
> >>     MCSqueaksourceRepository
> >>        location: 'http://squeaksource.com/ical'
> >>        user: ''
> >>        password: ''
> >>   ==> open repository then errors "MCRepositoryError: Could not access
> >> http://squeaksource.com/ical: ZnInvalidUTF8: Illegal continuation byte for
> >> utf-8 encoding"
> >>
> >>
> >> In Chrome, opening http://www.squeaksource.com/ical
> >> then clicking <Versions>
> >> and the browser's View Page Source,
> >> I see...
> >>   <?xml version="1.0" encoding="iso-8859-1"?>
> >>
> >> Googling: zinc iso-8859-1
> >> finds...
> >> http://forum.world.st/Problem-using-Zinc-in-Pharo-4-Moose-5-1-td4825329.html
> >> but "ZnByteEncoder iso88591"
> >> errors with "KeyNotFound: key 'iso88591' not found in Dictionary"
> >> and inspecting "ZnByteEncoder byteTextConverters keys sorted"
> >> confirms this key is missing (@Sven, I'm curious why this was removed?)
> >>
> >>
> >> Now https://en.wikipedia.org/wiki/ISO/IEC_8859-1
> >> indicates IBM819 is an alias
> >> and " ZnByteEncoder newForEncoding: 'ibm819' "
> >> works okay
> >>
> >> So in MCHttpRepository>>#loadAllFileNames
> >> changing...
> >>         queryAt: 'C' put: 'M;O=D' ;
> >>         get.
> >> to...
> >>         queryAt: 'C' put: 'M;O=D' .
> >>         ZnDefaultCharacterEncoder
> >>              value: (ZnByteEncoder newForEncoding: 'ibm819')
> >>              during: [client get].
> >>
> >> Then from Monticello opening the previously defined
> >> http://squeaksource.com/ical
> >> works!!
> >>
> >>
> >> Now I was hoping that reverting #loadAllFileNames
> >> and in Playground doing...
> >>    converters := ZnByteEncoder byteTextConverters.
> >>    converters at: 'iso-8859-1' put: (converters at: 'ibm819').
> >> might alleviate the problem, but no luck.
> >>
> >>
> >> Anyone know a better way to deal with this than hardcoding the encoding
> >> into #loadAllFileNames?
> >>
> >> cheers -ben
> >>
> >
> >
> >
>
>
