On 09 Apr 2014, at 19:35, itli...@schrievkrom.de wrote:

> Ok, forget the JSON stuff - it has nothing to do with the "problem".
> 
> Other way round:
> 
> My whole database and internal processing is done in UTF8. This is the
> most important point here to mention.

Why? This means you forgo almost all String functionality, since UTF-8 is a 
variable-length encoding that is not really suitable for character-by-character 
processing.
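
To make the point concrete, here is a minimal sketch, assuming a Pharo image 
with Zinc loaded (the sample word is just an illustration). It compares 
indexing into a String with indexing into its UTF-8 bytes:

```smalltalk
| string bytes |
string := 'schönen'.
"Encode the String into its UTF-8 byte representation"
bytes := ZnUTF8Encoder new encodeString: string.
string size.    "7 characters"
bytes size.     "8 bytes, because ö takes two bytes in UTF-8"
string at: 4.   "the Character ö"
bytes at: 4.    "195, only the first of the two bytes of ö"
```

Once a multi-byte character appears, indexes into the bytes no longer line up 
with character positions, so #at:, #size, #copyFrom:to: and friends stop 
meaning what you expect.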

> Now the request comes into Zinc as mentioned below (the content of the
> request is a JSON string only):
> 
>  HTML-Request (charset=UTF-8) =(sends)=> ZINC HTTP
> 
> Now Zinc sees the content of the body, knows that it is coded in UTF8
> and creates a ZnStringEntity with UTF8Encoder.
> 
>  Zinc HTTP =(builds)=> ZnStringEntity (with UTF8Encoder)
> 
> The entity value of the ZnRequest instance is an instance of
> ZnStringEntity (with its encoder attribute set to an instance of
> ZnUTF8Encoder).

Yes, of course, UTF-8 (a variable length binary encoding) is converted into 
native Pharo Strings (possibly WideStrings) containing Characters, each of 
which is encoded using a Unicode code point value.
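
A small sketch of that conversion, again assuming Zinc's ZnUTF8Encoder is 
available (the byte values are simply the UTF-8 encoding of 'schön'):

```smalltalk
| bytes string |
"UTF-8 bytes for 'schön'; 195 182 together encode ö"
bytes := #[115 99 104 195 182 110].
string := ZnUTF8Encoder new decodeBytes: bytes.
string size.                  "5 characters from 6 bytes"
(string at: 4) codePoint.     "246, the Unicode code point of ö"
```

Note that 246 is also the value of ö in ISO-8859-1, because the first 256 
Unicode code points coincide with Latin-1. That may be why a decoded String 
can look as if it were ISO-8859-1 encoded when inspected.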

> I checked the content of the string attribute of the ZnStringEntity and
> this string is NOT encoded in UTF8 any more, but in either ISO8859-?
> or WIN1252.

Here you lose me (again) ;-)

> I think, that this is ok for almost all people, because they work with
> some CodePages - but my internal processing assumes UTF8.

No, nobody works with code pages or any encoding, just native [Wide]Strings in 
pure Unicode.

> I just fixed this for me by changing ZnStringEntity>>initializeEncoder
> to ALWAYS set the encoder attribute to ZnNullEncoder and now everything
> is ok again. This means of course, that all applications running with
> that source code work in UTF8 only ...

OK, I think I understand: you want UTF-8 to remain UTF-8. What you did is one 
solution, but I think it is wrong to use a String to represent bytes.

This case is actually already implemented server side for Seaside:

ZnZincServerAdaptor>>#configureServerForBinaryReading
  "Seaside wants to do its own text conversions"

  server reader: [ :stream | ZnRequest readBinaryFrom: stream ]

The #reader: option is used here to read everything as binary, without decoding 
to Strings. You will get ZnByteArrayEntity objects back, containing the original 
binary representation.
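
As a rough sketch of how this could look outside Seaside (the port number and 
the handler block are made up for illustration, and it assumes every request 
carries a body):

```smalltalk
| server |
server := ZnServer on: 1701.   "hypothetical port"
"Read requests as binary, so entities are not decoded into Strings"
server reader: [ :stream | ZnRequest readBinaryFrom: stream ].
server onRequestRespond: [ :request |
	| bytes |
	"The raw UTF-8 bytes as sent by the client"
	bytes := request entity bytes.
	ZnResponse ok: (ZnEntity text: bytes size printString) ].
server start.
```

The handler then receives the untouched bytes and can hand them to whatever 
UTF-8 based processing you have, instead of storing bytes in a String.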

BTW, I think this is an interesting discussion.

Regards,

Sven

> Marten
> 
> On 09.04.2014 18:42, Sven Van Caekenberghe wrote:
>> Marten,
>> 
>> On 09 Apr 2014, at 18:25, itli...@schrievkrom.de wrote:
>> 
>>> Ok, if the browser sends POST/PUT request with a JSON structure it also
>>> sends charset = utf8 (in my case). That's ok, because for JSON this is
>>> more or less the default charset.
>>> 
>>> Zinc now seems to notice, that UTF8 charset is needed and creates a
>>> ZnStringEntity with an UTF8Encoder.
>>> 
>>> Now my application gets the JSON string of that ZnStringEntity and
>>> builds the structure out of that string - and the strings are NOT
>>> UTF8, but converted to (?) ISO8859 ?
>> 
>> (NeoJSONReader fromString:
>>   (ZnEntity with: (NeoJSONWriter toString:
>>     { #message -> 'An der schönen blauen Donau' } asDictionary)))
>>   at: #message.
>> 
>> You may be doing something wrong when you <<get the JSON string of 
>> that ZnStringEntity and builds the structure out of that string>> (how do 
>> you do that, BTW), so please write some code that demonstrates what is not 
>> right according to you.
>> 
>> Sven
>> 
> 
> 
> -- 
> Marten Feldtmann
> 

