Fastream Technologies wrote:
> Hello Arno,
> 
> If the function is ready, I would like to test it in our special unit
> for HTML folder listings. Can you post it here or send privately?

Yes, I checked it in already. Install the TortoiseSVN client to get 
access to the ICS SVN repository.

In ICS v6, URL encoding/decoding does not use UTF-8. The file names in
directory listings are now _displayed_ correctly, but their links
have not changed; they are still ANSI-encoded. I could easily change URL
coding in v6 to UTF-8 as well, but I wonder whether that would break
existing applications?
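
To make the difference concrete, here is a minimal sketch in Python (not
ICS code). It assumes Windows-1252 stands in for the "current ANSI code
page", and decode_url_path() is a hypothetical helper that shows one
possible way a server could keep accepting old ANSI links after a switch
to UTF-8: try UTF-8 first and fall back to ANSI.

```python
from urllib.parse import quote, unquote_to_bytes

TWO_THIRDS = "\u2154"  # the test character discussed in this thread

# UTF-8 percent-encoding: any Unicode character is representable.
utf8_link = quote(TWO_THIRDS, encoding="utf-8")

# ANSI percent-encoding (cp1252 is an assumption): U+2154 is not in the
# code page, so it degrades to "?" and the link no longer names the file.
ansi_link = quote(TWO_THIRDS, encoding="cp1252", errors="replace")


def decode_url_path(path: str, ansi_codepage: str = "cp1252") -> str:
    """Hypothetical tolerant decoder: UTF-8 first, then the ANSI code page."""
    raw = unquote_to_bytes(path)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Old clients may still send ANSI-encoded links such as "caf%E9".
        return raw.decode(ansi_codepage, errors="replace")


if __name__ == "__main__":
    print(utf8_link)
    print(ansi_link)
    print(decode_url_path("%e2%85%94"))
    print(decode_url_path("caf%E9"))
```

Python emits upper-case hex digits, while the IIS listing used lower-case
(%e2%85%94); both decode to the same bytes. The fallback is only a
heuristic: a byte sequence that happens to be valid UTF-8 is always
read as UTF-8, even if the client meant ANSI.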

--
Arno Garrels


> 
> Best Regards,
> SZ
> On Fri, Oct 10, 2008 at 4:41 PM, Arno Garrels <[EMAIL PROTECTED]>
> wrote: 
> 
>> Arno Garrels wrote:
>>> Francois PIETTE wrote:
>>>>> But 3 bytes looks like UTF-8 ?
>>>> 
>>>> I don't know. You said it was UTF-16 if not encoded.
>>> 
>>> I installed IIS 7 on my Vista box and I found that IIS 7
>>> uses UTF-7 in directory listings.
>> 
>> Arrgh, typo above, IIS v7 uses UTF-8 of course!
>> 
>>> The HTTP header contains
>>> the "charset=UTF-8" content-type extension.
>>> 
>>> 
>>> However I think the ICS server should continue to use HTML
>>> entities.
>>> HTML entities represent both iso-8859-1 (Latin1) and Unicode
>>> character numbers (in Unicode the first 256 chars are the same as
>>> Latin1). So in order to create a _valid_ mapping an AnsiString MUST
>>> be converted with the current ANSI code page to a
>>> UnicodeString/WideString first! This can be achieved easily in
>>> TextToHtmlText() by a local WideString variable that is assigned
>>> parameter Src : String. Characters above #255 must then be
>>> represented as numerical HTML entities (&#nnnn;). That's all, fully
>>> backwards compatible, and it works in D2009 as well :)
>>> 
>>> --
>>> Arno Garrels
>>> 
>>> 
>>>> 
>>>> ----- Original Message -----
>>>> From: "Arno Garrels" <[EMAIL PROTECTED]>
>>>> To: "ICS support mailing" <twsocket@elists.org>
>>>> Sent: Thursday, October 09, 2008 7:03 PM
>>>> Subject: Re: [twsocket] HTML encoding in HttpSrv func.
>>>> TextToHtmlText()
>>>> 
>>>> 
>>>>> Francois PIETTE wrote:
>>>>>>> The twothird character is not 'encoded' either as "&#8532;"
>>>>>>> (decimal) or as "&#x2154;" (hex)? If so, IIS sends plain UTF-16!
>>>>>> 
>>>>>> Yes, no encoding at all. Just the 3 bytes. So UTF-16.
>>>>> 
>>>>> But 3 bytes looks like UTF-8 ?
>>>>> 
>>>>> --
>>>>> Arno Garrels
>>>>> 
>>>>>> 
>>>>>> --
>>>>>> [EMAIL PROTECTED]
>>>>>> http://www.overbyte.be
>>>>>> 
>>>>>> 
>>>>>> ----- Original Message -----
>>>>>> From: "Arno Garrels" <[EMAIL PROTECTED]>
>>>>>> To: "ICS support mailing" <twsocket@elists.org>
>>>>>> Sent: Thursday, October 09, 2008 5:26 PM
>>>>>> Subject: Re: [twsocket] HTML encoding in HttpSrv func.
>>>>>> TextToHtmlText()
>>>>>> 
>>>>>> 
>>>>>>> Francois Piette wrote:
>>>>>>>>> Yes, if someone has Apache or a newer IIS installed he could
>>>>>>>>> help. Create a file name with characters not in the current ANSI
>>>>>>>>> code page by copying those characters from the Windows
>>>>>>>>> application charmap.exe. Then start a packet sniffer and log
>>>>>>>>> a directory listing.
>>>>>>>> 
>>>>>>>> Using IIS6 on W2K3.
>>>>>>> 
>>>>>>> Thanks!
>>>>>>> 
>>>>>>>> The twothird character (U+2154) is sent in the dirlist as 3
>>>>>>>> characters : 0xE2 0x85 0x94. In the href link, the 3 characters
>>>>>>>> are expressed as %e2%85%94
>>>>>>> 
>>>>>>> That's UTF-8 URL-encoded.
>>>>>>> 
>>>>>>>> while they are binary in the text itself.
>>>>>>> 
>>>>>>> The twothird character is not 'encoded' either as "&#8532;"
>>>>>>> (decimal) or as "&#x2154;" (hex)? If so, IIS sends plain UTF-16!
>>>>>>> 
>>>>>>>> There is nothing in the html header to tell which code page or
>>>>>>>> charset is used.
>>>>>>>> --
>>>>>>> 
>>>>>>> Browsers seem to be very good at detecting the correct character
>>>>>>> set nowadays.
>>>>>>> 
>>>>>>> --
>>>>>>> Arno Garrels
-- 
To unsubscribe or change your settings for TWSocket mailing list
please goto http://lists.elists.org/cgi-bin/mailman/listinfo/twsocket
Visit our website at http://www.overbyte.be
