[twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
In function TextToHtmlText(), the HTML encoding of characters above #127 assumes code page ISO-8859-1.

    const HtmlSpecialChars : array [160..255] of String[6] = (
        'nbsp',  { #160 no-break space = non-breaking space }
        'iexcl', { #161 inverted exclamation mark
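To illustrate why a table indexed 160..255 by byte value is tied to ISO-8859-1, here is a minimal Python sketch (not ICS code; `latin1_entity` is a hypothetical helper introduced only for illustration). It relies on the standard `html.entities.codepoint2name` table, which maps Unicode code points to HTML 4 entity names. For ISO-8859-1, and only for it, byte value N decodes to code point N, so a byte-indexed table happens to give the right names.

```python
from html.entities import codepoint2name


def latin1_entity(byte_value: int) -> str:
    """Return the named entity for one ISO-8859-1 byte in the range 160..255.

    Hypothetical helper: entity names are defined per Unicode code point,
    so the byte is decoded first and the code point is looked up.
    """
    char = bytes([byte_value]).decode("iso-8859-1")
    # In ISO-8859-1 the byte value and the Unicode code point coincide.
    assert ord(char) == byte_value
    return "&" + codepoint2name[ord(char)] + ";"


for value in (160, 161, 162):
    print(value, latin1_entity(value))
```

For any other ANSI code page the assertion inside the helper would not hold in general, which is the assumption the thread is questioning.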

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Francois Piette
"Or am I missing something?" I think so. Using HTML entities makes sure the correct character is represented whatever character set or character code is used by the browser. The character code shown in the comments is for reference only and is only valid on some platforms. -- [EMAIL

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
DZ-Jay wrote: Actually, I think Arno is correct, but it's a bit more complex than that: The entity conversion depends strictly on the local character set. That is, each character set *may* map differently (as Arno just discovered for the cent character between CP-1252 and CP-1251); there

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
Francois Piette wrote: "In your example, char #162 is replaced by &cent; in the HTML output. This represents the cent character whatever the code page is." Actually that is the bug, since #162 is the cent sign in CP 1252 but not in CP 1251. This function is used to generate directory listings,

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
Arno Garrels wrote: Francois Piette wrote: "In your example, char #162 is replaced by &cent; in the HTML output. This represents the cent character whatever the code page is." Actually that is the bug, since #162 is the cent sign in CP 1252 but not in CP 1251. This function is used to

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
Fastream Technologies wrote: "IIS5.1 is very old code (2001). Unfortunately my IIS7 Windows 2008 expired so I cannot check right now. Maybe somebody else can help?" Yes, if someone has Apache or a newer IIS installed he could help. Create a file name with characters not in current ANSI code

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread DZ-Jay
Actually, I think Arno is correct, but it's a bit more complex than that: The entity conversion depends strictly on the local character set. That is, each character set *may* map differently (as Arno just discovered for the cent character between CP-1252 and CP-1251); there is no universal

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Francois Piette
"Using HTML entities makes sure the correct character is represented whatever character set or character code is used by the browser." That's correct, but the server maps the wrong HTML entities if it doesn't run in a locale that uses CP 1252! For example: Currently char #162 is hard
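The point about the server's locale can be sketched in a few lines of Python (not ICS code; `byte_to_html` and its parameters are hypothetical names used only for illustration). The idea, under the assumption that the server knows which ANSI code page its strings are in, is to decode the byte with that code page first and only then choose a named entity or a numeric character reference for the resulting Unicode code point.

```python
from html.entities import codepoint2name


def byte_to_html(byte_value: int, code_page: str) -> str:
    """Encode one ANSI byte as HTML text, honouring its source code page.

    Hypothetical helper. Bytes that are undefined in the given code page
    raise UnicodeDecodeError; markup characters such as '<' and '&' are
    deliberately not handled in this sketch.
    """
    char = bytes([byte_value]).decode(code_page)
    code_point = ord(char)
    if code_point < 128:
        return char
    name = codepoint2name.get(code_point)
    if name is not None:
        return f"&{name};"
    # No HTML 4 entity name exists: fall back to a numeric reference.
    return f"&#{code_point};"


print(byte_to_html(162, "cp1252"))  # cent sign
print(byte_to_html(162, "cp1251"))  # Cyrillic small letter short u
```

With this approach the same byte #162 yields `&cent;` on a CP 1252 system and `&#1118;` on a CP 1251 system, instead of `&cent;` in both cases.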

[twsocket] [FTP] ESocketException

2008-10-09 Thread Guillaume ROQUES
Hi, I use the FTP client component in two of my applications: (1) a tool to manage the FTP connections, and (2) a Windows service which checks, by FTP, whether some defined files exist. In my tool (1), I can test my connections and it works well (similar to the FtpCLi example ;p). In my service (2), the

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Francois Piette
"In your example, char #162 is replaced by &cent; in the HTML output. This represents the cent character whatever the code page is." Actually that is the bug, since #162 is the cent sign in CP 1252 but not in CP 1251. This function is used to generate directory listings, most file names

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Francois Piette
"Yes, if someone has Apache or a newer IIS installed he could help. Create a file name with characters not in the current ANSI code page by copying those characters from the Windows application charmap.exe. Then start a packet sniffer and log a directory listing." Using IIS6 on W2K3. The two-thirds

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
Francois Piette wrote: "In your example, char #162 is replaced by &cent; in the HTML output. This represents the cent character whatever the code page is." Actually that is the bug, since #162 is the cent sign in CP 1252 but not in CP 1251. This function is used to generate directory listings,

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
Francois Piette wrote: "Or am I missing something? I think so. Using HTML entities makes sure the correct character is represented whatever character set or character code is used by the browser." That's correct, but the server maps the wrong HTML entities if it doesn't run in a locale

Re: [twsocket] [FTP] ESocketException

2008-10-09 Thread Arno Garrels
"I don't know how to understand this error, as my connections with the component work in my tool (1)." It's a Winsock 2 error code: WSASYSCALLFAILURE 10107, System call failure: a system call that should never fail has failed. This is a generic error code, returned under various

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
Francois Piette wrote: "Yes, if someone has Apache or a newer IIS installed he could help. Create a file name with characters not in the current ANSI code page by copying those characters from the Windows application charmap.exe. Then start a packet sniffer and log a directory listing." Using IIS6

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Francois PIETTE
"The two-thirds character is not 'encoded' either as &#8532; (decimal) or as &#x2154; (hex)? If so, IIS sends plain UTF-16!" Yes, no encoding at all. Just the 3 bytes. So UTF-16. -- [EMAIL PROTECTED] http://www.overbyte.be - Original Message - From: Arno Garrels [EMAIL PROTECTED] To:

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Arno Garrels
Francois PIETTE wrote: "The two-thirds character is not 'encoded' either as &#8532; (decimal) or as &#x2154; (hex)? If so, IIS sends plain UTF-16! Yes, no encoding at all. Just the 3 bytes. So UTF-16." But 3 bytes looks like UTF-8? -- Arno Garrels -- [EMAIL PROTECTED]
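The byte count mentioned in this exchange can be checked with a short Python sketch (an illustration only; the variable names are arbitrary). It encodes U+2154, the two-thirds character, in UTF-8 and in UTF-16, and also prints the two numeric character references discussed in the thread.

```python
TWO_THIRDS = "\u2154"  # VULGAR FRACTION TWO THIRDS

utf8 = TWO_THIRDS.encode("utf-8")
utf16 = TWO_THIRDS.encode("utf-16-le")  # little-endian, no byte order mark

print(len(utf8), utf8.hex())    # byte count and hex dump for UTF-8
print(len(utf16), utf16.hex())  # byte count and hex dump for UTF-16

# The decimal and hexadecimal numeric character references.
decimal_ref = f"&#{ord(TWO_THIRDS)};"
hex_ref = f"&#x{ord(TWO_THIRDS):X};"
print(decimal_ref, hex_ref)
```

The character takes three bytes (E2 85 94) in UTF-8 and two bytes in UTF-16, so a three-byte sequence on the wire is consistent with UTF-8.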

Re: [twsocket] HTML encoding in HttpSrv func. TextToHtmlText()

2008-10-09 Thread Francois PIETTE
"But 3 bytes looks like UTF-8?" I don't know. You said it was UTF-16 if not encoded. - Original Message - From: Arno Garrels [EMAIL PROTECTED] To: ICS support mailing twsocket@elists.org Sent: Thursday, October 09, 2008 7:03 PM Subject: Re: [twsocket] HTML encoding in HttpSrv func.