karl wrote:
> I have text output coming from a database and ' (apostrophes) are shown in the browser (IE6) as ? (question marks).

There are apostrophes, and then there are apostrophes. There's ASCII code 39, there's Windows code page 1252 code 146, there's Unicode code point U+2019 (the right single quotation mark, which is what code 146 maps to), and so on. The question is, which of these codes are in your database? You must know the answer to that question before you can decide how to proceed.
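One way to answer that question is to look at the raw bytes rather than at rendered text. The following is only a rough Python sketch, not something from the original setup: the function name find_apostrophes is made up for illustration, and it assumes you can get the field out of the database as raw bytes. It is a heuristic, since a lone byte 0x92 can also occur as part of some other multi-byte UTF-8 character.

```python
def find_apostrophes(raw: bytes) -> list[tuple[int, str]]:
    """Return (offset, description) for each apostrophe-like byte sequence.

    Hypothetical helper for illustration; recognizes only the three
    encodings discussed above.
    """
    hits = []
    i = 0
    while i < len(raw):
        if raw[i] == 0x27:
            # Plain typewriter apostrophe, same in ASCII, Latin-1 and UTF-8.
            hits.append((i, "ASCII 39"))
        elif raw[i:i + 3] == b"\xe2\x80\x99":
            # U+2019 RIGHT SINGLE QUOTATION MARK encoded as UTF-8.
            hits.append((i, "UTF-8 U+2019"))
            i += 2  # skip the two continuation bytes
        elif raw[i] == 0x92:
            # Windows code page 1252 "smart" apostrophe (decimal 146).
            hits.append((i, "CP1252 146"))
        i += 1
    return hits


if __name__ == "__main__":
    for sample in (b"it's", b"it\x92s", b"it\xe2\x80\x99s"):
        print(sample, find_apostrophes(sample))
```

If the report shows a mix of kinds in the same column, the data probably came from more than one source, such as text pasted from a word processor alongside text typed into a form.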


Character code handling in the database/Apache::ASP/Perl5/Apache/browser chain is stranger than you probably expect. Here's a post I wrote a few months back detailing two chains I've personally observed:

        http://www.mail-archive.com/[EMAIL PROTECTED]/msg01952.html

Notice that I saw two rather different translation chains on my two test systems! Your particular configuration is quite different from either of mine, so it could give yet a third path.

> The only thing I can figure out is that original output shows up as encoded Unicode (UTF-8) in the browser;

Don't guess, find out.

For the analysis behind the post I linked to, I dumped the text in question to a file at several points along the I/O chain, then examined each file. You should also use a network sniffer to see the HTTP headers and HTML data without the browser getting in the way. There's a good list of sniffers in the Winsock Programmer's FAQ, if you don't have one already:
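The dump-and-compare approach can be sketched roughly as follows. This is Python rather than the Perl you would use inside Apache::ASP, and the helper name dump_stage and the stage labels are assumptions made for the example. The point is to write the bytes as hex, so that no editor or terminal gets a chance to reinterpret them.

```python
import os
import tempfile


def dump_stage(label: str, data: bytes, directory: str) -> str:
    """Write a hex dump of `data` to <directory>/<label>.hex and return the path.

    Hypothetical helper: call it once per point in the I/O chain
    (after the database fetch, before output, and so on).
    """
    path = os.path.join(directory, f"{label}.hex")
    with open(path, "w", encoding="ascii") as fh:
        for offset in range(0, len(data), 16):
            chunk = data[offset:offset + 16]
            hex_part = " ".join(f"{b:02x}" for b in chunk)
            fh.write(f"{offset:08x}  {hex_part}\n")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Two made-up stages holding the same word in different encodings.
        first = dump_stage("01_from_db", b"it\x92s", tmp)
        second = dump_stage("02_before_output", b"it\xe2\x80\x99s", tmp)
        for path in (first, second):
            with open(path, encoding="ascii") as fh:
                print(os.path.basename(path), fh.read(), end="")
```

Comparing the dumps side by side shows at which stage the bytes change, which tells you where a translation is happening.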

        http://tangentsoft.net/wskfaq/

I think you'll find, as I did, that your characters are being translated back and forth between ISO 8859-x and Unicode multiple times, and that the last step isn't being done correctly.

That last step is critical because of the high probability that the intermediate transformations are all lossless in your situation. All you have to do is communicate to the browser what the final character encoding is. In my particular situation, I had to change an Apache setting to make it send a header informing the browser that the character encoding was UTF-8. The browser was then able to display the web page correctly, never mind that the data was stored as ISO 8859-1 (Latin-1) in the database, and translated back and forth several times along the path.
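To illustrate both halves of that point, here is a small Python sketch. The function names and the sample word are made up for the example. It shows that a Latin-1 to UTF-8 round trip loses nothing, and that the same UTF-8 bytes come out right or wrong depending only on which encoding the reader is told to use. On the Apache side, the AddDefaultCharset directive is one way to get a charset into the Content-Type header, though I'm not claiming that is the setting involved in the case above.

```python
def round_trip(text: str) -> bytes:
    """Latin-1 bytes -> Unicode -> UTF-8 bytes -> Unicode -> Latin-1 bytes."""
    stored = text.encode("latin-1")                      # as held in the database
    as_utf8 = stored.decode("latin-1").encode("utf-8")   # translated en route
    return as_utf8.decode("utf-8").encode("latin-1")     # and translated back


def as_seen_by_browser(body: bytes, declared: str) -> str:
    """Decode the response body the way a browser would for a declared charset."""
    return body.decode(declared, errors="replace")


word = "caf\u00e9"                 # sample text with one non-ASCII character
utf8_body = word.encode("utf-8")   # what actually goes over the wire

if __name__ == "__main__":
    print(round_trip(word) == word.encode("latin-1"))   # lossless?
    print(as_seen_by_browser(utf8_body, "utf-8"))       # correct declaration
    print(as_seen_by_browser(utf8_body, "latin-1"))     # wrong declaration
```

The bytes are identical in the last two calls; only the declared encoding differs.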

> The only physical difference I can find between the output generated by Apache::ASP and IIS/ASP is that the Apache::ASP has Unix style LF line-endings and the IIS/ASP has DOS/Windows style CRLF line-endings.

I'll bet you didn't compare the HTTP headers. Different web servers, hence different headers, hence different browser interpretation.
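Once you have captured the raw response headers from each server with a sniffer, a comparison along these lines shows what each one tells the browser. This is a hedged Python sketch: the two header blocks are invented samples, not captures from real servers, and declared_charset is a hypothetical helper built on the standard library's email header parser.

```python
from email.parser import Parser


def declared_charset(raw_headers: str):
    """Return the charset named in Content-Type, lowercased, or None if absent."""
    lines = raw_headers.strip().splitlines()
    # The HTTP status line is not a header, so drop it before parsing.
    if lines and lines[0].startswith("HTTP/"):
        lines = lines[1:]
    msg = Parser().parsestr("\n".join(lines), headersonly=True)
    return msg.get_content_charset()


# Invented sample captures, for illustration only.
SERVER_A = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "\r\n"
)
SERVER_B = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Microsoft-IIS/5.0\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

if __name__ == "__main__":
    print("server A declares:", declared_charset(SERVER_A))
    print("server B declares:", declared_charset(SERVER_B))
```

When one server names a charset and the other doesn't, the browser is left to guess for the second one, so identical HTML can render differently.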

