Re: UTF-8 display (was: Re: a mug)

Marcel Schneider Tue, 21 Jul 2015 06:49:36 -0700

On 21 Jul 2015, at 14:49, philip chastney wrote:

> so the webmaster put up the page, declaring the charset to be UTF-8...
> 
> but what charset was being used by the guy who knocked out the HTML?
> 
> it could be more complicated than that: maybe the page was produced using 
> UTF-8, 
> somebody reads the page using, say, Windows 1252, and "converts" it to UTF-8
> 
> I'm sure, with a little effort, ever more complicated scenarii could be 
> constructed
> -- it's amazing what can be achieved when arrogance and ignorance are combined



I fear things have grown somewhat upside down, so I'll try to outline the real 
scenario:

1 - I open the page, the horizontal ellipsis is displayed as â€¦ (of course I 
don't know yet that it's a horizontal ellipsis...).
2 - I remember my comment about the T-shirt and decide to check whether it's 
accurate. Firefox shows me the page is in UTF-8 and that there is nothing after 
"Our apologies".
3 - After some trial and error, I save the page in Zotero and open the folder. 
The only HTML file inside is declared as Windows-1252, and there is the 
horizontal ellipsis.
4 - I back up the original file, try changing the charset value to utf-8 and 
refresh the page; the â€¦ converts to a horizontal ellipsis.
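The effect seen in steps 1 and 4 can be reproduced with a minimal Python sketch. It assumes the page bytes really were UTF-8 and were merely read as Windows-1252 (Python's "cp1252" codec); the function names are mine, for illustration only, and are not taken from any tool mentioned here.

```python
# Sketch: how U+2026 turns into three characters when UTF-8 bytes
# are read as Windows-1252, and how the misreading can be undone.
# Function names are hypothetical, chosen for this illustration.

ELLIPSIS = "\u2026"                # HORIZONTAL ELLIPSIS
MOJIBAKE = "\u00e2\u20ac\u00a6"    # the three characters seen in step 1


def misread_as_cp1252(text: str) -> str:
    """Encode as UTF-8, then decode the same bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Reverse the misreading: back to bytes via cp1252, then decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    # U+2026 is the byte sequence E2 80 A6 in UTF-8.
    print(ELLIPSIS.encode("utf-8").hex(" "))
    # Read byte by byte as Windows-1252, those bytes become three characters.
    print(misread_as_cp1252(ELLIPSIS) == MOJIBAKE)
    # Declaring the right charset (step 4) amounts to this reverse step.
    print(repair(MOJIBAKE) == ELLIPSIS)
```

The repair only works as long as every misread character still maps back to a Windows-1252 byte; text that has been through further conversions may not round-trip.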

To answer your questions, I figure that the page was written on a 
Windows-1252 template but without sticking to this charset. U+2026 was 
probably an autocorrect. So it was "produced using UTF-8" but "the webmaster" 
must have published it under the old charset.

The puzzling point is that Firefox tried UTF-8 and told me it was serious, but 
"ate" the U+2026, while it used the native Windows-1252 to "display" it...

I hope that some macro could enable the "webmasters" to rapidly update 
websites, because resolving this "funny" "scenario" has cost me some "effort" 
today!

Marcel
