James G. Sack (jim) wrote:
> Néstor wrote:
>> I made a mistake, on my linux's FF it displays as "World,s Largest"
>> but on W2K's FF it displays as "World?s Largest"  <-- the question mark is
>> actually a question mark inside a black diamond box.
>>
>> I can tell you that the tar backup was done in a 32 bit machine and I
>> untarred it in a 64 bit machine.
>>
> 
> It sounds like a difference in locale settings. For example one machine
> (Lx) has latin-1 (or utf_8, or ??), the other (W2k) has utf_16 (I think).
> 
> What do you mean by "host backups" -- maybe the actual files will
> suggest further ideas. What application was used to create the source
> files -- html content, I'm guessing.
> 
> Are you reading untarred html from file://localhost/... locations,
> perhaps? Can you extract a portion (of the actual file, not from the
> browser) containing the funny-stuff in binary -- or maybe hexdump, which
> you can post. Also, what does locale report on your Lx system?
> 
> tar (or gzip/bzip) does no file content conversions. So when reading
> content encoded 'somewhere else', you just may need to convert things.
> 
> You may find
>   man iconv
> helpful.
> 
> Or perhaps, a google search on
>   text encoding conversion
> 

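Further thought suggests that what you call <92> might be hex: 0x92
(decimal 146), which is the Windows CP-1252 encoding for
 &rsquo; right single quotation mark
Strict latin-1 (ISO 8859-1) assigns 0x92 to a control character, but
browsers commonly treat pages labelled latin-1 as CP-1252, so it tends to
show up as the quote there too. A lone 0x92 byte is not valid utf-8, which
would explain the question mark in a black diamond if that browser is
decoding the page as utf-8.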

This may then be an example of html generated from Word or FrontPage,
which (I think) use the "smart quotes" that everybody complains about.
Others will know more about this than I do.
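
To make the 0x92 point concrete, here is a minimal Python sketch (Python is
my choice, not something from this thread) that decodes the same bytes three
ways. The sample string is just the "World's Largest" text quoted earlier,
with the apostrophe assumed to be stored as the single byte 0x92.

```python
# The text as it might sit in the raw file, with the quote stored as byte 0x92.
raw = b"World\x92s Largest"

# CP1252 maps 0x92 to U+2019 RIGHT SINGLE QUOTATION MARK.
as_cp1252 = raw.decode("cp1252")

# Strict ISO-8859-1 maps 0x92 to the C1 control character U+0092.
as_latin1 = raw.decode("latin-1")

# A lone 0x92 is not valid UTF-8, so a lenient decoder substitutes U+FFFD,
# the replacement character that browsers draw as a question mark in a diamond.
as_utf8 = raw.decode("utf-8", errors="replace")

print(repr(as_cp1252))
print(repr(as_latin1))
print(repr(as_utf8))
```

If the real file behaves like this sample, the hexdump should show a 92
between the "d" and the "s".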

In any case, your browser may render it correctly if you set the character
encoding to CP1252 (windows-1252) in view encoding; 8859-1 usually behaves
the same way in browsers. You may also wish to look at the "page info"
dialog for encoding and content-type in the meta-tags box. It's possible
that browsing from file:/// gives different results than from an actual
server, because the server can send a charset in the content-type http
header, whereas that is missing when the page is read via file:///.
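
As a rough way to check what the file itself declares (which is all the
browser has to go on when there is no http header), something like the
following Python sketch could be used. The function name declared_charset
and the regular expression are my own illustration, not a standard API, and
the pattern is deliberately crude rather than a real html parser.

```python
import re

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE
)

def declared_charset(html_bytes):
    """Return the charset named in a meta tag, or None if there is none."""
    # Only look near the top of the file, where the head section lives.
    match = META_CHARSET.search(html_bytes[:4096])
    return match.group(1).decode("ascii").lower() if match else None
```

A result of None would mean the page leaves the choice of encoding to the
browser's defaults, which can differ between machines.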

The underlying problem is probably that the html source should include an
encoding meta-tag. Where did the source come from?
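
If the files do turn out to be CP1252, one alternative to adding a meta-tag
is converting them to utf-8, which is what iconv would do. A minimal Python
sketch of the same conversion, assuming the input really is CP1252 (the
function name cp1252_to_utf8 is just for illustration):

```python
def cp1252_to_utf8(data):
    """Re-encode bytes from CP1252 to UTF-8.

    Raises UnicodeDecodeError if the input contains one of the few byte
    values that CP1252 leaves undefined, which hints at a different encoding.
    """
    return data.decode("cp1252").encode("utf-8")

# The 0x92 quote becomes the three-byte UTF-8 sequence for U+2019.
converted = cp1252_to_utf8(b"World\x92s Largest")
print(converted)
```

To convert a whole file, read it in binary mode, pass the bytes through the
function, and write the result to a new file so the original stays intact.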

Regards,
..jim


-- 
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-list
