On Fri, Dec 26, 2008 at 15:03, John Refior <jref...@gmail.com> wrote:
snip
> The problem I am having is that a number of these webpages have special
> multibyte characters on them, such as the trademark symbol and registered
> trademark symbol.  For example, in the CSV, the trademark (TM) symbol
> shows up like
>
> â„¢
>
> Now that's fine in a way, because if I redisplay them on a webpage with
> <meta charset='utf-8'>, Firefox and Internet Explorer display them as
> intended.
snip

The file is already in UTF-8; otherwise it wouldn't display properly in
Firefox or IE.  The problem is that either your display or Perl doesn't
know the file is in UTF-8.  The first step is to make sure Perl knows it
is working with UTF-8.  Add

    export PERL_UNICODE=SDL

to your .profile, .bashrc, or whatever you use for your profile.  Log out
and log back in.  This tells perl to use UTF-8 for STDIN, STDOUT, and
STDERR (the S) and as the default layer for input and output streams
(the D), all of it conditional on the locale (the L): the settings only
take effect when your locale environment variables indicate UTF-8.

The next thing to check is the value of your LANG environment variable.
It should be something like en_US.UTF-8.

If you are still having problems, check whether your terminal is
expecting something other than UTF-8 (this is highly dependent on the
terminal, so you will need to tell us what terminal you are using).

-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
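
P.S. A rough sketch of the checks above as shell commands, in case it helps. It assumes a POSIX-style shell (sh, bash) with perl on the PATH; the echo line and the two one-liners are only illustrations, not something from the original message. ${^UNICODE} is perl's built-in variable holding the numeric form of the -C / PERL_UNICODE flags.

```shell
# Assumption: a POSIX-style shell (sh, bash) with perl on the PATH.
# Set the variable for the current session; put the same line in
# .profile or .bashrc to make it permanent.
export PERL_UNICODE=SDL

# Because of the L flag, perl consults these variables (LC_ALL first,
# then LC_CTYPE, then LANG) to decide whether the locale is UTF-8.
echo "LC_ALL=${LC_ALL:-unset} LC_CTYPE=${LC_CTYPE:-unset} LANG=${LANG:-unset}"

# Show the numeric form of the flags perl read from PERL_UNICODE.
perl -e 'print "unicode flags: ${^UNICODE}\n"'

# Print U+2122 (trademark sign). If perl, the locale, and the terminal
# all agree on UTF-8, this should display as a single TM character.
perl -e 'print "\x{2122}\n"'
```

If the last command shows something like â„¢ instead of a single trademark character, the bytes are probably correct UTF-8 and the terminal is most likely decoding them as something else.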