Graham TerMarsch wrote:
> I've run into an issue on one of the projects that I'm working on and thought 
> that I'd ping the list to see how others are handling this...

Lucky you, I just spent a few weeks fighting with this at $work and on Krang :)

> The app accepts form data from the user, runs it through Data::FormValidator 
> to validate it, then stuffs it into our PostgreSQL database.  We're expecting 
> users are going to cut/paste from MS-Word and as a result we're going to have 
> to deal with MS "smart quotes".
> 
> My issue started with a DB error from DBD::Pg telling me that the input had an
> invalid byte sequence for UTF-8 (the tables in Pg are all encoded as UTF-8).
> Googling around brought me several possible solutions, but I can't say that 
> I've found one yet that actually -works-.

The only thing that will really work is to go with one character set all the way
through. I'd recommend UTF-8, because then you'll never have to change when
users want something that ISO-8859-1 or CP-1252 can't represent; UTF-8 can
encode every Unicode character. I will warn you that if you go down the UTF-8
route there's no magic switch to press: because UTF-8 characters can be
multibyte, you have to make your application aware of UTF-8 all the way through.
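
To make the multibyte point concrete, here is a minimal sketch using the core
Encode module. The sample bytes are my own illustration (a left "smart quote",
U+201C), not data from Graham's app, and the CP-1252 part is only my guess at
where the "invalid byte sequence" error comes from:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Illustrative input: a left "smart quote" (U+201C) encoded as UTF-8
# takes three bytes.
my $bytes = "\xE2\x80\x9C";

# Decode the raw bytes into a Perl character string.
my $chars = decode('UTF-8', $bytes);

# length() counts bytes before decoding and characters after it.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);

# The same character in CP-1252 is the single byte 0x93, which is not a
# valid UTF-8 sequence on its own.
my $cp1252 = encode('cp1252', $chars);
printf "cp1252 bytes: %d (0x%02X)\n", length($cp1252), ord($cp1252);
```

If the browser submits CP-1252 bytes and they are passed straight to a UTF-8
database, a lone 0x93 is the kind of byte that would be rejected.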

You need to do all of the following:

+ Tell the browser that the forms/pages are UTF-8 (using both the HTTP
Content-Type header and <meta> tags).
+ When the form data comes in, decode_utf8() it. If you're using CGI.pm you'll
need 3.30, which hasn't been released yet (you can find it on RT), because it
has some UTF-8 fixes.
+ When doing DB pull/push you'll need to tell the database that the data is in
UTF-8. In MySQL it's done with the 'mysql_enable_utf8' flag on the database
handle; DBD::Pg has a similar 'pg_enable_utf8' attribute.
+ If you're doing any file IO which may produce or read UTF-8 then you'll need
to make sure that your open() and binmode() calls use an IO layer such as
':encoding(UTF-8)'.
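
A rough sketch of the header, decoding and file IO steps, using only core Perl.
The form value and the scratch file name are made up for illustration, and the
database step is left out because it depends on your driver and handle setup:

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Tell the browser the page is UTF-8 via the HTTP header.
print "Content-Type: text/html; charset=utf-8\r\n\r\n";

# Hypothetical raw form value as the browser would send it: the word
# "quoted" wrapped in curly quotes, as UTF-8 bytes.
my $raw = "\xE2\x80\x9Cquoted\xE2\x80\x9D";

# Decode incoming bytes into characters before validating or storing them.
my $value = decode_utf8($raw);

# Write through an encoding layer so characters are encoded on output...
my $file = "utf8-sketch-$$.txt";    # hypothetical scratch file
open(my $out, '>:encoding(UTF-8)', $file) or die "Cannot write $file: $!";
print {$out} $value;
close($out) or die "Cannot close $file: $!";

# ...and read back through the same layer so they are decoded on input.
open(my $in, '<:encoding(UTF-8)', $file) or die "Cannot read $file: $!";
my $round_trip = do { local $/; <$in> };
close($in);
unlink $file;

# STDOUT has no encoding layer here, so encode explicitly when printing.
print encode_utf8($round_trip), "\n";
print "round trip ok\n" if $round_trip eq $value;
```

The idea is to decode once at the edge where data comes in, work with character
strings inside the app, and encode once at the edge where data goes out.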

The biggest help for me was reading the perluniintro and perlunicode perldoc 
pages.

-- 
Michael Peters
Developer
Plus Three, LP

