Graham TerMarsch wrote:
> I've run into an issue on one of the projects that I'm working on and thought
> that I'd ping the list to see how others are handling this...

Lucky you, I just spent a few weeks fighting with this at $work and on Krang :)

> The app accepts form data from the user, runs it through Data::FormValidator
> to validate it, then stuffs it into our PostgreSQL database. We're expecting
> users are going to cut/paste from MS-Word and as a result we're going to have
> to deal with MS "smart quotes".
>
> My issue started with a DB error from DBD::Pg telling me that the input had an
> invalid byte sequence for UTF-8 (the tables in Pg are all encoded as UTF-8).
> Googling around brought me several possible solutions, but I can't say that
> I've found one yet that actually -works-.

The only thing that will really work is to go with one character set all the
way through. I'd recommend UTF-8, because if you do, you'll never have to
change when users want to do something that ISO-8859-1 or CP-1252 can't do.
UTF-8 can encode every Unicode character.

I will warn you that if you go down the UTF-8 route there's no magic switch to
press, because UTF-8 has multibyte characters. You have to make your
application aware of UTF-8 all the way through. You need to do all of the
following:

+ Tell the browser that the forms/pages are UTF-8 (using HTTP headers and
  <meta> tags).
+ When the form data comes in, decode_utf8() it. If you're using CGI.pm you'll
  need to use 3.30, which hasn't been released yet (you can find it on RT),
  because it has some UTF-8 fixes.
+ When doing DB pull/push you'll need to tell the database that the data is in
  UTF-8. In MySQL it's done with the 'mysql_enable_utf8' flag on the database
  handle.
+ If you're doing any file IO which may produce or read UTF-8 then you'll need
  to make sure that your open() calls use the IO layer syntax.

The biggest help for me was reading the perluniintro and perlunicode perldoc
pages.
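
To make the decode and file IO steps a bit more concrete, here is a minimal
sketch using only core modules (Encode and File::Temp). The sample bytes and
the variable names are made up for illustration, and the temp file just stands
in for whatever file your app really reads or writes. It is not taken from any
real application, so treat it as a starting point rather than the one right
way to do it.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);
use File::Temp qw(tempfile);

# Hypothetical form input: the bytes a browser would send for the word
# "hello" wrapped in smart quotes (U+201C and U+201D, three bytes each).
my $raw_bytes = "\xE2\x80\x9Chello\xE2\x80\x9D";

# Decode the incoming bytes into Perl characters before doing anything else.
my $text = decode_utf8($raw_bytes);

# Create a scratch file; we only need its path, so close the handle.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write through an explicit UTF-8 layer so characters are encoded on output.
open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot write $path: $!";
print {$out} $text;
close $out;

# Read back through the same layer so the bytes are decoded on input.
open(my $in, '<:encoding(UTF-8)', $path) or die "Cannot read $path: $!";
my $round_trip = do { local $/; <$in> };
close $in;

# Encode again when handing bytes to something outside Perl.
my $bytes_out = encode_utf8($round_trip);

printf "bytes in: %d, characters: %d\n", length($raw_bytes), length($text);
# prints "bytes in: 11, characters: 7"
```

The length difference is the whole point: before decoding, Perl sees 11 bytes;
after decoding it sees 7 characters, and regexes, length() and substr() behave
the way you'd expect. The database handle flag isn't shown here since it needs
a live connection to demonstrate.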

-- 
Michael Peters
Developer
Plus Three, LP
