On 10/18/07, Graham TerMarsch <[EMAIL PROTECTED]> wrote:
> I've run into an issue on one of the projects that I'm working on and thought
> that I'd ping the list to see how others are handling this...
>
> The app accepts form data from the user, runs it through Data::FormValidator
> to validate it, then stuffs it into our PostgreSQL database. We're expecting
> users are going to cut/paste from MS-Word and as a result we're going to have
> to deal with MS "smart quotes".
>
> My issue started with a DB error from DBD::Pg telling me that the input had an
> invalid byte sequence for UTF-8 (the tables in Pg are all encoded as UTF-8).
> Googling around brought me several possible solutions, but I can't say that
> I've found one yet that actually -works-.
>
> What I'd like the solution to do is:
> a) provide me a means of encoding/marking the data so that I can insert it
> into our Pg database without it throwing an error,

PostgreSQL will do the character conversion for you, as long as you
tell it what character set you are submitting the data in.

    SET CLIENT_ENCODING TO 'WIN1256';

If you want to set this globally for your system, you can set the
PGCLIENTENCODING environment variable. I set that in Apache, so all
my web apps by default use LATIN1 encoding.

    SetEnv PGCLIENTENCODING LATIN1

So if your database is set to store values in UTF-8, PostgreSQL will
convert all input from LATIN1 to UTF-8 before it stores it in the DB.
And when you retrieve results from the DB, it will convert the UTF-8
back to LATIN1 before it gives it back to you.

See http://www.postgresql.org/docs/7.4/interactive/multibyte.html#AEN18394

Cheers,

Cees
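
For anyone who wants to see the conversion at the byte level, here is a minimal Python sketch of what "decode with the declared client encoding, then store as UTF-8" amounts to. It is not PostgreSQL's code, only an illustration. The helper names (`to_utf8`, `is_valid_utf8`) and the sample bytes are made up for this example, and it assumes the pasted smart quotes arrive as Windows-1252 bytes (Python's `cp1252` codec).

```python
# Illustrative only: helper names and the cp1252 default are assumptions,
# not anything taken from PostgreSQL or DBD::Pg.

# Curly double quotes as a Windows-1252 client would send them (0x93 ... 0x94).
SMART_QUOTED = b"\x93quoted\x94"


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def to_utf8(raw: bytes, client_encoding: str = "cp1252") -> bytes:
    """Decode using the declared client encoding, then re-encode as UTF-8."""
    return raw.decode(client_encoding).encode("utf-8")


if __name__ == "__main__":
    # The raw bytes are not UTF-8, which is the kind of input the DB rejects.
    print(is_valid_utf8(SMART_QUOTED))
    # After conversion the same text is valid UTF-8.
    print(is_valid_utf8(to_utf8(SMART_QUOTED)))
    # Declaring latin-1 also yields valid UTF-8, but not the same characters.
    print(to_utf8(SMART_QUOTED, "latin-1"))
```

One thing the last line shows: Python's `latin-1` codec maps 0x93 and 0x94 to C1 control characters, not to quotation marks, so the declared encoding needs to match what the browser actually sent. It is worth checking whether the same holds for your PostgreSQL version before relying on LATIN1 for text pasted from Word.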
