On 10/18/07, Graham TerMarsch <[EMAIL PROTECTED]> wrote:
> I've run into an issue on one of the projects that I'm working on and thought
> that I'd ping the list to see how others are handling this...
>
> The app accepts form data from the user, runs it through Data::FormValidator
> to validate it, then stuffs it into our PostgreSQL database.  We're expecting
> users are going to cut/paste from MS-Word and as a result we're going to have
> to deal with MS "smart quotes".
>
> My issue started with a DB error from DBD::Pg telling me that the input had an
> invalid byte sequence for UTF-8 (the tables in Pg are all encoded as UTF-8).
> Googling around brought me several possible solutions, but I can't say that
> I've found one yet that actually -works-.
>
> What I'd like the solution to do is:
> a) provide me a means of encoding/marking the data so that I can insert it
> into our Pg database without it throwing an error,

PostgreSQL will do the character conversion for you, as long as you
tell it what character set you are submitting the data in.

SET CLIENT_ENCODING TO 'WIN1252';
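
To see why DBD::Pg rejected the input, and roughly what the conversion
amounts to, here is a minimal Python sketch (standard library only, no
database involved).  It assumes the pasted text arrives as Windows-1252
bytes (cp1252 in Python, WIN1252 in PostgreSQL's naming), which is typical
for text copied out of MS Word on Windows.  The sample string and the
to_utf8 helper are made up for illustration.

```python
# Sample form input as raw bytes: 0x93 and 0x94 are the Windows-1252
# "smart" double quotes. (Illustrative data, not from the original post.)
raw = b"He said \x93hello\x94"


def to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode bytes from source_encoding to UTF-8 (hypothetical helper)."""
    return data.decode(source_encoding).encode("utf-8")


# Passed along unchanged, the bytes are not valid UTF-8, which is the
# kind of input a UTF-8 database rejects.
try:
    raw.decode("utf-8")
    valid_as_utf8 = True
except UnicodeDecodeError:
    valid_as_utf8 = False

# Declaring the real source encoding makes the conversion well defined.
converted = to_utf8(raw)

print(valid_as_utf8)
print(converted)
```

Setting the client encoding asks the server to do the equivalent of
to_utf8 for you, so the application code does not have to.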

If you want to set this globally for your system, you can set the
PGCLIENTENCODING environment variable.  I set that in Apache, so all
my web apps use LATIN1 encoding by default.

SetEnv PGCLIENTENCODING LATIN1

So if your database is set to store values in UTF-8, PostgreSQL will
convert all input from latin1 to UTF-8 before it stores it in the DB.
And when you retrieve results from the DB, it will convert the UTF-8
back to latin1 before it gives it back to you.
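
The round trip can be sketched in Python as well, again without a
database.  One caveat worth checking against your own data: LATIN1
(ISO-8859-1) has no printable characters in the 0x80-0x9F range, so I
would expect smart-quote bytes declared as LATIN1 to convert without an
error but end up as control characters rather than curly quotes.  The
sample strings below are invented for illustration.

```python
# A client sends LATIN1 bytes; a UTF-8 database stores them converted
# and converts them back when the client reads them.
client_bytes = "caf\u00e9".encode("latin-1")              # b'caf\xe9'
stored = client_bytes.decode("latin-1").encode("utf-8")   # UTF-8 form in the DB
returned = stored.decode("utf-8").encode("latin-1")       # what the client reads

# The same smart-quote byte interpreted under two declared encodings.
as_latin1 = b"\x93".decode("latin-1")   # U+0093, a C1 control character
as_cp1252 = b"\x93".decode("cp1252")    # U+201C, left double quotation mark

print(stored, returned == client_bytes)
print(hex(ord(as_latin1)), hex(ord(as_cp1252)))
```

So the declared client encoding needs to match what the browser actually
submitted, not just be something the server accepts.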

See http://www.postgresql.org/docs/7.4/interactive/multibyte.html#AEN18394

Cheers,

Cees
