[GENERAL] 8.0, UTF8, and CLIENT_ENCODING

2007-05-17 Thread Paul Ramsey
I have a small database (PgSQL 8.0, database encoding UTF8) that folks are inserting into via a web form. The form itself is declared ISO-8859-1 and, prior to inserting any data, pg_client_encoding is set to LATIN1. Most of the high-bit characters are correctly translated from LATIN1 to

Re: [GENERAL] 8.0, UTF8, and CLIENT_ENCODING

2007-05-17 Thread Hannes Dorbath
Paul Ramsey wrote: I have a small database (PgSQL 8.0, database encoding UTF8) that folks are inserting into via a web form. The form itself is declared ISO-8859-1 and, prior to inserting any data, pg_client_encoding is set to LATIN1. Most of the high-bit characters are correctly

Re: [GENERAL] 8.0, UTF8, and CLIENT_ENCODING

2007-05-17 Thread PFC
I have a small database (PgSQL 8.0, database encoding UTF8) that folks are inserting into via a web form. The form itself is declared ISO-8859-1 and, prior to inserting any data, pg_client_encoding is set to LATIN1. Wouldn't it be simpler to have the browser submit the form

Re: [GENERAL] 8.0, UTF8, and CLIENT_ENCODING

2007-05-17 Thread Michael Glaesemann
On May 17, 2007, at 16:47 , PFC wrote: and put that in the form. Instead of being mapped to 2-byte UTF8 high-bit equivalents, they are going into the database directly as one-byte values > 127. That is, as illegal UTF8 values. Sometimes you also get HTML entities in the mix. Who
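
To make the "one-byte values" point concrete, here is a small Python sketch. The sample character is an assumption chosen for illustration and does not come from the thread. It shows that a high-bit LATIN1 character is one byte in LATIN1 but two bytes in UTF-8, and that the raw LATIN1 byte on its own is not valid UTF-8.

```python
# Illustrative only: the sample character is an assumption, not data from the thread.
ch = "é"  # U+00E9, a typical high-bit LATIN1 character

latin1_bytes = ch.encode("latin-1")  # one byte with the high bit set
utf8_bytes = ch.encode("utf-8")      # the two-byte UTF-8 equivalent


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The unconverted LATIN1 byte is rejected; the converted form is accepted.
print(latin1_bytes, is_valid_utf8(latin1_bytes))
print(utf8_bytes, is_valid_utf8(utf8_bytes))
```

A byte like this stored unconverted in a UTF8 database is the kind of illegal value described above.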

Re: [GENERAL] 8.0, UTF8, and CLIENT_ENCODING

2007-05-17 Thread Paul Ramsey
Thanks all for the information. Summary is:
- 8.0 wasn't very strict, and allowed the illegal values in, instead of mapping them over into UTF-8 space
- the values can be stripped with iconv -c
- 8.2 should be more strict
I'm in the midst of my upgrade to 8.2 now, hopefully the LATIN1-UTF8
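
For readers without iconv at hand, the stripping step from the summary can be approximated in Python. This is a rough sketch, not the exact procedure used in the thread: the function name `strip_invalid_utf8` and the sample bytes are hypothetical, and decoding with `errors="ignore"` is assumed to be an acceptable stand-in for what `iconv -c` does (omitting invalid characters from the output).

```python
def strip_invalid_utf8(data: bytes) -> str:
    """Drop any bytes that do not form valid UTF-8 sequences.

    Hypothetical helper; an approximation of `iconv -c` run from UTF-8 to UTF-8.
    """
    return data.decode("utf-8", errors="ignore")


# Hypothetical sample: one stray LATIN1 byte (0xE9) mixed with valid UTF-8 (0xC3 0xA9).
dirty = b"caf\xe9 ok \xc3\xa9"
clean = strip_invalid_utf8(dirty)
print(clean)
```

Note that stripping discards the offending characters rather than repairing them, so it suits cleaning a dump before reloading it into a stricter server, not recovering the original text.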