I have a small database (PgSQL 8.0, database encoding UTF8) that folks
are inserting into via a web form. The form itself is declared
ISO-8859-1 and, prior to inserting any data, pg_client_encoding is
set to LATIN1.
Most of the high-bit characters are correctly translated from LATIN1 to
Paul Ramsey wrote:
I have a small database (PgSQL 8.0, database encoding UTF8) that folks
are inserting into via a web form. The form itself is declared
ISO-8859-1 and, prior to inserting any data, pg_client_encoding is
set to LATIN1.
Most of the high-bit characters are correctly
Wouldn't it be simpler to have the browser submit the form
On May 17, 2007, at 16:47 , PFC wrote:
and put that in the form. Instead of being mapped to 2-byte UTF8
high-bit equivalents, they are going into the database directly as
one-byte values > 127. That is, as illegal UTF8 values.
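To make the difference concrete, here is a minimal Python sketch (not anything from the original setup; the helper name is_valid_utf8 is made up for illustration). It shows a LATIN1 high-bit character as a single byte, the two-byte UTF8 form it should have been mapped to, and why the raw single byte is not legal UTF8 on its own.

```python
# U+00E9 (e with acute accent) is one byte in LATIN1 / ISO-8859-1.
latin1_bytes = "\u00e9".encode("latin-1")

# Correct transcoding: decode as LATIN1, then re-encode as UTF8.
# The same character now occupies two bytes.
utf8_bytes = latin1_bytes.decode("latin-1").encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1_bytes.hex(), is_valid_utf8(latin1_bytes))
print(utf8_bytes.hex(), is_valid_utf8(utf8_bytes))
```

A check along these lines could be run over exported column data to find rows that contain untranslated one-byte values.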
Sometimes you also get HTML entities in the mix. Who
Thanks all for the information. Summary is:
- 8.0 wasn't very strict, and allowed the illegal values in, instead
of mapping them over into UTF-8 space
- the values can be stripped with iconv -c
- 8.2 should be more strict
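
For anyone who would rather do the stripping step in a script than with iconv, a rough Python sketch follows. It assumes the data is available as raw bytes (for example from a dump file), and the function name strip_invalid_utf8 is invented for this example. Like iconv -c, it drops the offending bytes, so the original characters are lost.

```python
def strip_invalid_utf8(data: bytes) -> bytes:
    """Drop byte sequences that are not valid UTF8, keep the rest.

    Roughly comparable in effect to `iconv -c` run from UTF-8 to UTF-8:
    invalid input is discarded rather than reported as an error.
    """
    # errors="ignore" makes the decoder skip bytes it cannot decode.
    return data.decode("utf-8", errors="ignore").encode("utf-8")


# A stray LATIN1 byte (0xE9) inside otherwise valid text is removed.
dirty = b"caf\xe9 ok"
clean = strip_invalid_utf8(dirty)

# Text that is already valid UTF8 passes through unchanged.
already_valid = b"caf\xc3\xa9 ok"
unchanged = strip_invalid_utf8(already_valid)
```
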
I'm in the midst of my upgrade to 8.2 now, hopefully the LATIN1-UTF8