Re: [GENERAL] 8.0, UTF8, and CLIENT_ENCODING

Paul Ramsey Thu, 17 May 2007 16:59:03 -0700

Thanks all for the information. Summary is:

- 8.0 wasn't very strict, and allowed the illegal values in, instead of mapping them over into UTF-8 space

- the values can be stripped with iconv -c
- 8.2 should be more strict

I'm in the midst of my upgrade to 8.2 now; hopefully the LATIN1->UTF8 conversion will now map the odd characters cleanly into UTF-8 space.
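
For anyone reading this in the archives, here is a minimal Python sketch of the three points above. It is not what PostgreSQL or iconv do internally; the sample bytes and the helper names (is_valid_utf8, strip_invalid, latin1_to_utf8) are made up for illustration, and errors="ignore" is only a rough stand-in for what iconv -c does.

```python
# Illustrative sample: "cafe" with an e-acute, encoded as LATIN1.
# The last byte, 0xE9, is a one-byte value > 127 and is not valid UTF-8 by itself.
raw = b"caf\xe9"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def strip_invalid(data: bytes) -> bytes:
    """Drop bytes that are not valid UTF-8 (roughly like iconv -c; data is lost)."""
    return data.decode("utf-8", errors="ignore").encode("utf-8")


def latin1_to_utf8(data: bytes) -> bytes:
    """Treat the input as LATIN1 and re-encode it as UTF-8 (nothing is lost)."""
    return data.decode("latin-1").encode("utf-8")


print(is_valid_utf8(raw))                   # False
print(strip_invalid(raw))                   # b'caf'
print(latin1_to_utf8(raw))                  # b'caf\xc3\xa9'
print(is_valid_utf8(latin1_to_utf8(raw)))   # True
```

Stripping throws the odd characters away, while the LATIN1->UTF8 conversion keeps them as two-byte sequences, which is why the conversion is preferable when the source encoding is actually known.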


On 17-May-07, at 3:25 PM, Michael Glaesemann wrote:

On May 17, 2007, at 16:47 , PFC wrote:
and put that in the form. Instead of being mapped to 2-byte UTF8 high-bit equivalents, they are going into the database directly as one-byte values > 127. That is, as illegal UTF8 values.
        Sometimes you also get HTML entities in the mix. Who knows.
All my web forms are UTF-8 back to back, it just works. Was I lucky? Normally postgres rejects illegal UTF8 values, you wouldn't be able to insert them...
8.0 and earlier weren't quite as strict as they should have been. See the note at the end of the migration instructions in the release notes for 8.1[1]. That may have been part of the issue here.
Michael Glaesemann
grzm seespotcode net
[1] http://www.postgresql.org/docs/8.2/interactive/release-8-1.html#AEN80196


