Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-10 Thread Martin Schäfer
Thanks, Andrew. I will test the next release. Martin -----Original Message----- From: Andrew Dunstan [mailto:and...@dunslane.net] Sent: 08 June 2013 16:43 To: Tom Lane Cc: Heikki Linnakangas; k...@rice.edu; Martin Schäfer; pgsql-hack...@postgresql.org Subject: Re: [HACKERS] UTF-8 encoding

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-10 Thread Heikki Linnakangas
On 04.06.2013 09:39, Martin Schäfer wrote: Can't really blame Windows on that. On Windows, we don't require that the encoding and LC_CTYPE's charset match. The OP used UTF-8 encoding in the server, but LC_CTYPE=English_United Kingdom.1252, i.e. LC_CTYPE implies WIN1252 encoding. We allow that and

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-08 Thread Andrew Dunstan
On 06/03/2013 02:41 PM, Andrew Dunstan wrote: On 06/03/2013 02:28 PM, Tom Lane wrote: . I wonder though if we couldn't just fix this code to not do anything to high-bit-set bytes in multibyte encodings. That's exactly what I suggested back in November. This thread seems to have gone

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-04 Thread Martin Schäfer
Can't really blame Windows on that. On Windows, we don't require that the encoding and LC_CTYPE's charset match. The OP used UTF-8 encoding in the server, but LC_CTYPE=English_United Kingdom.1252, i.e. LC_CTYPE implies WIN1252 encoding. We allow that and it generally works on Windows because

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread k...@rice.edu
On Mon, Jun 03, 2013 at 03:40:14PM +0100, Martin Schäfer wrote: I try to create database columns with umlauts, using the UTF8 client encoding. However, the server seems to mess up the column names. In particular, it seems to perform a lowercase operation on each byte of the UTF-8 multi-byte
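The symptom described above (a lowercase operation applied to each byte of a UTF-8 sequence) can be reproduced in isolation. The following is a minimal sketch, not the server's code: `latin1_tolower` is a hypothetical stand-in for `tolower()` under a single-byte Latin-1/WIN1252-style locale, and `bytewise_downcase` is a hypothetical name for the per-byte loop. It assumes the locale treats bytes 0xC0..0xDE (except 0xD7) as uppercase letters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for tolower() in a Latin-1/WIN1252-style locale:
 * ASCII A-Z and the accented capitals 0xC0..0xDE (except 0xD7, the
 * multiplication sign) map to the byte 0x20 higher. */
static unsigned char latin1_tolower(unsigned char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        return (unsigned char) (ch + 0x20);
    if (ch >= 0xC0 && ch <= 0xDE && ch != 0xD7)
        return (unsigned char) (ch + 0x20);
    return ch;
}

/* Lowercase a buffer one byte at a time, with no awareness of
 * multi-byte sequences. This is the behaviour that damages UTF-8. */
static void bytewise_downcase(unsigned char *s, size_t len)
{
    assert(s != NULL);
    for (size_t i = 0; i < len; i++)
        s[i] = latin1_tolower(s[i]);
}
```

Fed the UTF-8 bytes of "ä" (0xC3 0xA4), the loop treats the lead byte 0xC3 as the single-byte capital "Ã" and rewrites it to 0xE3, so the two bytes no longer encode "ä" even though the original name was already lowercase.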

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Martin Schäfer
-----Original Message----- From: k...@rice.edu [mailto:k...@rice.edu] Sent: 03 June 2013 16:48 To: Martin Schäfer Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] UTF-8 encoding problem w/ libpq On Mon, Jun 03, 2013 at 03:40:14PM +0100, Martin Schäfer wrote: I try to create

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread k...@rice.edu
On Mon, Jun 03, 2013 at 04:09:29PM +0100, Martin Schäfer wrote: If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is already in lowercase, so I think it should also work without quoting the column name. Am I
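The workaround mentioned above, double-quoting the column name so that it is not case-folded, can be sketched as a small helper. `quote_column_name` is a hypothetical function written for illustration; it is not part of libpq (which offers `PQescapeIdentifier` for the same purpose when a connection is available). It wraps the name in double quotes and doubles any embedded double quote, copying all other bytes, including UTF-8 sequences, unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write "name" wrapped in double quotes into out,
 * doubling any embedded double quote. Returns the number of bytes
 * written (excluding the terminating NUL), or 0 if out is too small. */
static size_t quote_column_name(const char *name, char *out, size_t outsize)
{
    assert(name != NULL && out != NULL);

    size_t n = 0;

    if (outsize < 3)            /* need at least "" plus NUL */
        return 0;
    out[n++] = '"';
    for (const char *p = name; *p != '\0'; p++) {
        size_t need = (*p == '"') ? 2 : 1;

        /* keep room for the closing quote and the NUL terminator */
        if (n + need + 2 > outsize)
            return 0;
        if (*p == '"')
            out[n++] = '"';
        out[n++] = *p;
    }
    out[n++] = '"';
    out[n] = '\0';
    return n;
}
```

The quoted result can then be spliced into the CREATE TABLE statement in place of the bare column name.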

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Heikki Linnakangas
On 03.06.2013 18:27, k...@rice.edu wrote: On Mon, Jun 03, 2013 at 04:09:29PM +0100, Martin Schäfer wrote: If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is already in lowercase, so I think it should also work

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Andrew Dunstan
On 06/03/2013 12:22 PM, Heikki Linnakangas wrote: On 03.06.2013 18:27, k...@rice.edu wrote: On Mon, Jun 03, 2013 at 04:09:29PM +0100, Martin Schäfer wrote: If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes: He *is* using UTF-8. Or trying to, anyway :-). The downcasing in the backend is supposed to leave bytes with the high-bit set alone, i.e. in UTF-8 encoding, it's supposed to leave ä and ß alone. Well, actually,

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Andrew Dunstan
On 06/03/2013 02:28 PM, Tom Lane wrote: . I wonder though if we couldn't just fix this code to not do anything to high-bit-set bytes in multibyte encodings. That's exactly what I suggested back in November. cheers andrew -- Sent via pgsql-hackers mailing list
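The suggestion quoted above, leaving high-bit-set bytes untouched in multibyte encodings, might look roughly like the sketch below. This is an illustration under stated assumptions, not the patch that was committed: `downcase_ident`, `HIGHBIT_SET`, and the `encoding_is_single_byte` flag are hypothetical names, and the caller is assumed to know whether the database encoding is single-byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical macro: true if the byte has its top bit set. */
#define HIGHBIT_SET(ch) (((unsigned char) (ch)) & 0x80)

/* Hypothetical sketch: fold an identifier to lower case in place.
 * ASCII letters are always folded. High-bit-set bytes are passed to
 * the locale's tolower() only when the encoding is single-byte; in a
 * multibyte encoding such as UTF-8 they are left exactly as they are. */
static void downcase_ident(char *ident, size_t len,
                           bool encoding_is_single_byte)
{
    assert(ident != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char) ident[i];

        if (ch >= 'A' && ch <= 'Z')
            ch = (unsigned char) (ch + ('a' - 'A'));
        else if (encoding_is_single_byte && HIGHBIT_SET(ch) && isupper(ch))
            ch = (unsigned char) tolower(ch);
        ident[i] = (char) ch;
    }
}
```

With the flag set to false, the bytes of a UTF-8 sequence pass through unchanged regardless of what LC_CTYPE says about them, while plain ASCII letters are still folded. The trade-off is that non-ASCII capitals in a multibyte encoding are no longer folded at all.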

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Heikki Linnakangas
On 03.06.2013 21:28, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: He *is* using UTF-8. Or trying to, anyway :-). The downcasing in the backend is supposed to leave bytes with the high-bit set alone, i.e. in UTF-8 encoding, it's supposed to leave ä and ß alone. Well,