Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Peter Eisentraut
On Thursday, 24 August 2006 00:52, Tom Lane wrote:
> A possible solution therefore is to have psql or libpq drive the
> client_encoding off the client's locale environment instead of letting
> it default to equal the server_encoding.

I got started on this and just wanted to post an intermediate patch.  I have 
taken the logic from initdb and placed it into libpq and refined the API a 
bit.  At this point, there should be no behavioral change.  It remains to
make libpq use this stuff if PGCLIENTENCODING is not set.  Unless someone
beats me to it, I'll figure that out later.
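
For illustration, the locale-driven default described above might look
roughly like the following Python sketch. It is not the attached patch: the
lookup table is a small illustrative subset, and pg_encoding_for_codeset and
client_encoding_from_locale are hypothetical names. It relies on
nl_langinfo(CODESET), which is only available on Unix-like systems.

```python
import locale

# Illustrative subset only (an assumption, not the table from the patch):
# OS codeset name -> PostgreSQL encoding name.
CODESET_TO_PG_ENCODING = {
    "UTF-8": "UTF8",
    "ISO-8859-1": "LATIN1",
    "ISO-8859-15": "LATIN9",
    "EUC-JP": "EUC_JP",
}


def pg_encoding_for_codeset(codeset):
    """Map an OS codeset name to a PostgreSQL encoding name, or None."""
    return CODESET_TO_PG_ENCODING.get(codeset.upper())


def client_encoding_from_locale():
    """Guess a client_encoding from the environment's LC_CTYPE setting.

    Returns None when no guess is possible, in which case the caller
    would keep the existing behavior (default to the server_encoding).
    """
    if not hasattr(locale, "nl_langinfo"):
        return None  # no nl_langinfo() on this platform
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, "")  # adopt the user's locale
        codeset = locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None  # the environment names a locale the system lacks
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # leave global state alone
    return pg_encoding_for_codeset(codeset)
```

Restoring the previous LC_CTYPE matters in a library, because setlocale()
changes process-wide state that the calling application may depend on.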

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


codeset-refactor.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Martijn van Oosterhout
On Fri, Aug 25, 2006 at 05:07:03PM +0200, Peter Eisentraut wrote:
> I got started on this and just wanted to post an intermediate patch.  I have
> taken the logic from initdb and placed it into libpq and refined the API a
> bit.  At this point, there should be no behavioral change.  It remains to
> make libpq use this stuff if PGCLIENTENCODING is not set.  Unless someone
> beats me to it, I'll figure that out later.

Umm, why export all these functions? For starters, does this even need
to be in libpq? I wouldn't have thought so the first time round,
especially not three functions. The only thing you need is to take a
locale name and return the charset you can pass to PQsetClientEncoding.

In fact, the only thing you need is PQsetClientEncodingFromLocale(),
anything else is just sugar. Why would the user care about what the OS
calls it? We have a pg_enc enum, so let's use it.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.



Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Peter Eisentraut
On Friday, 25 August 2006 17:30, Martijn van Oosterhout wrote:
> Umm, why export all these functions? For starters, does this even need
> to be in libpq?

Where else would you put it?

> In fact, the only thing you need is PQsetClientEncodingFromLocale(),
> anything else is just sugar. Why would the user care about what the OS
> calls it? We have a pg_enc enum, so let's use it.

initdb has different requirements.  Let me know if you have a different way to 
refactor it that satisfies initdb.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Martijn van Oosterhout
On Fri, Aug 25, 2006 at 05:38:20PM +0200, Peter Eisentraut wrote:
> > In fact, the only thing you need is PQsetClientEncodingFromLocale(),
> > anything else is just sugar. Why would the user care about what the OS
> > calls it? We have a pg_enc enum, so let's use it.
>
> initdb has different requirements.  Let me know if you have a different way
> to refactor it that satisfies initdb.

Well, check_encodings_match(pg_enc, ctype) is simply a short way of
saying: if (find_matching_encoding(ctype) != pg_enc) { error }.
And get_encoding_from_locale() is not used outside of those functions.

So the only thing initdb actually needs is an implementation of
find_matching_encoding(ctype), which returns a value of enum pg_enc.
check_encodings_match() stays in initdb, and get_encoding_from_locale()
becomes internal to libpq.
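
As a rough Python sketch of that split (an illustration, not the initdb
source): PgEnc stands in for a few members of enum pg_enc, the codeset table
is an assumed subset, and the internal helper reads the codeset out of the
locale name itself rather than asking the operating system.

```python
from enum import Enum
from typing import Optional


class PgEnc(Enum):
    """Stand-in for a few members of the C enum pg_enc."""
    UTF8 = "UTF8"
    LATIN1 = "LATIN1"
    EUC_JP = "EUC_JP"


# Illustrative subset (assumption): codeset name -> encoding.
_CODESET_TO_ENC = {
    "UTF-8": PgEnc.UTF8,
    "ISO-8859-1": PgEnc.LATIN1,
    "EUC-JP": PgEnc.EUC_JP,
}


def _get_encoding_from_locale(ctype: str) -> Optional[str]:
    """Internal helper: return the codeset part of a locale name, if any.

    Simplification: parses "en_US.UTF-8" -> "UTF-8" instead of querying
    the OS, and drops a trailing modifier such as "@euro".
    """
    _, dot, rest = ctype.partition(".")
    if not dot:
        return None  # e.g. "C" or "POSIX": no codeset in the name
    return rest.split("@", 1)[0].upper() or None


def find_matching_encoding(ctype: str) -> Optional[PgEnc]:
    """The one function initdb would need: locale name -> encoding."""
    return _CODESET_TO_ENC.get(_get_encoding_from_locale(ctype))


def check_encodings_match(pg_enc: PgEnc, ctype: str) -> bool:
    """Stays in initdb: a thin comparison on top of the lookup."""
    return find_matching_encoding(ctype) == pg_enc
```

With this shape only find_matching_encoding() crosses the module boundary;
the string-level codeset name never reaches the caller.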

How does that sound?

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.



Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Tom Lane
Peter Eisentraut writes:
> On Friday, 25 August 2006 17:30, Martijn van Oosterhout wrote:
>> Umm, why export all these functions? For starters, does this even need
>> to be in libpq?
>
> Where else would you put it?
> ...
> initdb has different requirements.  Let me know if you have a different way
> to refactor it that satisfies initdb.

Um, but initdb doesn't use libpq, so it's going to need its own copy
anyway.  I agree with Martijn that putting these into libpq's API
seems like useless clutter.

regards, tom lane



Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Peter Eisentraut
Tom Lane wrote:
> Um, but initdb doesn't use libpq, so it's going to need its own copy
> anyway.

initdb certainly links against libpq.

> I agree with Martijn that putting these into libpq's API
> seems like useless clutter.

Where else to put it?  We need it in libpq anyway if we want this 
behavior in all client applications (by default).

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Tom Lane
Peter Eisentraut writes:
> Tom Lane wrote:
>> I agree with Martijn that putting these into libpq's API
>> seems like useless clutter.
>
> Where else to put it?  We need it in libpq anyway if we want this
> behavior in all client applications (by default).

Having the code in libpq doesn't necessarily mean exposing it to the
outside world.  I can't see a reason for these to be in the API at all.

Possibly we could avoid the duplication-of-source-code issue by putting
the code in libpgport, or someplace, whence both initdb and libpq could
get at it?

regards, tom lane


Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Martijn van Oosterhout
On Fri, Aug 25, 2006 at 08:13:39PM +0200, Peter Eisentraut wrote:
> > I agree with Martijn that putting these into libpq's API
> > seems like useless clutter.
>
> Where else to put it?  We need it in libpq anyway if we want this
> behavior in all client applications (by default).

Is that so? I thought we were only talking about psql. Even then, I'm
wondering if we should alter the current behaviour at all if stdout is
not a tty (i.e. run as a pipe).
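
A minimal Python sketch of that tty check, for illustration only:
should_follow_locale is a hypothetical name, and the policy it encodes
(follow the locale only for interactive output) is the suggestion above,
not agreed behavior.

```python
import sys


def should_follow_locale(stream=None):
    """Return True only when output goes to a terminal.

    When the stream is a pipe or a file, the caller would leave the
    client encoding at its current default instead of using the locale.
    """
    stream = sys.stdout if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())
```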

And as a counter-example: pg_dump should absolutely not use the client
locale, it should always dump in the same encoding as the server...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.



Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Tom Lane
Martijn van Oosterhout <kleptog@svana.org> writes:
> And as a counter-example: pg_dump should absolutely not use the client
> locale, it should always dump in the same encoding as the server...

Sure, but pg_dump should set that explicitly.  I'm prepared to believe
that looking at the locale is sane for all normal clients.

It might be worth providing a way to set the client_encoding through a
PQconnectdb connection-string keyword, just in case the override-via-
PGCLIENTENCODING dodge doesn't suit someone.  The priority order
would presumably be connection string, then PGCLIENTENCODING, then
locale.
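
The proposed priority order could be sketched in Python like this.
resolve_client_encoding is a hypothetical helper, and the keyword name
"client_encoding" is an assumption, since no keyword is named above.

```python
def resolve_client_encoding(conn_params, environ, locale_encoding):
    """Pick the client encoding by priority; None means "no preference".

    conn_params:     dict of parsed connection-string keywords
    environ:         mapping of environment variables
    locale_encoding: encoding guessed from the locale, or None
    """
    candidates = (
        conn_params.get("client_encoding"),  # 1. connection string
        environ.get("PGCLIENTENCODING"),     # 2. environment variable
        locale_encoding,                     # 3. guess from the locale
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None  # nothing set: keep defaulting to the server_encoding
```

Under this ordering pg_dump could pin its encoding through the connection
string and would never see the locale-based guess.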

regards, tom lane



Re: [HACKERS] [GENERAL] invalid byte sequence ?

2006-08-25 Thread Alvaro Herrera
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > And as a counter-example: pg_dump should absolutely not use the client
> > locale, it should always dump in the same encoding as the server...
>
> Sure, but pg_dump should set that explicitly.  I'm prepared to believe
> that looking at the locale is sane for all normal clients.

What are normal clients?  I would think that programs in PHP or Perl
have their own idea of the correct encoding (JDBC already has one).

> It might be worth providing a way to set the client_encoding through a
> PQconnectdb connection-string keyword, just in case the
> override-via-PGCLIENTENCODING dodge doesn't suit someone.  The priority
> order would presumably be connection string, then PGCLIENTENCODING, then
> locale.

This sounds like a good idea anyway...

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
