Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-22 Thread Hiroshi Inoue

Magnus Hagander wrote:

Peter Eisentraut wrote:

Magnus Hagander wrote:

However, one question: The comment currently says it's harmless to do
this on non-windows platforms. Does this still hold true?

Yes, the non-WIN32 code path appears to be the same, still.  But the
ifdef WIN32 part we don't want, because that presumes something about
the spelling of encoding names in the local iconv library.


If we do keep the thing win32 only, I think we should just wrap the
whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
Unix - that'll make for cleaner code.

Yes, that would be much better.


Something like this then?


It seems OK to me.

regards,
Hiroshi Inoue



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-22 Thread Magnus Hagander
Hiroshi Inoue wrote:
 Magnus Hagander wrote:
 Peter Eisentraut wrote:
 Magnus Hagander wrote:
 However, one question: The comment currently says it's harmless to do
 this on non-windows platforms. Does this still hold true?
 Yes, the non-WIN32 code path appears to be the same, still.  But the
 ifdef WIN32 part we don't want, because that presumes something about
 the spelling of encoding names in the local iconv library.

 If we do keep the thing win32 only, I think we should just wrap the
 whole thing in #ifdef WIN32 and no longer do the codeset stuff at
 all on
 Unix - that'll make for cleaner code.
 Yes, that would be much better.

 Something like this then?
 
 It seems OK to me.

Applied.

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-21 Thread Magnus Hagander
Hiroshi Inoue wrote:
 Hiroshi Inoue wrote:
 Magnus Hagander wrote:
 Hiroshi Inoue wrote:
 Hiroshi Inoue wrote:
 Bruce Momjian wrote:
 Hiroshi, is this patch still needed?
 Yes though it should be slightly changed now.

 In what way should it be changed?

 One is already committed by you.
  [COMMITTERS] pgsql: Use the new text domain names

 Another is to bind the codeset EUC-JP for
  PG_EUC_JIS_2004 server encoding.
 
 The attached is an updated patch.

Thanks.

Looking at it, the comment clearly needs updating - I'll do that.

However, one question: The comment currently says it's harmless to do
this on non-windows platforms. Does this still hold true? In that case,
this whole thing shouldn't be #ifdef:ed to WIN32 and can be simplified.
Or does the middle part of the comment come into play, in that the
codeset names can be different on different platforms?

Peter, can you comment on that?

If we do keep the thing win32 only, I think we should just wrap the
whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
Unix - that'll make for cleaner code.

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-21 Thread Peter Eisentraut

Magnus Hagander wrote:

However, one question: The comment currently says it's harmless to do
this on non-windows platforms. Does this still hold true?


Yes, the non-WIN32 code path appears to be the same, still.  But the 
ifdef WIN32 part we don't want, because that presumes something about 
the spelling of encoding names in the local iconv library.



If we do keep the thing win32 only, I think we should just wrap the
whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
Unix - that'll make for cleaner code.


Yes, that would be much better.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-21 Thread Magnus Hagander
Peter Eisentraut wrote:
 Magnus Hagander wrote:
 However, one question: The comment currently says it's harmless to do
 this on non-windows platforms. Does this still hold true?
 
 Yes, the non-WIN32 code path appears to be the same, still.  But the
 ifdef WIN32 part we don't want, because that presumes something about
 the spelling of encoding names in the local iconv library.
 
 If we do keep the thing win32 only, I think we should just wrap the
 whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
 Unix - that'll make for cleaner code.
 
 Yes, that would be much better.

Something like this then?

//Magnus

*** a/src/backend/utils/mb/mbutils.c
--- b/src/backend/utils/mb/mbutils.c
***
*** 849,854  cliplen(const char *str, int len, int limit)
--- 849,894 
  	return l;
  }
  
+ #if defined(ENABLE_NLS)  defined(WIN32)
+ static const struct codeset_map {
+ 	int	encoding;
+ 	const char *codeset;
+ } codeset_map_array[] = {
+ 	{PG_UTF8, UTF-8},
+ 	{PG_LATIN1, LATIN1},
+ 	{PG_LATIN2, LATIN2},
+ 	{PG_LATIN3, LATIN3},
+ 	{PG_LATIN4, LATIN4},
+ 	{PG_ISO_8859_5, ISO-8859-5},
+ 	{PG_ISO_8859_6, ISO_8859-6},
+ 	{PG_ISO_8859_7, ISO-8859-7},
+ 	{PG_ISO_8859_8, ISO-8859-8},
+ 	{PG_LATIN5, LATIN5},
+ 	{PG_LATIN6, LATIN6},
+ 	{PG_LATIN7, LATIN7},
+ 	{PG_LATIN8, LATIN8},
+ 	{PG_LATIN9, LATIN-9},
+ 	{PG_LATIN10, LATIN10},
+ 	{PG_KOI8R, KOI8-R},
+ 	{PG_WIN1250, CP1250},
+ 	{PG_WIN1251, CP1251},
+ 	{PG_WIN1252, CP1252},
+ 	{PG_WIN1253, CP1253},
+ 	{PG_WIN1254, CP1254},
+ 	{PG_WIN1255, CP1255},
+ 	{PG_WIN1256, CP1256},
+ 	{PG_WIN1257, CP1257},
+ 	{PG_WIN1258, CP1258},
+ 	{PG_WIN866, CP866},
+ 	{PG_WIN874, CP874},
+ 	{PG_EUC_CN, EUC-CN},
+ 	{PG_EUC_JP, EUC-JP},
+ 	{PG_EUC_KR, EUC-KR},
+ 	{PG_EUC_TW, EUC-TW},
+ 	{PG_EUC_JIS_2004, EUC-JP}
+ };
+ #endif /* WIN32 */
+ 
  void
  SetDatabaseEncoding(int encoding)
  {
***
*** 859,880  SetDatabaseEncoding(int encoding)
  	Assert(DatabaseEncoding-encoding == encoding);
  
  	/*
! 	 * On Windows, we allow UTF-8 database encoding to be used with any
! 	 * locale setting, because UTF-8 requires special handling anyway.
! 	 * But this means that gettext() might be misled about what output
! 	 * encoding it should use, so we have to tell it explicitly.
! 	 *
! 	 * In future we might want to call bind_textdomain_codeset
! 	 * unconditionally, but that requires knowing how to spell the codeset
! 	 * name properly for all encodings on all platforms, which might be
! 	 * problematic.
! 	 *
! 	 * This is presently unnecessary, but harmless, on non-Windows platforms.
  	 */
! #ifdef ENABLE_NLS
! 	if (encoding == PG_UTF8)
! 		if (bind_textdomain_codeset(textdomain(NULL), UTF-8) == NULL)
! 			elog(LOG, bind_textdomain_codeset failed);
  #endif
  }
  
--- 899,921 
  	Assert(DatabaseEncoding-encoding == encoding);
  
  	/*
! 	 * On Windows, we need to explicitly bind gettext to the correct
! 	 * encoding, because gettext() tends to get confused.
  	 */
! #if defined(ENABLE_NLS)  defined(WIN32)
! 	{
! 		int	i;
! 
! 		for (i = 0; i  sizeof(codeset_map_array) / sizeof(codeset_map_array[0]); i++)
! 		{
! 			if (codeset_map_array[i].encoding == encoding)
! 			{
! if (bind_textdomain_codeset(textdomain(NULL), codeset_map_array[i].codeset) == NULL)
! 	elog(LOG, bind_textdomain_codeset failed);
! break;
! 			}
! 		}
! 	}
  #endif
  }
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-21 Thread Peter Eisentraut

Magnus Hagander wrote:

Peter Eisentraut wrote:

Magnus Hagander wrote:

However, one question: The comment currently says it's harmless to do
this on non-windows platforms. Does this still hold true?

Yes, the non-WIN32 code path appears to be the same, still.  But the
ifdef WIN32 part we don't want, because that presumes something about
the spelling of encoding names in the local iconv library.


If we do keep the thing win32 only, I think we should just wrap the
whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
Unix - that'll make for cleaner code.

Yes, that would be much better.


Something like this then?


Looks OK to me.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-20 Thread Hiroshi Inoue

Hiroshi Inoue wrote:

Magnus Hagander wrote:

Hiroshi Inoue wrote:

Hiroshi Inoue wrote:

Bruce Momjian wrote:

Hiroshi, is this patch still needed?

Yes though it should be slightly changed now.


In what way should it be changed?


One is already committed by you.
 [COMMITTERS] pgsql: Use the new text domain names

Another is to bind the codeset EUC-JP for
 PG_EUC_JIS_2004 server encoding.


The attached is an updated patch.

regards,
Hiroshi Inoue

Index: mbutils.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/mb/mbutils.c,v
retrieving revision 1.77
diff -c -c -r1.77 mbutils.c
*** mbutils.c   19 Jan 2009 15:34:23 -  1.77
--- mbutils.c   20 Jan 2009 12:54:33 -
***
*** 837,842 
--- 837,881 
return clen;
  }
  
+ #ifdefWIN32
+ static const struct codeset_map {
+   int encoding;
+   const char *codeset;
+ } codeset_map_array[] = {
+   {PG_UTF8, UTF-8},
+   {PG_LATIN1, LATIN1}, 
+   {PG_LATIN2, LATIN2},
+   {PG_LATIN3, LATIN3},
+   {PG_LATIN4, LATIN4},
+   {PG_ISO_8859_5, ISO-8859-5},
+   {PG_ISO_8859_6, ISO_8859-6},
+   {PG_ISO_8859_7, ISO-8859-7},
+   {PG_ISO_8859_8, ISO-8859-8},
+   {PG_LATIN5, LATIN5},
+   {PG_LATIN6, LATIN6},
+   {PG_LATIN7, LATIN7},
+   {PG_LATIN8, LATIN8},
+   {PG_LATIN9, LATIN-9},
+   {PG_LATIN10, LATIN10},
+   {PG_KOI8R, KOI8-R},
+   {PG_WIN1250, CP1250},
+   {PG_WIN1251, CP1251},
+   {PG_WIN1252, CP1252},
+   {PG_WIN1253, CP1253},
+   {PG_WIN1254, CP1254},
+   {PG_WIN1255, CP1255},
+   {PG_WIN1256, CP1256},
+   {PG_WIN1257, CP1257},
+   {PG_WIN1258, CP1258},
+   {PG_WIN866, CP866},
+   {PG_WIN874, CP874},
+   {PG_EUC_CN, EUC-CN},
+   {PG_EUC_JP, EUC-JP},
+   {PG_EUC_KR, EUC-KR},
+   {PG_EUC_TW, EUC-TW},
+   {PG_EUC_JIS_2004, EUC-JP}};
+ #endif /* WIN32 */
+ 
  /* mbcliplen for any single-byte encoding */
  static int
  cliplen(const char *str, int len, int limit)
***
*** 852,857 
--- 891,898 
  void
  SetDatabaseEncoding(int encoding)
  {
+   const char *target_codeset = NULL;
+ 
if (!PG_VALID_BE_ENCODING(encoding))
elog(ERROR, invalid database encoding: %d, encoding);
  
***
*** 873,879 
 */
  #ifdef ENABLE_NLS
if (encoding == PG_UTF8)
!   if (bind_textdomain_codeset(textdomain(NULL), UTF-8) == NULL)
elog(LOG, bind_textdomain_codeset failed);
  #endif
  }
--- 914,935 
 */
  #ifdef ENABLE_NLS
if (encoding == PG_UTF8)
!   target_codeset = UTF-8;
! #ifdefWIN32
!   else
!   {
!   int i;
! 
!   for (i = 0; i  sizeof(codeset_map_array) / sizeof(struct 
codeset_map); i++)
!   if (codeset_map_array[i].encoding == encoding)
!   {
!   target_codeset = codeset_map_array[i].codeset;
!   break;
!   }
!   }
! #endif /* WIN32 */
!   if (target_codeset != NULL)
!   if (bind_textdomain_codeset(textdomain(NULL), target_codeset) 
== NULL)
elog(LOG, bind_textdomain_codeset failed);
  #endif
  }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-19 Thread Magnus Hagander
Hiroshi Inoue wrote:
 Hiroshi Inoue wrote:
 Bruce Momjian wrote:
 Hiroshi, is this patch still needed?

 Yes though it should be slightly changed now.

In what way should it be changed?

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-19 Thread Hiroshi Inoue

Magnus Hagander wrote:

Hiroshi Inoue wrote:

Hiroshi Inoue wrote:

Bruce Momjian wrote:

Hiroshi, is this patch still needed?

Yes though it should be slightly changed now.


In what way should it be changed?


One is already committed by you.
 [COMMITTERS] pgsql: Use the new text domain names

Another is to bind the codeset EUC-JP for
 PG_EUC_JIS_2004 server encoding.
Though EUC_JP and EUC_JIS_2004 aren't completely
 compatible, it seems OK in most cases.

regards,
Hiroshi Inoue


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-11 Thread Hiroshi Inoue

Hiroshi Inoue wrote:

Bruce Momjian wrote:

Hiroshi, is this patch still needed?


Yes though it should be slightly changed now.
*set lc_messages does not work* issue isn't directly related to this
 issue.


Though I'm not sure how we can test it, I can provide test results
like the attached one.

The attached is a test result in case LC_MESSAGES=fr causing
the following 3 errors in databases with various encodings.

  select a; == column a does not exist
  1; == syntax error at or near 1
  select * from a; == relation a does not exist

Comments?
Please note the encoding of the attached file is utf-8.

regards,
Hiroshi Inoue
SET
 current_database 
--
 utf8
(1 row)

 server_encoding 
-
 UTF8
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne « a » n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou près de « 1 »
LINE 1: 1;
^
ERREUR:  la relation « a » n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 eucjp
(1 row)

 server_encoding 
-
 EUC_JP
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne  a  n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou près de  1 
LINE 1: 1;
^
ERREUR:  la relation  a  n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 euc-jis-2004
(1 row)

 server_encoding 
-
 EUC_JIS_2004
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne  a  n'existe pas
LINE 1: select a;
   ^
ERROR:  character 0x8fabb2 of encoding EUC_JIS_2004 has no equivalent in 
UTF8
ERREUR:  la relation  a  n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 euccn
(1 row)

 server_encoding 
-
 EUC_CN
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne  a  n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou près de  1 
LINE 1: 1;
^
ERREUR:  la relation  a  n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 euctw
(1 row)

 server_encoding 
-
 EUC_TW
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne  a  n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou pr`es de  1 
LINE 1: 1;
^
ERREUR:  la relation  a  n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 euckr
(1 row)

 server_encoding 
-
 EUC_KR
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne  a  n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou pr`es de  1 
LINE 1: 1;
^
ERREUR:  la relation  a  n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 latin1
(1 row)

 server_encoding 
-
 LATIN1
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne « a » n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou près de « 1 »
LINE 1: 1;
^
ERREUR:  la relation « a » n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 latin2
(1 row)

 server_encoding 
-
 LATIN2
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne  a  n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou pr`es de  1 
LINE 1: 1;
^
ERREUR:  la relation  a  n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 latin3
(1 row)

 server_encoding 
-
 LATIN3
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne  a  n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou près de  1 
LINE 1: 1;
^
ERREUR:  la relation  a  n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 latin4
(1 row)

 server_encoding 
-
 LATIN4
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne  a  n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou pr`es de  1 
LINE 1: 1;
^
ERREUR:  la relation  a  n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 latin5
(1 row)

 server_encoding 
-
 LATIN5
(1 row)

SET
 lc_messages 
-
 fr
(1 row)

ERREUR:  la colonne « a » n'existe pas
LINE 1: select a;
   ^
ERREUR:  erreur de syntaxe sur ou près de « 1 »
LINE 1: 1;
^
ERREUR:  la relation « a » n'existe pas
LINE 1: select * from a;
  ^
SET
 current_database 
--
 latin6
(1 row)

 server_encoding 

Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-10 Thread Hiroshi Saito

Hi.

My swift attack test was the MinGW environment.
But, Inoue-san suggestion.

1. MinGW+gcc build
HIROSHI=# set LC_TIME=Ja;
SET
HIROSHI=# select to_char(now(),'TMDay');
to_char
-
日曜日
(1 行)
HIROSHI=# set LC_TIME='Japan';
SET
HIROSHI=# select to_char(Now(),'TMDay');
to_char
-
日曜日
(1 行)
HIROSHI=# set LC_TIME='Japanese';
SET
HIROSHI=# select to_char(Now(),'TMDay');
to_char
-
日曜日
(1 行)

However,  A setup of 'Ja' was strange.?_?
http://msdn.microsoft.com/en-us/library/aa246450(VS.60).aspx

2. MSVC build
HIROSHI=# set LC_TIME='Ja';
ERROR:  invalid value for parameter lc_time: Ja
STATEMENT:  set LC_TIME='Ja';
ERROR:  invalid value for parameter lc_time: Ja
HIROSHI=# set LC_TIME='Japan';
ERROR:  invalid value for parameter lc_time: Japan
STATEMENT:  set LC_TIME='Japan';
ERROR:  invalid value for parameter lc_time: Japan
HIROSHI=# set LC_TIME=Japanese;
SET
HIROSHI=# select to_char(Now(),'TMDay');
to_char
-
日曜日
(1 行)

Umm, Re-investigation is required for this. :-(
However, If reasonable clear, it will be good for a document at suggestion.

Regards,
Hiroshi Saito 



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-09 Thread Bruce Momjian

Hiroshi, is this patch still needed?

---

Hiroshi Inoue wrote:
 Magnus Hagander wrote:
  On 25 nov 2008, at 05.00, Tom Lane t...@sss.pgh.pa.us wrote:
  
  Hiroshi Inoue in...@tpf.co.jp writes:
  Tom Lane wrote:
  If that's true then this code is presently broken for *every* locale
  under Windows, not only Japanese.
 
  Maybe there are a few languages/countires where 2 encodings are
  widely used.
 
  UTF8 vs Latin-N?
  
  We already special-cases utf8...
  
  I think the thing us that as long as the encodings are compatible 
  (latin1 with different names for example) it worked  fine.
  
   In any case I think the problem is that gettext is
  looking at a setting that is not what we are looking at.  Particularly
  with the 8.4 changes to allow per-database locale settings, this has
  got to be fixed in a bulletproof way.
 
 Attached is a new patch to apply bind_textdomain_codeset() to most
 server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
 and PG_EUC_JIS_2004. EUC-JP may be OK for EUC_JIS_2004.
 
 Unfortunately it's hard for Saito-san and me to check encodings
 other than EUC-JP.
 
 regards,
 Hiroshi Inoue

 *** mbutils.c.origSun Nov 23 08:42:57 2008
 --- mbutils.c Wed Nov 26 12:17:12 2008
 ***
 *** 822,830 
 --- 822,870 
   return clen;
   }
   
 + #ifdef  WIN32
 + static const struct codeset_map {
 + int encoding;
 + const char *codeset;
 + } codeset_map_array[] = {
 + {PG_UTF8, UTF-8},
 + {PG_LATIN1, LATIN1}, 
 + {PG_LATIN2, LATIN2},
 + {PG_LATIN3, LATIN3},
 + {PG_LATIN4, LATIN4},
 + {PG_ISO_8859_5, ISO-8859-5},
 + {PG_ISO_8859_6, ISO_8859-6},
 + {PG_ISO_8859_7, ISO-8859-7},
 + {PG_ISO_8859_8, ISO-8859-8},
 + {PG_LATIN5, LATIN5},
 + {PG_LATIN6, LATIN6},
 + {PG_LATIN7, LATIN7},
 + {PG_LATIN8, LATIN8},
 + {PG_LATIN9, LATIN-9},
 + {PG_LATIN10, LATIN10},
 + {PG_KOI8R, KOI8-R},
 + {PG_WIN1250, CP1250},
 + {PG_WIN1251, CP1251},
 + {PG_WIN1252, CP1252},
 + {PG_WIN1253, CP1253},
 + {PG_WIN1254, CP1254},
 + {PG_WIN1255, CP1255},
 + {PG_WIN1256, CP1256},
 + {PG_WIN1257, CP1257},
 + {PG_WIN1258, CP1258},
 + {PG_WIN866, CP866},
 + {PG_WIN874, CP874},
 + {PG_EUC_CN, EUC-CN},
 + {PG_EUC_JP, EUC-JP},
 + {PG_EUC_KR, EUC-KR},
 + {PG_EUC_TW, EUC-TW}};
 + #endif /* WIN32 */
 + 
   void
   SetDatabaseEncoding(int encoding)
   {
 + const char *target_codeset = NULL;
 + 
   if (!PG_VALID_BE_ENCODING(encoding))
   elog(ERROR, invalid database encoding: %d, encoding);
   
 ***
 *** 846,852 
*/
   #ifdef ENABLE_NLS
   if (encoding == PG_UTF8)
 ! if (bind_textdomain_codeset(postgres, UTF-8) == NULL)
   elog(LOG, bind_textdomain_codeset failed);
   #endif
   }
 --- 886,907 
*/
   #ifdef ENABLE_NLS
   if (encoding == PG_UTF8)
 ! target_codeset = UTF-8;
 ! #ifdef  WIN32
 ! else
 ! {
 ! int i;
 ! 
 ! for (i = 0; i  sizeof(codeset_map_array) / sizeof(struct 
 codeset_map); i++)
 ! if (codeset_map_array[i].encoding == encoding)
 ! {
 ! target_codeset = codeset_map_array[i].codeset;
 ! break;
 ! }
 ! }
 ! #endif /* WIN32 */
 ! if (target_codeset != NULL)
 ! if (bind_textdomain_codeset(postgres, target_codeset) == NULL)
   elog(LOG, bind_textdomain_codeset failed);
   #endif
   }

 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Hiroshi, is this patch still needed?

This patch is for a problem that's entirely separate from the LC_TIME
issue, if that's what you were wondering.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2009-01-09 Thread Hiroshi Inoue

Bruce Momjian wrote:

Hiroshi, is this patch still needed?


Yes though it should be slightly changed now.
*set lc_messages does not work* issue isn't directly related to this
 issue.

regards,
Hiroshi Inoue

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-12-03 Thread Magnus Hagander
Hiroshi Inoue wrote:
 I think the thing us that as long as the encodings are compatible
 (latin1 with different names for example) it worked  fine.

  In any case I think the problem is that gettext is
 looking at a setting that is not what we are looking at.  Particularly
 with the 8.4 changes to allow per-database locale settings, this has
 got to be fixed in a bulletproof way.
 
 Attached is a new patch to apply bind_textdomain_codeset() to most
 server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
 and PG_EUC_JIS_2004. EUC-JP may be OK for EUC_JIS_2004.
 
 Unfortunately it's hard for Saito-san and me to check encodings
 other than EUC-JP.


In principle this looks good, I think, but I'm a bit worried around the
lack of testing. I can do some testing under LATIN1 which is what we use
in Sweden (just need to get gettext working *at all* in my dev
environment again - I've somehow managed to break it), and perhaps we
can find someone to do a test in an eastern-european locale to get some
more datapoints?

Can you outline the steps one needs to go through to show the problem,
so we can confirm it's fixed in these locales?

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-12-03 Thread Hiroshi Inoue

Magnus Hagander wrote:

Hiroshi Inoue wrote:

I think the thing us that as long as the encodings are compatible
(latin1 with different names for example) it worked  fine.


 In any case I think the problem is that gettext is
looking at a setting that is not what we are looking at.  Particularly
with the 8.4 changes to allow per-database locale settings, this has
got to be fixed in a bulletproof way.

Attached is a new patch to apply bind_textdomain_codeset() to most
server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
and PG_EUC_JIS_2004. EUC-JP may be OK for EUC_JIS_2004.

Unfortunately it's hard for Saito-san and me to check encodings
other than EUC-JP. 


In principle this looks good, I think,  but I'm a bit worried around the
lack of testing.


Thanks and I agree with you.

 I can do some testing under LATIN1 which is what we use

in Sweden (just need to get gettext working *at all* in my dev
environment again - I've somehow managed to break it), and perhaps we
can find someone to do a test in an eastern-european locale to get some
more datapoints?

Can you outline the steps one needs to go through to show the problem,
so we can confirm it's fixed in these locales?


Saito-san and I have been working on another related problem about
changing LC_MESSAGES locale properly under Windows and would be able
to provide a patch in a few days. It seems preferable for us to apply
the patch also so as to change the message catalog easily.

Attached is an example in which LC_MESSAGES is cht_twn(zh_TW) and
the server encoding is EUC-TW. You can see it as a UTF-8 text
because the client_encoding is set to UTF-8 first.

BTW you can see another problem at line 4 in the text.
At the point the LC_MESSAGES is still japanese and postgres fails
to convert a Japanese error message to EUC_TW encoding. There's
no wonder but it doesn't seem preferable.

regards,
Hiroshi Inoue
set client_encoding to utf_8;
SET
1;
psql:cmd/euctw.sql:2: ERROR:  character 0xb9e6 of encoding EUC_TW has no 
equivalent in UTF8
select current_database();
 current_database 
--
 euctw
(1 s)

show server_encoding;
 server_encoding 
-
 EUC_TW
(1 s)

show lc_messages;
lc_messages 

 Japanese_Japan.932
(1 s)

set lc_messages to cht;
SET
select a;
psql:cmd/euctw.sql:7: 錯誤:  欄位a不存在
LINE 1: select a;
   ^
1;
psql:cmd/euctw.sql:8: 錯誤:  在語法錯誤附近發生 1
LINE 1: 1;
^
select * from a;
psql:cmd/euctw.sql:9: 錯誤:  relation a不存在
LINE 1: select * from a;
  ^

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-27 Thread Hiroshi Inoue

Magnus Hagander wrote:

On 25 nov 2008, at 05.00, Tom Lane [EMAIL PROTECTED] wrote:


Hiroshi Inoue [EMAIL PROTECTED] writes:

Tom Lane wrote:

If that's true then this code is presently broken for *every* locale
under Windows, not only Japanese.



Maybe there are a few languages/countires where 2 encodings are
widely used.


UTF8 vs Latin-N?


We already special-cases utf8...

I think the thing us that as long as the encodings are compatible 
(latin1 with different names for example) it worked  fine.



 In any case I think the problem is that gettext is
looking at a setting that is not what we are looking at.  Particularly
with the 8.4 changes to allow per-database locale settings, this has
got to be fixed in a bulletproof way.


Attached is a new patch to apply bind_textdomain_codeset() to most
server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
and PG_EUC_JIS_2004. EUC-JP may be OK for EUC_JIS_2004.

Unfortunately it's hard for Saito-san and me to check encodings
other than EUC-JP.

regards,
Hiroshi Inoue
*** mbutils.c.orig  Sun Nov 23 08:42:57 2008
--- mbutils.c   Wed Nov 26 12:17:12 2008
***
*** 822,830 
--- 822,870 
return clen;
  }
  
+ #ifdefWIN32
+ static const struct codeset_map {
+   int encoding;
+   const char *codeset;
+ } codeset_map_array[] = {
+   {PG_UTF8, UTF-8},
+   {PG_LATIN1, LATIN1}, 
+   {PG_LATIN2, LATIN2},
+   {PG_LATIN3, LATIN3},
+   {PG_LATIN4, LATIN4},
+   {PG_ISO_8859_5, ISO-8859-5},
+   {PG_ISO_8859_6, ISO_8859-6},
+   {PG_ISO_8859_7, ISO-8859-7},
+   {PG_ISO_8859_8, ISO-8859-8},
+   {PG_LATIN5, LATIN5},
+   {PG_LATIN6, LATIN6},
+   {PG_LATIN7, LATIN7},
+   {PG_LATIN8, LATIN8},
+   {PG_LATIN9, LATIN-9},
+   {PG_LATIN10, LATIN10},
+   {PG_KOI8R, KOI8-R},
+   {PG_WIN1250, CP1250},
+   {PG_WIN1251, CP1251},
+   {PG_WIN1252, CP1252},
+   {PG_WIN1253, CP1253},
+   {PG_WIN1254, CP1254},
+   {PG_WIN1255, CP1255},
+   {PG_WIN1256, CP1256},
+   {PG_WIN1257, CP1257},
+   {PG_WIN1258, CP1258},
+   {PG_WIN866, CP866},
+   {PG_WIN874, CP874},
+   {PG_EUC_CN, EUC-CN},
+   {PG_EUC_JP, EUC-JP},
+   {PG_EUC_KR, EUC-KR},
+   {PG_EUC_TW, EUC-TW}};
+ #endif /* WIN32 */
+ 
  void
  SetDatabaseEncoding(int encoding)
  {
+   const char *target_codeset = NULL;
+ 
if (!PG_VALID_BE_ENCODING(encoding))
elog(ERROR, invalid database encoding: %d, encoding);
  
***
*** 846,852 
 */
  #ifdef ENABLE_NLS
if (encoding == PG_UTF8)
!   if (bind_textdomain_codeset(postgres, UTF-8) == NULL)
elog(LOG, bind_textdomain_codeset failed);
  #endif
  }
--- 886,907 
 */
  #ifdef ENABLE_NLS
if (encoding == PG_UTF8)
!   target_codeset = UTF-8;
! #ifdefWIN32
!   else
!   {
!   int i;
! 
!   for (i = 0; i  sizeof(codeset_map_array) / sizeof(struct 
codeset_map); i++)
!   if (codeset_map_array[i].encoding == encoding)
!   {
!   target_codeset = codeset_map_array[i].codeset;
!   break;
!   }
!   }
! #endif /* WIN32 */
!   if (target_codeset != NULL)
!   if (bind_textdomain_codeset(postgres, target_codeset) == NULL)
elog(LOG, bind_textdomain_codeset failed);
  #endif
  }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Magnus Hagander
Hiroshi Inoue wrote:
 Hi Magnus and all,
 
 Magnus Hagander wrote:
 Log Message:
 ---
 Explicitly bind gettext() to the  UTF8 locale when in use.
 This is required on Windows due to the special locale
 handling for UTF8 that doesn't change the full environment.
 
 Thanks to this change UTF-8 case was solved but Japanese users
 are still unhappy with Windows databases with EUC_JP encoding.
 Shift_JIS which is a Japanese encoding under Windows doesn't
 match any server encoding and causes a crash with the use of
 gettext. So Saito-san removed ja message catalog just before
 the 8.3 release.
 
 Attached is a simple patch to avoid the crash and enable the
 use of Japanese message catalog.
 Please apply the patch if there's no problem.

Hi!

It will clearly also need an update to the comment, but I can take care
of that.

I assume you have tested this? The comment says that it works because we
are handling UTF8 on a special way on Windows, but AFAIK we *don't*
handle EUC_JP in a special way there?

If your database is in EUC_JP, I don't see why gettext() isn't picking
it up properly in the first place..

And why do we need that on Windows only, and not on other platforms?

//Magnus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Hiroshi Inoue
Magnus Hagander wrote:
 Hiroshi Inoue wrote:
 Hi Magnus and all,

 Magnus Hagander wrote:
 Log Message:
 ---
 Explicitly bind gettext() to the  UTF8 locale when in use.
 This is required on Windows due to the special locale
 handling for UTF8 that doesn't change the full environment.
 Thanks to this change UTF-8 case was solved but Japanese users
 are still unhappy with Windows databases with EUC_JP encoding.
 Shift_JIS which is a Japanese encoding under Windows doesn't
 match any server encoding and causes a crash with the use of
 gettext. So Saito-san removed ja message catalog just before
 the 8.3 release.

 Attached is a simple patch to avoid the crash and enable the
 use of Japanese message catalog.
 Please apply the patch if there's no problem.
 
 Hi!
 
 It will clearly also need an update to the comment, but I can take care
 of that.
 
 I assume you have tested this?

Though I myself didn't test it, Saito-san tested it.

 The comment says that it works because we
 are handling UTF8 on a special way on Windows,

ISTM UTF-8 isn't a special case.
In fact the comment also mentions the following.

* In future we might want to call bind_textdomain_codeset
* unconditionally, but that

 If your database is in EUC_JP, I don't see why gettext() isn't picking
 it up properly in the first place..

In Japan 2 encodings (EUC_JP and Shift_JIS) are often used.
EUC_JP is mainly used on *nix and on Windows Shift_JIS is
used. We use EUC_JP not Shift_JIS as the server encoding

 And why do we need that on Windows only, and not on other platforms?

because Shift_JIS isn't allowed as a server encoding. So
the Japanese Windows native message encoding Shift_JIS never
 matches the server encoding EUC_JP and a conversion between
Shitt_jis and EUC_JP is necessarily needed.

regards,
Hiroshi Inoue


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Magnus Hagander
Hiroshi Inoue wrote:
 Magnus Hagander wrote:
 Hiroshi Inoue wrote:
 Hi Magnus and all,

 Magnus Hagander wrote:
 Log Message:
 ---
 Explicitly bind gettext() to the  UTF8 locale when in use.
 This is required on Windows due to the special locale
 handling for UTF8 that doesn't change the full environment.
 Thanks to this change UTF-8 case was solved but Japanese users
 are still unhappy with Windows databases with EUC_JP encoding.
 Shift_JIS which is a Japanese encoding under Windows doesn't
 match any server encoding and causes a crash with the use of
 gettext. So Saito-san removed ja message catalog just before
 the 8.3 release.

 Attached is a simple patch to avoid the crash and enable the
 use of Japanese message catalog.
 Please apply the patch if there's no problem.
 Hi!

 It will clearly also need an update to the comment, but I can take care
 of that.

 I assume you have tested this?
 
 Though I myself didn't test it, Saito-san tested it.

Ok, good.


 The comment says that it works because we
 are handling UTF8 on a special way on Windows,
 
 ISTM UTF-8 isn't a special case.
 In fact the comment also mentions the following.
 
   * In future we might want to call bind_textdomain_codeset
   * unconditionally, but that

I think that's partially unrelated. UTF8 is special in that the
environment that the backend runs in is different from SERVER_ENCODING
in this case. For other encodings, we have setlocale():d to the same
encoding.

 If your database is in EUC_JP, I don't see why gettext() isn't picking
 it up properly in the first place..
 
 In Japan 2 encodings (EUC_JP and Shift_JIS) are often used.
 EUC_JP is mainly used on *nix and on Windows Shift_JIS is
 used. We use EUC_JP not Shift_JIS as the server encoding
 
 And why do we need that on Windows only, and not on other platforms?
 
 because Shift_JIS isn't allowed as a server encoding. So
 the Japanese Windows native message encoding Shift_JIS never
  matches the server encoding EUC_JP and a conversion between
 Shitt_jis and EUC_JP is necessarily needed.

Ah, so we're basically hardcoding that information? The system will go
up in SJIS, but since we can't deal with it, we switch it to EUC_JP?

Ok, I think I understand. I've made some minor stylistic changes (we
don't normally use if (NULL != whatever) in the pg sources), and will
apply with those.

This is for HEAD only, correct? Or is it something we should backpatch?

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Hiroshi Inoue wrote:
 because Shift_JIS isn't allowed as a server encoding. So
 the Japanese Windows native message encoding Shift_JIS never
 matches the server encoding EUC_JP and a conversion between
 Shitt_jis and EUC_JP is necessarily needed.

 Ah, so we're basically hardcoding that information? The system will go
 up in SJIS, but since we can't deal with it, we switch it to EUC_JP?

I'm not following this either.  If the patch is really necessary then it
seems it must be working around a bug in the Windows version of gettext,
ie failure to distinguish CP932 from CP20932.  Is that correct?

 Ok, I think I understand. I've made some minor stylistic changes (we
 don't normally use if (NULL != whatever) in the pg sources), and will
 apply with those.

It definitely needs a comment explaining why this is needed.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Hiroshi Inoue
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 Hiroshi Inoue wrote:
 because Shift_JIS isn't allowed as a server encoding. So
 the Japanese Windows native message encoding Shift_JIS never
 matches the server encoding EUC_JP and a conversion between
 Shitt_jis and EUC_JP is necessarily needed.
 
 Ah, so we're basically hardcoding that information? The system will go
 up in SJIS, but since we can't deal with it, we switch it to EUC_JP?
 
 I'm not following this either.  If the patch is really necessary then it
 seems it must be working around a bug in the Windows version of gettext,
 ie failure to distinguish CP932 from CP20932.  Is that correct?

I'm afraid I don't understand what you mean exactly.
AFAIK the output encoding of Windows gettext is detemined by the
ANSI system code page which is usualy CP932(Shift_JIS) in Japan and
 unrelated to the locale settings.
In addition CP20932 is rarely used in Japan IIRC. I've never used it
and don't know what it is correctly (maybe a kind of EUC_JP).

regards,
Hiroshi Inoue


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Tom Lane
Hiroshi Inoue [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm not following this either.  If the patch is really necessary then it
 seems it must be working around a bug in the Windows version of gettext,
 ie failure to distinguish CP932 from CP20932.  Is that correct?

 I'm afraid I don't understand what you mean exactly.
 AFAIK the output encoding of Windows gettext is detemined by the
 ANSI system code page which is usualy CP932(Shift_JIS) in Japan and
  unrelated to the locale settings.

If that's true then this code is presently broken for *every* locale
under Windows, not only Japanese.

To my mind the really correct thing to be doing here would be to call
bind_textdomain_codeset in all cases, rather than trusting gettext to
guess correctly about which encoding we want.  As the comment notes,
we have not attempted that because the codeset names aren't well
standardized.  But it seems to me that we could certainly find out what
codeset names are used on Windows, and apply bind_textdomain_codeset
all the time on Windows.  That would make a lot more sense than ad-hoc
treatment of UTF-8 and EUC-JP if you ask me ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Hiroshi Inoue
Tom Lane wrote:
 Hiroshi Inoue [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm not following this either.  If the patch is really necessary then it
 seems it must be working around a bug in the Windows version of gettext,
 ie failure to distinguish CP932 from CP20932.  Is that correct?
 
 I'm afraid I don't understand what you mean exactly.
 AFAIK the output encoding of Windows gettext is detemined by the
 ANSI system code page which is usualy CP932(Shift_JIS) in Japan and
  unrelated to the locale settings.
 
 If that's true then this code is presently broken for *every* locale
 under Windows, not only Japanese.

Maybe there are a few languages/countires where 2 encodings are
 widely used.

 To my mind the really correct thing to be doing here would be to call
 bind_textdomain_codeset in all cases, rather than trusting gettext to
 guess correctly about which encoding we want. As the comment notes,
 we have not attempted that because the codeset names aren't well
 standardized.  But it seems to me that we could certainly find out what
 codeset names are used on Windows, and apply bind_textdomain_codeset
 all the time on Windows.  That would make a lot more sense than ad-hoc
 treatment of UTF-8 and EUC-JP if you ask me ...

I fundamentally agree with you.
What we hope is to enable the use of Japanese message catalog which
we gave up in 8.3 Windows-version release.

regards,
Hiroshi Inoue


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Tom Lane
Hiroshi Inoue [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 If that's true then this code is presently broken for *every* locale
 under Windows, not only Japanese.

 Maybe there are a few languages/countires where 2 encodings are
  widely used.

UTF8 vs Latin-N?  In any case I think the problem is that gettext is
looking at a setting that is not what we are looking at.  Particularly
with the 8.4 changes to allow per-database locale settings, this has
got to be fixed in a bulletproof way.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

2008-11-24 Thread Magnus Hagander

On 25 nov 2008, at 05.00, Tom Lane [EMAIL PROTECTED] wrote:


Hiroshi Inoue [EMAIL PROTECTED] writes:

Tom Lane wrote:

If that's true then this code is presently broken for *every* locale
under Windows, not only Japanese.



Maybe there are a few languages/countires where 2 encodings are
widely used.


UTF8 vs Latin-N?


We already special-cases utf8...

I think the thing us that as long as the encodings are compatible  
(latin1 with different names for example) it worked  fine.



 In any case I think the problem is that gettext is
looking at a setting that is not what we are looking at.  Particularly
with the 8.4 changes to allow per-database locale settings, this has
got to be fixed in a bulletproof way.



Agreed.

/Magnus


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers