[HACKERS] translations

2004-09-09 Thread Dennis Bjorklund
I've been going over the translation again and found two places that were
broken.

In libpq there were some files that were not scanned for translatable strings
and in scan.l there was a call to gettext() missing which made error
messages into a mix of English and Swedish (in my case).

Normally I just commit the Swedish translations, but now I committed these
source fixes as well so we get fully translated error messages like:

dennis=# SEL;
FEL:  syntaxfel vid eller nära SEL vid tecken 1
RAD 1: SEL;

instead of the previous

dennis=# SEL;
FEL:  syntax error vid eller nära SEL at character 1
RAD 1: SEL;

which just looks stupid (in both English and Swedish). If there are
complaints I can revert it and send it to -patches.

Overall the translation seems to work fairly well. I have still not got a
total translation, but it's getting there. Maybe for 8.1.

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] translations

2004-09-09 Thread Alvaro Herrera
On Thu, Sep 09, 2004 at 09:09:31AM +0200, Dennis Bjorklund wrote:
> I've been going over the translation again and found two places that were
> broken.
>
> In libpq there were some files that were not scanned for translatable strings
> and in scan.l there was a call to gettext() missing which made error
> messages into a mix of English and Swedish (in my case).

I see this problem too.  I was about to complain.  Not sure if this is
the best fix, but it certainly 'needs fixed'.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
The ability to monopolize a planet is insignificant
next to the power of the source


Re: [HACKERS] translations

2004-09-09 Thread Dennis Bjorklund
On Thu, 9 Sep 2004, Alvaro Herrera wrote:

>> In libpq there were some files that were not scanned for translatable strings
>> and in scan.l there was a call to gettext() missing which made error
>> messages into a mix of English and Swedish (in my case).
>
> I see this problem too.  I was about to complain.  Not sure if this is
> the best fix, but it certainly 'needs fixed'.

Since the parser calls yyerror() with strings we can't do the gettext call 
beforehand, which only leaves it to be done inside the yyerror() function.

xgettext treats yyerror as a marker function, so the strings generated by
the parser are all in the po file (but they were not used, since there was no
call to gettext).
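
A minimal sketch of the idea described above, not the actual scan.l change: the parser hands the error hook an untranslated string, so the hook itself has to look it up with gettext(). The function name format_parser_error, its buffer-based signature, and the message template are assumptions made for illustration; the real yyerror() reports through the backend's error machinery.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for yyerror().  "message" arrives untranslated from
 * the generated parser, so it is passed through gettext() here, the only
 * place where a lookup is still possible.  The surrounding template is
 * translated as well.  Returns what snprintf() returns.
 */
int
format_parser_error(char *buf, size_t buflen, const char *message,
                    const char *token, int position)
{
    assert(buf != NULL && message != NULL && token != NULL);

    return snprintf(buf, buflen,
                    gettext("%s at or near \"%s\" at character %d"),
                    gettext(message), token, position);
}
```

With no message catalog bound, gettext() returns its argument unchanged, so the sketch produces the English text. With a Swedish catalog loaded, both the template and the parser's message would be looked up, which avoids the mixed-language output.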

-- 
/Dennis Björklund


[HACKERS] Translations in the distributions

2004-01-09 Thread Dennis Björklund
The default installation in Fedora does not work very well for
non-English speakers. For example, if I run psql and type COMMIT I get:

dennis=# commit;
WARNING:  COMMIT: ingen transaktion p g

while it should say

dennis=# commit;
WARNING:  COMMIT: ingen transaktion pågår

And those spaces in the first version are no spaces at all but some 
strange characters.

However, I have the cvs version compiled and installed, and it seems to
work just fine. Is this because pg has been fixed lately (I don't remember
any such discussions), or something with the packaging, or something else?

What I want is that future fedora/redhat versions work out of the box.
Most people use distributions and it's no fun to translate postgresql if
people are annoyed with the result :-)

-- 
/Dennis Björklund


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Tom Lane
Dennis Björklund [EMAIL PROTECTED] writes:
> The default installation in Fedora does not work very well for
> non-English speakers.

I seem to recall some discussion to the effect that the message catalog
files have to be in the same encoding the database is using, because
there's no provision in the backend for converting them on-the-fly.
Peter E. would be the person to ask though.

regards, tom lane


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Dennis Björklund
On Fri, 9 Jan 2004, Tom Lane wrote:

> I seem to recall some discussion to the effect that the message catalog
> files have to be in the same encoding the database is using, because
> there's no provision in the backend for converting them on-the-fly.

Still, my cvs tree seems to work. The catalogues are still in latin1 and 
fedora still uses utf-8. So something seems to have made it work (probably 
Peter).

I know we have had some discussions in the past but I've never really got
the whole picture of the problem. In any case, now that distributions are
starting to switch to UTF-8, it puts greater demands on us, since a single
encoding might not work as well anymore (it never really worked, but that
is another issue).

Maybe it all just works now and when redhat/fedora starts to use 7.4 all
will be fine. All I want is to make sure that it works. If it's not
working, it's something that I might spend some time on trying to fix.

-- 
/Dennis Björklund


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Peter Eisentraut
On Friday, 9 January 2004 08:08, Dennis Björklund wrote:
> The default installation in Fedora does not work very well for
> non-English speakers. For example, if I run psql and type COMMIT I get:
>
> dennis=# commit;
> WARNING:  COMMIT: ingen transaktion p g
>
> while it should say
>
> dennis=# commit;
> WARNING:  COMMIT: ingen transaktion pågår

Remember that gettext will automatically recode the strings depending on what 
it thinks is the display character set, determined via LC_CTYPE (of course, a 
useless concept for server software).  After that, PostgreSQL's own client/
server recoding will happen.  So somewhere along the line something
might get lost.  Either the RPM package uses a different locale, or it has 
bugs in gettext or iconv.


[HACKERS] Translations in distributions 2

2004-01-09 Thread Dennis Björklund
I've made some tests to see what works and what does not. I downloaded pg
7.3.4 (which is more or less what is used in fedora) and current cvs. Both
compiled with the same flags and run in the same way.

pg 7.3.4


Running LC_ALL=sv_SE postmaster
and LC_ALL=sv_SE.UTF-8 postmaster

both produce messages with the same encoding.

pg 7.5 (cvs)


Running LC_ALL=sv_SE postmaster
and LC_ALL=sv_SE.UTF-8 postmaster

produces messages with different encodings, either latin1 or utf-8 
depending on the environment variable.

The only conclusion I have so far is that something has indeed changed in
pg after 7.3 and maybe in the future it will work a little better in
Fedora and other distributions.

The problem of client and server having different encodings still remains
to be solved. That is probably solvable by simply translating the messages
to client_encoding before sending. It does not sound very hard. The
language used should however also be that of the client and it might need
(a lot) more work.
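
To illustrate the recoding step suggested above, here is a hand-rolled sketch for one case only: ISO-8859-1 (latin1) to UTF-8. The function name latin1_to_utf8 is made up for this example, and a real backend would presumably go through PostgreSQL's own conversion routines rather than code like this.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert a NUL-terminated ISO-8859-1 string to UTF-8.
 * Every latin1 byte is also its Unicode code point, so bytes below 0x80
 * are copied and bytes from 0x80 upwards become a two-byte sequence.
 * Returns the number of bytes written (excluding the NUL), or (size_t) -1
 * if dst is too small.
 */
size_t
latin1_to_utf8(const char *src, char *dst, size_t dstlen)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dstlen == 0)
        return (size_t) -1;

    for (const unsigned char *p = (const unsigned char *) src; *p; p++)
    {
        if (*p < 0x80)
        {
            if (out + 1 >= dstlen)      /* need room for byte + NUL */
                return (size_t) -1;
            dst[out++] = (char) *p;
        }
        else
        {
            if (out + 2 >= dstlen)      /* need room for 2 bytes + NUL */
                return (size_t) -1;
            dst[out++] = (char) (0xC0 | (*p >> 6));
            dst[out++] = (char) (0x80 | (*p & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

For the Swedish word "pågår", the latin1 byte 0xE5 becomes the pair 0xC3 0xA5. Sending the unconverted byte to a UTF-8 terminal is one plausible way to end up with output like "p g".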

A last question. Why is --enable-nls needed? Most other programs default
to that.

-- 
/Dennis Björklund


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
> On Friday, 9 January 2004 15:51, Tom Lane wrote:
>> Hmm.  So the problem would appear if LC_CTYPE is different from the
>> database encoding?  Could we fix it by forcing LC_CTYPE to the database
>> encoding during startup?
>
> That would resolve quite a few problems, but I don't think there's a way to
> know what encoding a given LC_CTYPE value will result in.

Hmm.  Actually it looks like we already do what I had in mind:

ReadControlFile():
    if (setlocale(LC_CTYPE, ControlFile->lc_ctype) == NULL)
        ereport(FATAL, ...

So the problem really occurs when database_encoding is set to an
encoding that is incompatible with the one implied by the initdb-time
LC_CTYPE ... which we have no good way to check.  Ugh.

I have some vague recollection that glibc offers an API extension that
allows this to be checked.  Is it worth having a solution that catches
the problem on glibc only?
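
The thread does not name the API Tom has in mind. One candidate, offered here purely as an assumption, is nl_langinfo(CODESET) from <langinfo.h>, which reports the character encoding of the current LC_CTYPE locale. The helper name codeset_for_ctype is hypothetical.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Switch the process's LC_CTYPE to locale_name and return the codeset name
 * the C library reports for it, or NULL if the locale is not available.
 * Note the side effect on the process locale; the returned pointer may be
 * overwritten by later setlocale() or nl_langinfo() calls.
 */
const char *
codeset_for_ctype(const char *locale_name)
{
    assert(locale_name != NULL);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

The spelling of the returned name is platform-specific, so a check built on it would still have to match that name against the database encoding.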

regards, tom lane


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
> Remember that gettext will automatically recode the strings depending
> on what it thinks is the display character set, determined via
> LC_CTYPE (of course, a useless concept for server software).

Hmm.  So the problem would appear if LC_CTYPE is different from the
database encoding?  Could we fix it by forcing LC_CTYPE to the database
encoding during startup?

regards, tom lane


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Dennis Björklund
On Fri, 9 Jan 2004, Tom Lane wrote:

>> on what it thinks is the display character set, determined via
>> LC_CTYPE (of course, a useless concept for server software).
>
> Hmm.  So the problem would appear if LC_CTYPE is different from the
> database encoding?  Could we fix it by forcing LC_CTYPE to the database
> encoding during startup?

What does the database encoding have to do with error messages and the
display character set?

-- 
/Dennis Björklund


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Peter Eisentraut
On Friday, 9 January 2004 15:51, Tom Lane wrote:
> Hmm.  So the problem would appear if LC_CTYPE is different from the
> database encoding?  Could we fix it by forcing LC_CTYPE to the database
> encoding during startup?

That would resolve quite a few problems, but I don't think there's a way to 
know what encoding a given LC_CTYPE value will result in.


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Peter Eisentraut
On Friday, 9 January 2004 16:28, Dennis Björklund wrote:
> What does the database encoding have to do with error messages and the
> display character set?

When they are sent over the wire, the messages are converted from server 
(=database) encoding to client encoding.


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Peter Eisentraut
Tom Lane wrote:
> So the problem really occurs when database_encoding is set to an
> encoding that is incompatible with the one implied by the initdb-time
> LC_CTYPE ... which we have no good way to check.  Ugh.
>
> I have some vague recollection that glibc offers an API extension
> that allows this to be checked.  Is it worth having a solution that
> catches the problem on glibc only?

The problem is more likely to be that it will be hard to match up the 
different encoding names.  For example, if you set LC_CTYPE=C, then the 
system encoding is reported as

$ locale charmap
ANSI_X3.4-1968

whereas the closest thing in PostgreSQL would be SQL_ASCII.
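
The name-matching problem could in principle be handled with a lookup table, sketched below. Only the first pair comes from this thread; the other two entries and the function name pg_encoding_for_codeset are illustrative assumptions, and a usable table would need many more aliases per platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One system codeset spelling and the PostgreSQL encoding it maps to. */
struct codeset_alias
{
    const char *system_name;
    const char *pg_name;
};

/* Illustrative entries only; real systems report many more spellings. */
static const struct codeset_alias codeset_aliases[] = {
    {"ANSI_X3.4-1968", "SQL_ASCII"},    /* the example from this thread */
    {"ISO-8859-1", "LATIN1"},           /* assumed entry */
    {"UTF-8", "UTF8"},                  /* assumed entry */
};

/* Return the PostgreSQL encoding name for a codeset, or NULL if unknown. */
const char *
pg_encoding_for_codeset(const char *system_name)
{
    size_t n = sizeof codeset_aliases / sizeof codeset_aliases[0];

    assert(system_name != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(codeset_aliases[i].system_name, system_name) == 0)
            return codeset_aliases[i].pg_name;
    }
    return NULL;
}
```

An unknown name returns NULL, which leaves the caller to decide whether to refuse the combination or fall back to trusting the user.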

It might already help if we allowed LC_CTYPE to be attached to a 
database rather than the entire cluster, and make the user match them 
up manually.  The only drawback would be that indexes on global tables 
involving upper() or lower() would no longer work reliably.


Re: [HACKERS] Translations in the distributions

2004-01-09 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
> It might already help if we allowed LC_CTYPE to be attached to a
> database rather than the entire cluster, and make the user match them
> up manually.  The only drawback would be that indexes on global tables
> involving upper() or lower() would no longer work reliably.

Make that indexes on global tables involving any text wouldn't work.
Everyone has to have the same notion of the sort order, or the index is
corrupt from someone's point of view, and soon from everyone's point of
view.  upper/lower isn't needed to cause a problem.

However ... we do not have any global tables with indexed text columns.
Only name columns, and name comparisons are presently not locale-aware
(they're just strncmp()).  I think it wouldn't be unreasonable to
legislate that this remain true forevermore, and then it would be safe
to allow different DBs to run in different locales.  That would be a big
step forward, for sure.
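
The distinction drawn above can be sketched as two comparison functions. The names name_cmp and text_cmp are hypothetical and this is not backend code; NAMEDATALEN is defined locally with the value 64 as an assumption for the example.

```c
#include <assert.h>
#include <string.h>

#define NAMEDATALEN 64          /* assumed maximum length of a name value */

/*
 * Locale-unaware comparison, as described for name columns: plain byte
 * order, so every database agrees on the result whatever its locale is.
 */
int
name_cmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, NAMEDATALEN);
}

/*
 * Locale-aware comparison, as a text column would need: the ordering
 * depends on the current LC_COLLATE setting, so an index built under one
 * locale can look corrupt under another.
 */
int
text_cmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcoll(a, b);
}
```

In byte order "Zebra" sorts before "apple" because 'Z' has a lower code than 'a', and that holds everywhere. The result of text_cmp on the same pair can differ from one locale to another.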

[ thinks more... ]  Actually it's a bigger restriction than that.
Imagine that you create some tables with text data in template1, and
then index them.  The indexes would be corrupt if you cloned template1
and assigned the result a different locale.  So to make this work, we'd
actually need the following restrictions:

* No system table can ever have an index on a text/varchar/char column;
  only name columns, and name has to remain locale-unaware.

* You can't assign a new locale to a cloned database if the source has
  any text/varchar/char indexes.

The simplest implementation restriction I can think of to guarantee
point 2 is to allow changing the locale only when cloning template0,
not when cloning anything else.  Or we could just warn people that
they'd better reindex after changing the locale.
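
The "simplest implementation restriction" above could be expressed as a small predicate. This is a sketch under stated assumptions: the function name locale_change_allowed and its string-based signature are invented for illustration and are not taken from the backend.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Decide whether a new database cloned from template_db may be given
 * new_locale.  Keeping the template's locale is always fine; a different
 * locale is accepted only when cloning template0, which is assumed to
 * contain no indexes on text/varchar/char columns.
 */
bool
locale_change_allowed(const char *template_db,
                      const char *template_locale,
                      const char *new_locale)
{
    assert(template_db != NULL && template_locale != NULL &&
           new_locale != NULL);

    if (strcmp(template_locale, new_locale) == 0)
        return true;            /* no change requested */

    return strcmp(template_db, "template0") == 0;
}
```

The alternative mentioned above, warning users to reindex after changing the locale, would replace the final check with a notice rather than a refusal.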

It does seem like this might be a reasonable path to take.  Thoughts?

regards, tom lane
