Re: [HACKERS] Locale support is now on by default
Tom Lane writes:

> > initdb --lc-collate, initdb --locale, LC_ALL, LC_COLLATE, LANG
> >
> > initdb --no-locale is the same as initdb --locale=C, for convenience.
>
> I'm confused; what is the default behavior if you don't give any switches to initdb?

Whatever is set in the environment -- which boils down to LC_ALL, LC_COLLATE, LANG.

> It might be that Bruce's recent changes to elog levels allow a graceful compromise about backend messages during initdb. I haven't looked, but maybe initdb could run the backend with message level one notch higher than LOG to suppress all the normal-case messages without masking not-so-normal cases.

I'll look.

-- 
Peter Eisentraut [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
Re: [HACKERS] Locale support is now on by default
Tom Lane writes:

> It might be that Bruce's recent changes to elog levels allow a graceful compromise about backend messages during initdb. I haven't looked, but maybe initdb could run the backend with message level one notch higher than LOG to suppress all the normal-case messages without masking not-so-normal cases.

There doesn't seem to be a way to turn off LOG without hiding almost everything:

    if (lev == LOG || lev == COMMERROR)
    {
        if (server_min_messages == LOG)
            output_to_server = true;
        else if (server_min_messages < FATAL)
            output_to_server = true;
    }

Everything except for PANIC is less than FATAL, so this doesn't make sense to me.

Nonetheless, I don't like the way this message comes out. It destroys the, er, well-formed display that initdb gives. Moreover, it's not really a WARNING, meaning something is wrong. I was thinking about handling this within initdb, with a display like this:

    The files belonging to this database system will be owned by user peter.
    This user must also own the server process.

    Locale settings: collate=en_US ctype=en_US [...]
    (This locale will prevent optimization of LIKE and regexp searches.)

    creating directory pg-install/var/data... ok
    creating directory pg-install/var/data/base... ok
    [...]

Yes, we'd need to duplicate some code within initdb, but it's not like that list of LIKE-safe locales is very dynamic.

-- 
Peter Eisentraut [EMAIL PROTECTED]
Re: [HACKERS] Locale support is now on by default
Peter Eisentraut [EMAIL PROTECTED] writes:

> I was thinking about handling this within initdb, with a display like this:
>
>     The files belonging to this database system will be owned by user peter.
>     This user must also own the server process.
>
>     Locale settings: collate=en_US ctype=en_US [...]
>     (This locale will prevent optimization of LIKE and regexp searches.)
>
>     creating directory pg-install/var/data... ok
>     creating directory pg-install/var/data/base... ok
>     [...]

That works for me.

> Yes, we'd need to duplicate some code within initdb, but it's not like that list of LIKE-safe locales is very dynamic.

But removing the warning from xlog.c would be a Good Thing; it does not belong there either, by any stretch of the imagination. As long as both locale_is_like_safe() and initdb's list are commented with cross-links to the other one, I don't think we're creating a huge maintenance problem.

BTW, I still suggest changing initdb to set message_level = FATAL rather than /dev/null'ing the output. Having to use -d to learn anything at all about the cause of an initdb-time failure is a pain in the neck.

regards, tom lane
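The initdb-side check discussed above could be sketched roughly as follows. This is only an illustration, not the actual initdb or backend code: it assumes the LIKE-safe list consists of just the C and POSIX collations, and the names initdb_locale_is_like_safe and print_locale_summary are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical initdb-side duplicate of the backend's locale_is_like_safe().
 * Assumption: only the "C" and "POSIX" collations count as LIKE-safe.
 * Both copies should carry a comment pointing at the other one.
 */
bool initdb_locale_is_like_safe(const char *lc_collate)
{
    assert(lc_collate != NULL);
    return strcmp(lc_collate, "C") == 0 || strcmp(lc_collate, "POSIX") == 0;
}

/* Print the locale summary; the note appears only for non-LIKE-safe locales. */
void print_locale_summary(FILE *out, const char *lc_collate, const char *lc_ctype)
{
    fprintf(out, "Locale settings: collate=%s ctype=%s\n", lc_collate, lc_ctype);
    if (!initdb_locale_is_like_safe(lc_collate))
        fprintf(out, "(This locale will prevent optimization of LIKE and regexp searches.)\n");
}
```

Doing the check in initdb keeps the note inside initdb's own formatted output, so it no longer depends on which backend messages happen to be let through.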
Re: [HACKERS] Locale support is now on by default
> BTW, I still suggest changing initdb to set message_level = FATAL rather than /dev/null'ing the output. Having to use -d to learn anything at all about the cause of an initdb-time failure is a pain in the neck.

This is a great idea. Certainly there are FATAL/PANIC messages during initdb that could be helpful.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Re: [HACKERS] Locale support is now on by default
Bruce Momjian writes:

> > There doesn't seem to be a way to turn off LOG without hiding almost everything:
> >
> >     if (lev == LOG || lev == COMMERROR)
> >     {
> >         if (server_min_messages == LOG)
> >             output_to_server = true;
> >         else if (server_min_messages < FATAL)
> >             output_to_server = true;
> >     }
> >
> > Everything except for PANIC is less than FATAL, so this doesn't make sense to me.
>
> Actually, what this is saying is that for an elog(LOG) to show, server_min_messages must be less than FATAL.

I know what this is saying, but the coding is redundant (since LOG is also less than FATAL).

> Setting server_min_messages to FATAL means only FATAL and PANIC appear:
>
> Server levels are:
>     # debug5, debug4, debug3, debug2, debug1,
>     # info, notice, warning, error, log, fatal, panic

I don't recall log being so high. Didn't it use to be after info? Certainly there should be a way to see only warnings, errors, and higher without seeing the unimportant log messages.

Actually, I'm also confused why we now have info, notice, *and* warning. Shouldn't two of these be enough?

-- 
Peter Eisentraut [EMAIL PROTECTED]
Re: [HACKERS] Locale support is now on by default
Peter Eisentraut wrote:

> Bruce Momjian writes:
>
> > > There doesn't seem to be a way to turn off LOG without hiding almost everything:
> > >
> > >     if (lev == LOG || lev == COMMERROR)
> > >     {
> > >         if (server_min_messages == LOG)
> > >             output_to_server = true;
> > >         else if (server_min_messages < FATAL)
> > >             output_to_server = true;
> > >     }
> > >
> > > Everything except for PANIC is less than FATAL, so this doesn't make sense to me.
> >
> > Actually, what this is saying is that for an elog(LOG) to show, server_min_messages must be less than FATAL.
>
> I know what this is saying, but the coding is redundant (since LOG is also less than FATAL).

Sure, but the ordinal value of log is different for client and server:

    #server_min_messages = notice   # Values, in order of decreasing detail:
                                    #   debug5, debug4, debug3, debug2, debug1,
                                    #   info, notice, warning, error, log, fatal,
                                    #   panic
    #client_min_messages = notice   # Values, in order of decreasing detail:
                                    #   debug5, debug4, debug3, debug2, debug1,
                                    #   log, notice, warning, error

The LOG value is ordinally correct for CLIENT, but for SERVER, it is just below FATAL. I can change it, but for now that is what people wanted, meaning you probably want LOG in the log file before WARNING or even ERROR.

> > Setting server_min_messages to FATAL means only FATAL and PANIC appear:
> >
> > Server levels are:
> >     # debug5, debug4, debug3, debug2, debug1,
> >     # info, notice, warning, error, log, fatal, panic
>
> I don't recall log being so high. Didn't it use to be after info? Certainly there should be a way to see only warnings, errors, and higher without seeing the unimportant log messages.
>
> Actually, I'm also confused why we now have info, notice, *and* warning. Shouldn't two of these be enough?

We added NOTICE and INFO and WARNING because they were required. INFO is for SET-like information, NOTICE is for non-warnings like sequence creation for SERIAL, and WARNING is for real warnings like identifier truncation.
-- 
Bruce Momjian
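To make the two orderings concrete, here is a minimal sketch of how a message level could be compared against server_min_messages and client_min_messages when each destination has its own scale. It is an illustration under stated assumptions, not the backend's elog code: the LVL_* identifiers and both function names are hypothetical, and only the two orderings themselves are taken from the configuration comments quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical level identifiers; the real elog.h constants may differ. */
typedef enum {
    LVL_DEBUG5, LVL_DEBUG4, LVL_DEBUG3, LVL_DEBUG2, LVL_DEBUG1,
    LVL_LOG, LVL_INFO, LVL_NOTICE, LVL_WARNING, LVL_ERROR,
    LVL_FATAL, LVL_PANIC
} MsgLevel;

/* Server scale, in order of decreasing detail: LOG sits just below FATAL. */
static const MsgLevel server_scale[] = {
    LVL_DEBUG5, LVL_DEBUG4, LVL_DEBUG3, LVL_DEBUG2, LVL_DEBUG1,
    LVL_INFO, LVL_NOTICE, LVL_WARNING, LVL_ERROR, LVL_LOG,
    LVL_FATAL, LVL_PANIC
};

/* Client scale, in order of decreasing detail: LOG sits just above the debug levels. */
static const MsgLevel client_scale[] = {
    LVL_DEBUG5, LVL_DEBUG4, LVL_DEBUG3, LVL_DEBUG2, LVL_DEBUG1,
    LVL_LOG, LVL_NOTICE, LVL_WARNING, LVL_ERROR
};

/* Position of lev on a scale, or -1 if the scale does not list it. */
static int rank_on(const MsgLevel *scale, size_t n, MsgLevel lev)
{
    for (size_t i = 0; i < n; i++)
        if (scale[i] == lev)
            return (int) i;
    return -1;
}

/* A message passes when it ranks at or above the configured minimum. */
static bool passes(const MsgLevel *scale, size_t n, MsgLevel lev, MsgLevel min)
{
    int lev_rank = rank_on(scale, n, lev);
    int min_rank = rank_on(scale, n, min);

    assert(lev_rank >= 0 && min_rank >= 0);
    return lev_rank >= min_rank;
}

bool server_would_log(MsgLevel lev, MsgLevel server_min_messages)
{
    return passes(server_scale, sizeof server_scale / sizeof server_scale[0],
                  lev, server_min_messages);
}

bool client_would_see(MsgLevel lev, MsgLevel client_min_messages)
{
    return passes(client_scale, sizeof client_scale / sizeof client_scale[0],
                  lev, client_min_messages);
}
```

With the server scale as listed, any minimum that lets ERROR through also lets LOG through, which is why "turn off LOG but keep errors" cannot be expressed by a single threshold on that scale.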
Re: [HACKERS] Locale support is now on by default
Bruce Momjian writes:

> > > Server levels are:
> > >     # debug5, debug4, debug3, debug2, debug1,
> > >     # info, notice, warning, error, log, fatal, panic
> >
> > I don't recall log being so high. Didn't it use to be after info? Certainly there should be a way to see only warnings, errors, and higher without seeing the unimportant log messages.
> >
> > Actually, I'm also confused why we now have info, notice, *and* warning. Shouldn't two of these be enough?
>
> We added NOTICE and INFO and WARNING because they were required. INFO is for SET-like information, NOTICE is for non-warnings like sequence creation for SERIAL, and WARNING is for real warnings like identifier truncation.

OK, let me phrase my question clearly: How can I turn off LOG and turn on all errors in the server log?

-- 
Peter Eisentraut [EMAIL PROTECTED]
Re: [HACKERS] Locale support is now on by default
Peter Eisentraut wrote:

> Bruce Momjian writes:
>
> > > > Server levels are:
> > > >     # debug5, debug4, debug3, debug2, debug1,
> > > >     # info, notice, warning, error, log, fatal, panic
> > >
> > > I don't recall log being so high. Didn't it use to be after info? Certainly there should be a way to see only warnings, errors, and higher without seeing the unimportant log messages.
> > >
> > > Actually, I'm also confused why we now have info, notice, *and* warning. Shouldn't two of these be enough?
> >
> > We added NOTICE and INFO and WARNING because they were required. INFO is for SET-like information, NOTICE is for non-warnings like sequence creation for SERIAL, and WARNING is for real warnings like identifier truncation.
>
> OK, let me phrase my question clearly: How can I turn off LOG and turn on all errors in the server log?

Right now, you can't. I originally had LOG next to INFO, and for server it was INFO, then LOG, and for client, it was LOG, then INFO, but someone suggested that LOG should be between ERROR and FATAL because most people want LOG stuff before they want to see ERROR/WARNING/NOTICE in the server logs.

If you would prefer LOG down near INFO in the server message levels, please post the idea and let's get some more comments from folks.

We thought about going with a bitwise capability where you could turn on different message types independently, but the use of that with SET and the confusion hardly seemed worth it.

-- 
Bruce Momjian
Re: [HACKERS] Locale support is now on by default
Bruce Momjian writes:

> If you would prefer LOG down near INFO in the server message levels, please post the idea and let's get some more comments from folks.

LOG should be below WARNING, in any case. Perhaps between NOTICE and WARNING, but I'm not so sure about that.

-- 
Peter Eisentraut [EMAIL PROTECTED]
Re: [HACKERS] Locale support is now on by default
Peter Eisentraut [EMAIL PROTECTED] writes:

> Bruce Momjian writes:
> > If you would prefer LOG down near INFO in the server message levels, please post the idea and let's get some more comments from folks.
>
> LOG should be below WARNING, in any case. Perhaps between NOTICE and WARNING, but I'm not so sure about that.

I think the ordering Bruce developed is appropriate for logging. There are good reasons to think that per-query ERRORs are less interesting than LOG events for admin logging purposes.

The real problem here is that in the initdb context, we are really dealing with an *interactive* situation, where LOG events ought to be treated in the client-oriented scale --- but the backend does not know this, it thinks it is emitting messages to the system log.

I'm thinking that the mistake is in hard-wiring one scale of message interest to control the frontend output and another one to the log (stderr/syslog) output. Perhaps we should have a notion of interactive message priorities vs logging message priorities, and allow either scale to be used to control which messages are dispatched to any message destination.

regards, tom lane
Re: [HACKERS] Locale support is now on by default
Tom Lane wrote:

> Peter Eisentraut [EMAIL PROTECTED] writes:
> > Bruce Momjian writes:
> > > If you would prefer LOG down near INFO in the server message levels, please post the idea and let's get some more comments from folks.
> >
> > LOG should be below WARNING, in any case. Perhaps between NOTICE and WARNING, but I'm not so sure about that.
>
> I think the ordering Bruce developed is appropriate for logging. There are good reasons to think that per-query ERRORs are less interesting than LOG events for admin logging purposes.

OK.

> The real problem here is that in the initdb context, we are really dealing with an *interactive* situation, where LOG events ought to be treated in the client-oriented scale --- but the backend does not know this, it thinks it is emitting messages to the system log.
>
> I'm thinking that the mistake is in hard-wiring one scale of message interest to control the frontend output and another one to the log (stderr/syslog) output. Perhaps we should have a notion of interactive message priorities vs logging message priorities, and allow either scale to be used to control which messages are dispatched to any message destination.

Can't we just use grep -v '^LOG:' to remove the log display from initdb? Seems pretty simple.

-- 
Bruce Momjian
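For comparison, the grep -v '^LOG:' idea can be sketched as a small line filter. This is a rough equivalent under stated assumptions, not something initdb contains: the function names are hypothetical, and the "LOG:" prefix is taken from the grep pattern above rather than from the backend source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True when a line starts with the "LOG:" prefix (same test as grep '^LOG:'). */
bool is_log_line(const char *line)
{
    assert(line != NULL);
    return strncmp(line, "LOG:", 4) == 0;
}

/*
 * Copy everything from in to out except lines starting with "LOG:".
 * Returns the number of lines dropped. Lines longer than the buffer
 * arrive in several chunks, so the prefix is tested only on the first
 * chunk of each line and the decision is kept for the rest of it.
 */
size_t filter_log_lines(FILE *in, FILE *out)
{
    char buf[1024];
    bool at_line_start = true;
    bool dropping = false;
    size_t dropped = 0;

    assert(in != NULL && out != NULL);
    while (fgets(buf, sizeof buf, in) != NULL)
    {
        if (at_line_start)
        {
            dropping = is_log_line(buf);
            if (dropping)
                dropped++;
        }
        if (!dropping)
            fputs(buf, out);
        at_line_start = (strchr(buf, '\n') != NULL);
    }
    return dropped;
}
```

Like the grep, this works per line: a LOG message that spans several lines would lose only its first line, which is one reason a level-based solution inside the backend may still be preferable.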
[HACKERS] Locale support is now on by default
The determination of locale is now done as follows:

collate/ctype:

    initdb --lc-collate, initdb --locale, LC_ALL, LC_COLLATE, LANG

messages/monetary/numeric/time:

    Have GUC variables lc_messages, etc. The default is "" (the empty string), which means to inherit from the environment (or whatever setlocale() does with it). However, initdb will initialize postgresql.conf containing assignments to these variables determined as with collate/ctype above. So the real defaults are consistent with collate/ctype.

initdb --no-locale is the same as initdb --locale=C, for convenience.

Let's see if these rules end up making sense to everybody.

-- 
Peter Eisentraut [EMAIL PROTECTED]
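The lookup order for the collation locale listed above can be sketched as a "first non-empty setting wins" search. This is an illustration only, not initdb's code: all function names are hypothetical, and falling back to "C" when nothing is set is an assumption based on the usual setlocale() behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* True when a setting is present and non-empty. */
static int is_set(const char *s)
{
    return s != NULL && s[0] != '\0';
}

/* First present, non-empty entry of candidates, or NULL if there is none. */
const char *first_set(const char *const candidates[], size_t n)
{
    assert(candidates != NULL);
    for (size_t i = 0; i < n; i++)
        if (is_set(candidates[i]))
            return candidates[i];
    return NULL;
}

/*
 * Resolve the collation locale in the order given in the thread:
 * --lc-collate, --locale, LC_ALL, LC_COLLATE, LANG.
 * Assumption: if none of them is set, the result is "C".
 */
const char *pick_collate_locale(const char *opt_lc_collate, const char *opt_locale,
                                const char *env_lc_all, const char *env_lc_collate,
                                const char *env_lang)
{
    const char *const candidates[] = {
        opt_lc_collate, opt_locale, env_lc_all, env_lc_collate, env_lang
    };
    const char *found = first_set(candidates, sizeof candidates / sizeof candidates[0]);

    return found != NULL ? found : "C";
}

/* Convenience wrapper that reads the three variables from the real environment. */
const char *collate_locale_from_environment(const char *opt_lc_collate,
                                            const char *opt_locale)
{
    return pick_collate_locale(opt_lc_collate, opt_locale,
                               getenv("LC_ALL"), getenv("LC_COLLATE"), getenv("LANG"));
}
```

Under this reading, --no-locale amounts to passing "C" as opt_locale, so the environment is consulted only when --lc-collate is not given either.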
Re: [HACKERS] Locale support is now on by default
Peter Eisentraut [EMAIL PROTECTED] writes:

> The determination of locale is now done as follows:
>
> initdb --lc-collate, initdb --locale, LC_ALL, LC_COLLATE, LANG
>
> initdb --no-locale is the same as initdb --locale=C, for convenience.

I'm confused; what is the default behavior if you don't give any switches to initdb?

BTW, something that's been bothering me for a while is that the notice we stuck into the backend a couple of versions back (about "this locale disables LIKE optimizations") is being hidden by initdb, because you decided recently that it was okay to route all the backend's commentary to /dev/null so as to hide xlog.c's startup chattiness. I don't object to getting rid of that chattiness, but 2>/dev/null is throwing the baby out with the bathwater (consider outright failure messages, for instance).

It might be that Bruce's recent changes to elog levels allow a graceful compromise about backend messages during initdb. I haven't looked, but maybe initdb could run the backend with message level one notch higher than LOG to suppress all the normal-case messages without masking not-so-normal cases.

regards, tom lane
Re: [HACKERS] locale support
> Tatsuo, what is LC_ALL (or the other locale envvars) set to when you run the program? The man page for setlocale() on my machine documents that main() starts in C or POSIX locale mode by default. The call to setlocale(LC_ALL, "") reads the envvars and sets the locale accordingly. Maybe RedHat's 6.2J isn't setting up the locale properly to begin with?
>
> See what /etc/sysconfig/i18n contains -- if it is empty or doesn't exist, then locale is simply not set up. But you specifically mention the particular locale

It's "ja_JP.eucJP". Definitely that locale exists, so I guess the contents are broken...

> Ok, what combinations _do_ work? We _know_ C or POSIX works -- but which ones don't work, on RH 6.1? While I want to make sure that a broken locale data set isn't used, I also want to make sure that a good locale set isn't thrown out, either. Forcing to LC_COLLATE=C is overkill, IMHO.

I guess most single-byte locales work. However, I seriously doubt that locales for multibyte languages would work.

> And building without locale support doesn't work, either, because, at least on RH 6.1, strncmp() is buggered to use the locale's collation.

Really? I see PostgreSQL installations without locale support work just fine on RH 6.1J.

> The real solution is for the vendors to fix their broken locales.

Of course.

-- 
Tatsuo Ishii
Re: [HACKERS] locale support
Nathan Myers wrote:

> On Mon, Feb 12, 2001 at 09:59:37PM -0500, Tom Lane wrote:
> > Tatsuo Ishii [EMAIL PROTECTED] writes:
> > > I know this is not PostgreSQL's fault but the broken locale data on certain platforms. The problem makes it impossible to use PostgreSQL RPMs in Japan. I'm looking for solutions/workarounds for this problem.
> >
> > Build a set of RPMs without locale support?
>
> Run it with LC_ALL="C".

It would help if there was a sample working LC_ALL=xxx line in /etc/rc.d/init.d/postgresql

As it stands now it is a real pita to get LC_xx settings down to the real postmaster through all the layers (and guessing if it did take effect after each restart ;)

- Hannu
Re: [HACKERS] locale support
Tatsuo Ishii [EMAIL PROTECTED] writes:

> > > I know this is not PostgreSQL's fault but the broken locale data on certain platforms. The problem makes it impossible to use PostgreSQL RPMs in Japan. I'm looking for solutions/workarounds for this problem.
> >
> > Build a set of RPMs without locale support?
>
> > Run it with LC_ALL="C".
>
> Both of them seem not ideal solutions for RPM. It would be nice if we could distribute single binary and start up file in RPM.

If you can find a non-intrusive way to do that, sure ... but I don't think that we should expend any great amount of effort, nor uglify the code, in order to cater to a demonstrably broken library on one particular platform. The LC_ALL answer seems the best to me.

regards, tom lane
Re: [HACKERS] locale support
Tom Lane wrote:

> Tatsuo Ishii [EMAIL PROTECTED] writes:
> > > > I know this is not PostgreSQL's fault but the broken locale data on certain platforms. The problem makes it impossible to use PostgreSQL RPMs in Japan. I'm looking for solutions/workarounds for this problem.
> > >
> > > Build a set of RPMs without locale support?
> >
> > > Run it with LC_ALL="C".
> >
> > Both of them seem not ideal solutions for RPM. It would be nice if we could distribute single binary and start up file in RPM.
>
> If you can find a non-intrusive way to do that, sure ... but I don't think that we should expend any great amount of effort, nor uglify the code, in order to cater to a demonstrably broken library on one particular platform.

Tatsuo, what is LC_ALL (or the other locale envvars) set to when you run the program? The man page for setlocale() on my machine documents that main() starts in C or POSIX locale mode by default. The call to setlocale(LC_ALL, "") reads the envvars and sets the locale accordingly. Maybe RedHat's 6.2J isn't setting up the locale properly to begin with?

See what /etc/sysconfig/i18n contains -- if it is empty or doesn't exist, then locale is simply not set up. But you specifically mention the particular locale

Ok, what combinations _do_ work? We _know_ C or POSIX works -- but which ones don't work, on RH 6.1? While I want to make sure that a broken locale data set isn't used, I also want to make sure that a good locale set isn't thrown out, either. Forcing to LC_COLLATE=C is overkill, IMHO. And building without locale support doesn't work, either, because, at least on RH 6.1, strncmp() is buggered to use the locale's collation.

The real solution is for the vendors to fix their broken locales.

-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
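The setlocale() behaviour described above (a program starts in the "C" locale and only picks up the environment after calling setlocale(LC_ALL, "")) can be checked with a small sketch like this one. The two wrapper names are hypothetical; setlocale() itself is the standard C call.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/*
 * Name of the locale currently in effect (query only, nothing is changed).
 * The returned string may be overwritten by a later setlocale() call,
 * so use it before calling setlocale() again.
 */
const char *current_locale_name(void)
{
    const char *name = setlocale(LC_ALL, NULL);

    assert(name != NULL);
    return name;
}

/*
 * Adopt the locale named by LC_ALL, the LC_* variables, and LANG.
 * A NULL result means the request could not be honored, for example
 * because the environment names a locale that is not installed;
 * in that case the current locale is left unchanged.
 */
const char *adopt_environment_locale(void)
{
    return setlocale(LC_ALL, "");
}
```

Printing current_locale_name() before and after adopt_environment_locale() in the postmaster's start-up environment would show whether the init script settings actually reach the server process.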
Re: [HACKERS] locale support
Lamar Owen writes:

> And building without locale support doesn't work, either, because, at least on RH 6.1, strncmp() is buggered to use the locale's collation.

I don't think so. On RH 6.1, strncmp() is the same as it's ever been:

    int
    strncmp (s1, s2, n)
         const char *s1;
         const char *s2;
         size_t n;
    {
      unsigned reg_char c1 = '\0';
      unsigned reg_char c2 = '\0';

      if (n >= 4)
        {
          size_t n4 = n >> 2;
          do
            {
              c1 = (unsigned char) *s1++;
              c2 = (unsigned char) *s2++;
              if (c1 == '\0' || c1 != c2)
                return c1 - c2;
              c1 = (unsigned char) *s1++;
              c2 = (unsigned char) *s2++;
              if (c1 == '\0' || c1 != c2)
                return c1 - c2;
              c1 = (unsigned char) *s1++;
              c2 = (unsigned char) *s2++;
              if (c1 == '\0' || c1 != c2)
                return c1 - c2;
              c1 = (unsigned char) *s1++;
              c2 = (unsigned char) *s2++;
              if (c1 == '\0' || c1 != c2)
                return c1 - c2;
            } while (--n4 > 0);
          n &= 3;
        }

      while (n > 0)
        {
          c1 = (unsigned char) *s1++;
          c2 = (unsigned char) *s2++;
          if (c1 == '\0' || c1 != c2)
            return c1 - c2;
          n--;
        }

      return c1 - c2;
    }

-- 
Peter Eisentraut [EMAIL PROTECTED] http://yi.org/peter-e/
Re: [HACKERS] locale support
Peter Eisentraut wrote:

> Lamar Owen writes:
> > And building without locale support doesn't work, either, because, at least on RH 6.1, strncmp() is buggered to use the locale's collation.
>
> I don't think so. On RH 6.1, strncmp() is the same as it's ever been:
[snip]

Is that the code after any glibc RPM patches are applied? 'Pristine source, perhaps -- but patch like crazy!' Reference the classic 'Reflections on Trusting Trust' by Ken Thompson (which you have probably read already, but, for those on-list who may not have read this classic work on security, you can find the paper at http://www.acm.org/classics/sep95/). Although reading the glibc spec file indicates that patching isn't done in the 'conventional' manner here. (Lovely).

I base my assertion on running test queries on a RedHat 6.1 box over a year ago, using the non-locale 6.5.3 RPMset I distributed at that point (I distributed non-locale RPMs because of its speed being greater in indexing, etc). The user who was having difficulties also tried the non-locale RPMset -- and no change, until removing /etc/sysconfig/i18n.

I've referenced the thread before in the archives; see the message http://www.postgresql.org/mhonarc/pgsql-hackers/1999-12/msg00678.html for the middle of the thread.

But, of course, that was 6.5.3. If 7.x behaves differently, I wouldn't know, as I've not built a 'non-locale' RPMset of 7.x. But, I can if needed. Or try the test queries on your own RH 7 box, with a non-locale build.

-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Re: [HACKERS] locale support
Lamar Owen writes:

> > I don't think so. On RH 6.1, strncmp() is the same as it's ever been:
> [snip]
>
> Is that the code after any glibc RPM patches are applied?

Yes.

> I base my assertion on running test queries on a RedHat 6.1 box over a year ago, using the non-locale 6.5.3 RPMset I distributed at that point (I distributed non-locale RPMs because of its speed being greater in indexing, etc). The user who was having difficulties also tried the non-locale RPMset -- and no change, until removing /etc/sysconfig/i18n.

I recall that thread, but the conclusion that was reached (that strncmp() is at fault in some way) was never proved sufficiently.

-- 
Peter Eisentraut [EMAIL PROTECTED] http://yi.org/peter-e/
[HACKERS] locale support
There is a serious problem with the PostgreSQL locale support on certain platform and locale combinations. That is: ordering, indexes, etc. are simply broken because strcoll() does not work. One such combination is RedHat 6.2J (Japanese localized version) + the ja_JP.eucJP locale. Here is a test program that exposes the problem.

    #include <stdio.h>
    #include <string.h>
    #include <locale.h>

    int main(void)
    {
        static char *s1 = "a Japanese string";
        static char *s2 = "another Japanese string";

        setlocale(LC_ALL, "");
        printf("%d\n", strcoll(s1, s2));
        printf("%d\n", strcoll(s2, s1));
        return 0;
    }

This program prints 0s, which means strcoll() regards those different Japanese strings as the same!

I know this is not PostgreSQL's fault but the broken locale data on certain platforms. The problem makes it impossible to use PostgreSQL RPMs in Japan. I'm looking for solutions/workarounds for this problem. Maybe we should disable locale support at runtime if strcoll() does not work?

Comments?

-- 
Tatsuo Ishii
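The "disable locale support at runtime if strcoll() does not work" idea would need some kind of probe. A minimal sketch of such a probe is shown below; it is a heuristic under stated assumptions, not something PostgreSQL does, and the name strcoll_looks_sane is hypothetical. Note that a correct locale may legitimately collate some byte-wise different strings as equal, so a failed probe is a hint rather than proof of broken locale data.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or +1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/*
 * Heuristic check of strcoll() on one pair of probe strings in the
 * current locale. Returns false when the ordering is not antisymmetric,
 * or when two byte-wise different strings collate as equal (the symptom
 * reported above).
 */
bool strcoll_looks_sane(const char *a, const char *b)
{
    int ab;
    int ba;

    assert(a != NULL && b != NULL);
    ab = strcoll(a, b);
    ba = strcoll(b, a);

    if (sign(ab) != -sign(ba))
        return false;           /* a < b must imply b > a */
    if (ab == 0 && strcmp(a, b) != 0)
        return false;           /* different bytes, yet "equal" */
    return true;
}
```

A caller would first run setlocale(LC_ALL, "") and then probe with a few strings typical for the locale's character set; choosing probe strings that every healthy locale must distinguish is the hard part and is left open here.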
Re: [HACKERS] locale support
On Mon, Feb 12, 2001 at 09:59:37PM -0500, Tom Lane wrote:

> Tatsuo Ishii [EMAIL PROTECTED] writes:
> > I know this is not PostgreSQL's fault but the broken locale data on certain platforms. The problem makes it impossible to use PostgreSQL RPMs in Japan. I'm looking for solutions/workarounds for this problem.
>
> Build a set of RPMs without locale support?

Run it with LC_ALL="C".

Nathan Myers [EMAIL PROTECTED]
Re: [HACKERS] locale support
Tatsuo Ishii [EMAIL PROTECTED] writes:

> I know this is not PostgreSQL's fault but the broken locale data on certain platforms. The problem makes it impossible to use PostgreSQL RPMs in Japan. I'm looking for solutions/workarounds for this problem.

Build a set of RPMs without locale support?

regards, tom lane