On 09/15/2014 07:51 AM, Noah Misch wrote:
> libintl replaces setlocale(). Its setlocale(LC_x, "") uses OS-specific APIs to determine the default locale when $LANG and similar environment variables are empty, as they are during "make check NO_LOCALE=1". On OS X, it calls[1] CFLocaleCopyCurrent(), which in turn spins up a thread. See the end of this message for the postmaster thread stacks active upon hitting a breakpoint set at _dispatch_mgr_thread.
Ugh. I'd call that a bug in libintl. setlocale() has no business making the process multi-threaded.
Do we have the same problem in backends? At a quick glance, aside from postmaster we only use PG_SETMASK(&BlockSig) in signal handlers, to prevent another signal handler from running concurrently.
> I see two options for fixing this in pg_perm_setlocale(LC_x, ""):
>
> 1. Fork, call setlocale(LC_x, "") in the child, pass back the effective locale name through a pipe, and pass that name to setlocale() in the original process. The short-lived child will get the extra threads, and the postmaster will remain clean.
>
> 2. On OS X, check for relevant environment variables. Finding none, set LC_x=C before calling setlocale(LC_x, ""). A variation is to raise ereport(FATAL) if sufficient environment variables aren't in place. Either way ensures the libintl setlocale() will never call CFLocaleCopyCurrent(). This is simpler than (1), but it entails a behavior change: "LANG= initdb" will use LANG=C or fail rather than use the OS X user account locale.
>
> I'm skeptical of the value of looking up locale information using other OS X facilities when the usual environment variables are inconclusive, but I see no clear cause to reverse that decision now. I lean toward (1).
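For reference, option (1) might look roughly like the sketch below. It is only an illustration under stated assumptions: setlocale_via_child() is a hypothetical helper name, not an existing PostgreSQL function, and error reporting is reduced to a 0/-1 return value rather than ereport().

```c
#include <assert.h>
#include <errno.h>
#include <locale.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve the default locale for 'category' in a
 * short-lived child, then apply the resulting name in this process.
 * Returns 0 on success and -1 on failure; 'name' receives the locale name.
 */
static int
setlocale_via_child(int category, char *name, size_t namelen)
{
    int     fds[2];
    pid_t   pid, w;
    ssize_t n;
    size_t  used = 0;
    int     status;

    assert(namelen > 1);
    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0)
    {
        /* Child: any thread setlocale() starts here dies with the child. */
        const char *res;
        size_t      len;

        close(fds[0]);
        res = setlocale(category, "");
        if (res == NULL)
            _exit(1);
        len = strlen(res);
        if (write(fds[1], res, len) != (ssize_t) len)
            _exit(1);
        _exit(0);
    }

    /* Parent: read the effective locale name until EOF or buffer full. */
    close(fds[1]);
    while (used < namelen - 1)
    {
        n = read(fds[0], name + used, namelen - 1 - used);
        if (n > 0)
            used += (size_t) n;
        else if (n == 0 || errno != EINTR)
            break;
    }
    name[used] = '\0';
    close(fds[0]);

    while ((w = waitpid(pid, &status, 0)) < 0 && errno == EINTR)
        ;
    if (w != pid || !WIFEXITED(status) || WEXITSTATUS(status) != 0 || used == 0)
        return -1;

    /* Pass the explicit name, never "", to setlocale() in this process. */
    return setlocale(category, name) != NULL ? 0 : -1;
}
```

A caller would invoke it as setlocale_via_child(LC_CTYPE, buf, sizeof(buf)) in place of setlocale(LC_CTYPE, ""). Whether setlocale() with an explicit name is guaranteed to stay single-threaded is not something this sketch can promise.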
Both of those are horrible hacks. And who's to say that calling setlocale(LC_x, "foo") won't also call some function that makes the process multi-threaded? If not in any current OS X release, it might still happen in a future one.
One idea would be to use an extra pthread mutex or similar, in addition to PG_SETMASK(). Whenever you do PG_SETMASK(&BlockSig), also grab the mutex, and release it when you do PG_SETMASK(&UnBlockSig).
It would be nice to stop doing non-trivial things in the signal handler in the first place. It's pretty scary, even though it works when the process is single-threaded. I believe the reason it's currently implemented like that is the same set of problems that the latch code solves with the self-pipe trick: select() is not interrupted by a signal on all platforms, and even if it were, you would need pselect(), which is not available (or does not work correctly even if it exists) on all platforms. I think we could use a latch in postmaster too.
- Heikki