In perl.git, the branch blead has been updated <http://perl5.git.perl.org/perl.git/commitdiff/a0b5329765c5b5a7cbb69a2d628d4cea8f0323c1?hp=68b940afc96546256736bc5d8185075ddc12b205>
- Log ----------------------------------------------------------------- commit a0b5329765c5b5a7cbb69a2d628d4cea8f0323c1 Author: Karl Williamson <[email protected]> Date: Sat Apr 9 11:55:58 2016 -0600 PATCH: [perl #127708] segfault in "$!" in threads This was showing up on Darwin because its setlocale is particularly not thread safe. But the problem is more generic. Using locales just isn't a good idea in a threaded application, as locales are process-wide, not thread-specific. Calling setlocale() changes the locale of all threads at the same time. Further the return of setlocale() is a pointer to internal storage. If you call setlocale() just to learn what it currently is without actually changing the locale, there is no guarantee that another thread won't interrupt your thread, switching the locale to something else before you've had a chance to copy it somewhere else for safekeeping, and the internal storage may have been freed during that interruption, leading to things like segfaults. This is a problem that has been around in the locale handling code for a long time. I don't know why it hasn't shown up before, or maybe it has and is not reproducible because it's timing dependent, and so any problems didn't have tickets written for them, or were rejected as not reproducible. But the problem has been made worse in recent releases. Only fairly recently has perl changed so this problem can occur in programs that don't use locale explicitly: ones that don't 'use locale' nor call setlocale(). This ticket is for such a program that gets a locale-related segfault, without ever touching locales itself. I have done an audit of the perl source, looking for all such occurrences, and this patch fixes all of them that I found. The only other ones, besides "$!", is in converting to/from UTF-8 in cygwin.c. In all such cases, perl briefly switches the locale, does an operation, then switches back. The solution here is to add mutexes to make these areas of code uninterruptible critical sections, so that they can rely on having the locale be what they expect it to be during the entirety of the operation, and can't have a setlocale() from another thread free internal storage. But this is not a general solution. A thread executing these sections can interrupt some other thread doing a setlocale() and zap that. However, we have long cautioned against doing setlocales() in a thread, and that caution was strengthened in a commit made yesterday, fc82b82ef4784a38877f35f56ee16f14934460ce. The current commit should make safe all threaded programs that don't use locales explicitly. It's too close to the 5.24 release to do the rearchitecting required for a general solution. That would involve adding more critical sections. POSIX 2008 introduced new locale handling functions that are thread-safe, and affect only a single thread, and don't require mutexes. The ultimate solution would be to use those tools where available, and to hide from the outer code which set is being used. Thus, perl would be thread-safe on such platforms, while remaining problematic on older ones, though fixed so segfaults wouldn't occur. Tony Cook believes we could emulate the newer behavior on all platforms at a significant performance penalty. I think this would require a lot of code, and suspect there would be glitches in it for XS code. But he may have some ideas about how to do it simply. In any case, this has to wait until post 5.24. Three other notes: It seems to me that the cygwin code could be replaced by equivalent code that doesn't use locales at all. The comments in the source seem to even indicate that. I'll look into doing this in 5.25. Another possible reason that this hasn't shown up in earlier perls is that the problems may have been entirely affecting I/O operations and there are already mutexes involving I/O, and so those could be inadvertently protecting from, or at least minimizing, the problems found here. I haven't investigated to verify this. This commit doesn't add a test. I am asking on p5p for assistance in writing one M cygwin/cygwin.c M locale.c M perl.h M pod/perldelta.pod commit 929e12133425199fce3fe026156c10876ac0cbb8 Author: Karl Williamson <[email protected]> Date: Wed Apr 6 21:46:28 2016 -0600 Add locale mutex This adds a new mutex for use in the next commit for use with locale handling. M dosish.h M embedvar.h M makedef.pl M perl.c M perl.h M perlapi.h M perlvars.h M symbian/symbianish.h M unixish.h M vms/vmsish.h M win32/win32.c M win32/wince.c ----------------------------------------------------------------------- Summary of changes: cygwin/cygwin.c | 41 +++++++++++++++++++++++++++++++++++++---- dosish.h | 6 +++--- embedvar.h | 2 ++ locale.c | 15 ++++++++++++++- makedef.pl | 1 + perl.c | 1 + perl.h | 11 +++++++++++ perlapi.h | 2 ++ perlvars.h | 2 ++ pod/perldelta.pod | 5 ++++- symbian/symbianish.h | 6 ++++-- unixish.h | 14 ++++++++------ vms/vmsish.h | 3 ++- win32/win32.c | 1 + win32/wince.c | 1 + 15 files changed, 93 insertions(+), 18 deletions(-) diff --git a/cygwin/cygwin.c b/cygwin/cygwin.c index 59aa730..24b278f 100644 --- a/cygwin/cygwin.c +++ b/cygwin/cygwin.c @@ -154,7 +154,15 @@ wide_to_utf8(const wchar_t *wbuf) { char *buf; int wlen = 0; - char *oldlocale = setlocale(LC_CTYPE, NULL); + char *oldlocale; + dVAR; + + /* Here and elsewhere in this file, we have a critical section to prevent + * another thread from changing the locale out from under us. XXX But why + * not just use uvchr_to_utf8? */ + LOCALE_LOCK; + + oldlocale = setlocale(LC_CTYPE, NULL); setlocale(LC_CTYPE, "utf-8"); /* uvchr_to_utf8(buf, chr) or Encoding::_bytes_to_utf8(sv, "UCS-2BE"); */ @@ -164,6 +172,9 @@ wide_to_utf8(const wchar_t *wbuf) if (oldlocale) setlocale(LC_CTYPE, oldlocale); else setlocale(LC_CTYPE, "C"); + + LOCALE_UNLOCK; + return buf; } @@ -172,8 +183,13 @@ utf8_to_wide(const char *buf) { wchar_t *wbuf; mbstate_t mbs; - char *oldlocale = setlocale(LC_CTYPE, NULL); + char *oldlocale; int wlen = sizeof(wchar_t)*strlen(buf); + dVAR; + + LOCALE_LOCK; + + oldlocale = setlocale(LC_CTYPE, NULL); setlocale(LC_CTYPE, "utf-8"); wbuf = (wchar_t *) safemalloc(wlen); @@ -182,6 +198,9 @@ utf8_to_wide(const char *buf) if (oldlocale) setlocale(LC_CTYPE, oldlocale); else setlocale(LC_CTYPE, "C"); + + LOCALE_UNLOCK; + return wbuf; } #endif /* cygwin 1.7 */ @@ -280,7 +299,12 @@ XS(XS_Cygwin_win_to_posix_path) wchar_t *wbuf = (wchar_t *) safemalloc(wlen); if (!IN_BYTES) { mbstate_t mbs; - char *oldlocale = setlocale(LC_CTYPE, NULL); + char *oldlocale; + dVAR; + + LOCALE_LOCK; + + oldlocale = setlocale(LC_CTYPE, NULL); setlocale(LC_CTYPE, "utf-8"); /* utf8_to_uvchr_buf(src_path, src_path + wlen, wpath) or Encoding::_utf8_to_bytes(sv, "UCS-2BE"); */ wlen = mbsrtowcs(wpath, (const char**)&src_path, wlen, &mbs); @@ -288,6 +312,8 @@ XS(XS_Cygwin_win_to_posix_path) err = cygwin_conv_path(what, wpath, wbuf, wlen); if (oldlocale) setlocale(LC_CTYPE, oldlocale); else setlocale(LC_CTYPE, "C"); + + LOCALE_UNLOCK; } else { /* use bytes; assume already ucs-2 encoded bytestream */ err = cygwin_conv_path(what, src_path, wbuf, wlen); } @@ -365,7 +391,12 @@ XS(XS_Cygwin_posix_to_win_path) int wlen = sizeof(wchar_t)*(len + 260 + 1001); wchar_t *wpath = (wchar_t *) safemalloc(sizeof(wchar_t)*len); wchar_t *wbuf = (wchar_t *) safemalloc(wlen); - char *oldlocale = setlocale(LC_CTYPE, NULL); + char *oldlocale; + dVAR; + + LOCALE_LOCK; + + oldlocale = setlocale(LC_CTYPE, NULL); setlocale(LC_CTYPE, "utf-8"); if (!IN_BYTES) { mbstate_t mbs; @@ -388,6 +419,8 @@ XS(XS_Cygwin_posix_to_win_path) wcsrtombs(win_path, (const wchar_t **)&wbuf, wlen, NULL); if (oldlocale) setlocale(LC_CTYPE, oldlocale); else setlocale(LC_CTYPE, "C"); + + LOCALE_UNLOCK; } else { int what = absolute_flag ? CCP_POSIX_TO_WIN_A : CCP_POSIX_TO_WIN_A | CCP_RELATIVE; win_path = (char *) safemalloc(len + 260 + 1001); diff --git a/dosish.h b/dosish.h index 2e4e745..c1305cd 100644 --- a/dosish.h +++ b/dosish.h @@ -52,9 +52,9 @@ #endif /* DJGPP */ #ifndef PERL_SYS_TERM_BODY -# define PERL_SYS_TERM_BODY() \ - HINTS_REFCNT_TERM; OP_CHECK_MUTEX_TERM; \ - OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM +# define PERL_SYS_TERM_BODY() \ + HINTS_REFCNT_TERM; OP_CHECK_MUTEX_TERM; \ + OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM; LOCALE_TERM; #endif #define dXSUB_SYS diff --git a/embedvar.h b/embedvar.h index c366d47..7e551be 100644 --- a/embedvar.h +++ b/embedvar.h @@ -387,6 +387,8 @@ #define PL_Ghints_mutex (my_vars->Ghints_mutex) #define PL_keyword_plugin (my_vars->Gkeyword_plugin) #define PL_Gkeyword_plugin (my_vars->Gkeyword_plugin) +#define PL_locale_mutex (my_vars->Glocale_mutex) +#define PL_Glocale_mutex (my_vars->Glocale_mutex) #define PL_malloc_mutex (my_vars->Gmalloc_mutex) #define PL_Gmalloc_mutex (my_vars->Gmalloc_mutex) #define PL_mmap_page_size (my_vars->Gmmap_page_size) diff --git a/locale.c b/locale.c index 6de9893..bf8713a 100644 --- a/locale.c +++ b/locale.c @@ -1809,13 +1809,21 @@ Perl__is_in_locale_category(pTHX_ const bool compiling, const int category) char * Perl_my_strerror(pTHX_ const int errnum) { + dVAR; /* Uses C locale for the error text unless within scope of 'use locale' for * LC_MESSAGES */ #ifdef USE_LOCALE_MESSAGES if (! IN_LC(LC_MESSAGES)) { - char * save_locale = setlocale(LC_MESSAGES, NULL); + char * save_locale; + + /* We have a critical section to prevent another thread from changing + * the locale out from under us (or zapping the buffer returned from + * setlocale() ) */ + LOCALE_LOCK; + + save_locale = setlocale(LC_MESSAGES, NULL); if (! isNAME_C_OR_POSIX(save_locale)) { char *errstr; @@ -1830,8 +1838,13 @@ Perl_my_strerror(pTHX_ const int errnum) { setlocale(LC_MESSAGES, save_locale); Safefree(save_locale); + + LOCALE_UNLOCK; + return errstr; } + + LOCALE_UNLOCK; } #endif diff --git a/makedef.pl b/makedef.pl index 78ee0b1..104696c 100644 --- a/makedef.pl +++ b/makedef.pl @@ -364,6 +364,7 @@ unless ($define{'USE_ITHREADS'}) { PL_regex_padav PL_dollarzero_mutex PL_hints_mutex + PL_locale_mutex PL_my_ctx_mutex PL_perlio_mutex PL_stashpad diff --git a/perl.c b/perl.c index 8c8eec1..228a0d8 100644 --- a/perl.c +++ b/perl.c @@ -93,6 +93,7 @@ S_init_tls_and_interp(PerlInterpreter *my_perl) OP_REFCNT_INIT; OP_CHECK_MUTEX_INIT; HINTS_REFCNT_INIT; + LOCALE_INIT; MUTEX_INIT(&PL_dollarzero_mutex); MUTEX_INIT(&PL_my_ctx_mutex); # endif diff --git a/perl.h b/perl.h index fd716c3..396bc92 100644 --- a/perl.h +++ b/perl.h @@ -5954,6 +5954,13 @@ typedef struct am_table_short AMTS; #ifdef USE_LOCALE /* These locale things are all subject to change */ + +# define LOCALE_INIT MUTEX_INIT(&PL_locale_mutex) +# define LOCALE_TERM MUTEX_DESTROY(&PL_locale_mutex) + +# define LOCALE_LOCK MUTEX_LOCK(&PL_locale_mutex) +# define LOCALE_UNLOCK MUTEX_UNLOCK(&PL_locale_mutex) + /* Returns TRUE if the plain locale pragma without a parameter is in effect */ # define IN_LOCALE_RUNTIME cBOOL(CopHINTS_get(PL_curcop) & HINT_LOCALE) @@ -6037,6 +6044,10 @@ typedef struct am_table_short AMTS; # endif /* PERL_CORE or PERL_IN_XSUB_RE */ #else /* No locale usage */ +# define LOCALE_INIT +# define LOCALE_TERM +# define LOCALE_LOCK +# define LOCALE_UNLOCK # define IN_LOCALE_RUNTIME 0 # define IN_SOME_LOCALE_FORM_RUNTIME 0 # define IN_LOCALE_COMPILETIME 0 diff --git a/perlapi.h b/perlapi.h index 910f789..7aa4455 100644 --- a/perlapi.h +++ b/perlapi.h @@ -123,6 +123,8 @@ END_EXTERN_C #define PL_hints_mutex (*Perl_Ghints_mutex_ptr(NULL)) #undef PL_keyword_plugin #define PL_keyword_plugin (*Perl_Gkeyword_plugin_ptr(NULL)) +#undef PL_locale_mutex +#define PL_locale_mutex (*Perl_Glocale_mutex_ptr(NULL)) #undef PL_malloc_mutex #define PL_malloc_mutex (*Perl_Gmalloc_mutex_ptr(NULL)) #undef PL_mmap_page_size diff --git a/perlvars.h b/perlvars.h index 86a369e..5466294 100644 --- a/perlvars.h +++ b/perlvars.h @@ -99,6 +99,8 @@ PERLVARI(G, mmap_page_size, IV, 0) #if defined(USE_ITHREADS) PERLVAR(G, hints_mutex, perl_mutex) /* Mutex for refcounted he refcounting */ +PERLVAR(G, locale_mutex, perl_mutex) /* Mutex for setlocale() changing */ + #endif #ifdef DEBUGGING diff --git a/pod/perldelta.pod b/pod/perldelta.pod index 57cf7c1..4dcc7af 100644 --- a/pod/perldelta.pod +++ b/pod/perldelta.pod @@ -341,7 +341,10 @@ files in F<ext/> and F<lib/> are best summarized in L</Modules and Pragmata>. =item * -XXX +A race condition which occurred when computing C<"$!"> with threads +activated has been fixed. This showed up only on Darwin platforms. A +related problem on Cygwin platforms involving UTF-8 strings has also +been fixed. [perl #127708] =back diff --git a/symbian/symbianish.h b/symbian/symbianish.h index da5332c..80c580b 100644 --- a/symbian/symbianish.h +++ b/symbian/symbianish.h @@ -120,8 +120,10 @@ #define Mkdir(path,mode) mkdir((path),(mode)) #ifndef PERL_SYS_TERM_BODY -#define PERL_SYS_TERM_BODY() HINTS_REFCNT_TERM; OP_REFCNT_TERM; \ - PERLIO_TERM; MALLOC_TERM; CloseSTDLIB(); +#define PERL_SYS_TERM_BODY() HINTS_REFCNT_TERM; OP_REFCNT_TERM; \ + PERLIO_TERM; MALLOC_TERM; CloseSTDLIB(); \ + PERL_LOCALE_TERM + #endif #define BIT_BUCKET "NUL:" diff --git a/unixish.h b/unixish.h index 5b57138..e05cb6a 100644 --- a/unixish.h +++ b/unixish.h @@ -138,9 +138,10 @@ int afstat(int fd, struct stat *statb); #if defined(__amigaos4__) # define PERL_SYS_INIT_BODY(c,v) \ MALLOC_CHECK_TAINT2(*c,*v) PERL_FPU_INIT; PERLIO_INIT; MALLOC_INIT; amigaos4_init_fork_array(); amigaos4_init_environ_sema(); -# define PERL_SYS_TERM_BODY() \ - HINTS_REFCNT_TERM; OP_CHECK_MUTEX_TERM; \ - OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM; amigaos4_dispose_fork_array(); +# define PERL_SYS_TERM_BODY() \ + HINTS_REFCNT_TERM; OP_CHECK_MUTEX_TERM; \ + OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM; LOCALE_TERM; \ + amigaos4_dispose_fork_array(); #endif #ifndef PERL_SYS_INIT_BODY @@ -149,9 +150,10 @@ int afstat(int fd, struct stat *statb); #endif #ifndef PERL_SYS_TERM_BODY -# define PERL_SYS_TERM_BODY() \ - HINTS_REFCNT_TERM; OP_CHECK_MUTEX_TERM; \ - OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM; +# define PERL_SYS_TERM_BODY() \ + HINTS_REFCNT_TERM; OP_CHECK_MUTEX_TERM; \ + OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM; \ + LOCALE_TERM; #endif diff --git a/vms/vmsish.h b/vms/vmsish.h index cf1c9a8..1aea829 100644 --- a/vms/vmsish.h +++ b/vms/vmsish.h @@ -315,7 +315,8 @@ struct interp_intern { #define BIT_BUCKET "/dev/null" #define PERL_SYS_INIT_BODY(c,v) MALLOC_CHECK_TAINT2(*c,*v) vms_image_init((c),(v)); PERLIO_INIT; MALLOC_INIT -#define PERL_SYS_TERM_BODY() HINTS_REFCNT_TERM; OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM +#define PERL_SYS_TERM_BODY() HINTS_REFCNT_TERM; OP_REFCNT_TERM; \ + PERLIO_TERM; MALLOC_TERM; LOCALE_TERM #define dXSUB_SYS #define HAS_KILL #define HAS_WAIT diff --git a/win32/win32.c b/win32/win32.c index 651b97b..6ac73e2 100644 --- a/win32/win32.c +++ b/win32/win32.c @@ -4529,6 +4529,7 @@ Perl_win32_term(void) OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM; + LOCALE_TERM; #ifndef WIN32_NO_REGISTRY /* handles might be NULL, RegCloseKey then returns ERROR_INVALID_HANDLE but no point of checking and we can't die() at this point */ diff --git a/win32/wince.c b/win32/wince.c index bcc66c8..ce06481 100644 --- a/win32/wince.c +++ b/win32/wince.c @@ -2705,6 +2705,7 @@ Perl_win32_term(void) OP_REFCNT_TERM; PERLIO_TERM; MALLOC_TERM; + LOCALE_TERM; } void -- Perl5 Master Repository
