In perl.git, the branch blead has been updated <http://perl5.git.perl.org/perl.git/commitdiff/c8f77a9147a316e5b355e134b9cbff371ceb2f4b?hp=3eab96cadf1e1423d9c6e7051d14906c9d419edb>
- Log ----------------------------------------------------------------- commit c8f77a9147a316e5b355e134b9cbff371ceb2f4b Merge: 3eab96c c69a26e Author: Karl Williamson <[email protected]> Date: Sat Jan 4 13:35:33 2014 -0700 Merge LC_NUMERIC locale changes branch into blead LC_NUMERIC hasn't been implemented quite the same way as the other locale categories. And the implementation has been somewhat haphazard. The other categories have implementations where if you're not under locale you simply use different operations. That isn't possible with LC_NUMERIC, as it may need libc functions that are always subject to the current locale no matter what Perl thinks. There are two possible implemantation paths that come to my mind to deal with this. One is to keep correctly set the locale that the libc routines need, and switch to the C locale during those places where it shouldn't be used. The other way is the opposite, to keep things in the C locale generally, and switch when needed. Unfortunately the implementation (prior to this series of commits) used a combination of both possibilities. I am still unsure what the original intent was (not having spent the time to dig through the history), or even if there was a consistent intent. In any event, there has long been infrastructure that facilitates switching back and forth between the current underlying locale and the C locale. However this was not documented until now, and so it is not surprising that people who came later (including me) did not realize it existed, and reinvented things, inconsistently. What I've done here is move to the first implementation path mentioned above. I believe this is the one more likely to show up other bugs during the remainder of the 5.19 development cycle. I have changed and added to the infrastructure, so that it knows whether we should be in the C or the underlying locale, and switches/restores if and only if it is necessary. We can change to the other implementation path later with only minimal code changes. commit c69a26e800153396666dbd931fd30c3885104a2f Author: Karl Williamson <[email protected]> Date: Sat Dec 21 16:51:59 2013 -0700 APItest.xs: #include fakesdio.h This causes the printf's and other stdio functions in the file to instead be the PerlIO equivalents. We aren't trying to test libc after all. See http://markmail.org/message/h26bq75cxlfe3y7r M ext/XS-APItest/APItest.xs commit 5be9e8240b44cf7f6dd4e36e9c8362389aa84661 Author: Karl Williamson <[email protected]> Date: Mon Dec 23 22:03:46 2013 -0700 t/run/locale.t: White-space only Indent because of new block introduced in the previous commit M t/run/locale.t commit bc8ec7cc020d0562094a551b280fd3f32bf5eb04 Author: Karl Williamson <[email protected]> Date: Wed Dec 11 16:25:02 2013 -0700 PATCH: [perl #120723] Setting LC_NUMERIC breaks parsing of constants This is the final patch for [perl #120723], and adds tests for it. LC_NUMERIC Locale handling was broken for code during the compilation phase, such as BEGIN {} blocks. This is because, for some reason, perl.c set LC_NUMERIC unconditionally back to the C locale right after locale initialization. I suspect that was to allow the core's parsing to not be affected by locale. However, earlier commits in this series have added code to change/restore the locale during sections of the parsing where this might matter, so this setting to the C locale is not needed. M perl.c M t/run/locale.t commit e4850248a4955dffbd4457e9114cd52662e3582e Author: Karl Williamson <[email protected]> Date: Mon Dec 16 22:40:26 2013 -0700 numeric.c: White-space only Indent and reflow to 79 columns as a result of the previous commit's adding a block around this code. M numeric.c commit f6dde82eadcc6685581969cf33932bba37fed9d1 Author: Karl Williamson <[email protected]> Date: Wed Dec 11 23:15:25 2013 -0700 toke.c: Set locale for all scan_num() calls; restore instead of reset One call of Perl_scan_num changes the locale around it. However, this function is called in several places, including from outside the file. It is better to set the locale within scan_num() at the point where it matters. And, instead of setting the locale unconditionally, it is better to change it only if it needs to be changed, and restore it to the original. Otherwise the locale can be changed to something unexpected. M toke.c commit 90d6b40e1850c1b0849446bc02c3ffe9a376aaee Author: Karl Williamson <[email protected]> Date: Sat Jan 4 18:53:17 2014 +0100 vutil.c: Use existing macros instead of reinventing them If there had been documentation referring to these macros, I would have known they existed instead of reinventing them (not as well as the originals). M vutil.c commit 371d5d44b054ca1540da57f7be95194f1d92f449 Author: Karl Williamson <[email protected]> Date: Wed Dec 11 23:04:40 2013 -0700 POSIX:strtod() should restore the locale it changed Prior to this commit, the locale remained as strtod() set it to. I could not find a case where this actually was a problem, as the other code is good about checking for and changing the locale where needed. But uses of atoi(), strtol() in locales where there are spaces in numbers likely would break. M ext/POSIX/POSIX.xs M ext/POSIX/lib/POSIX.pm M perl.h M t/run/locale.t commit a2287a13f4e99299ebd9c06f9f98e1bdc7f5089e Author: Karl Williamson <[email protected]> Date: Mon Dec 16 22:34:19 2013 -0700 Use new macros to make sure LC_NUMERIC is correctly set This uses the macros added in the previous commit to make sure the current LC_NUMERIC locale is correct during the operation being done; restoring it to its prior condition afterwards. Outside of 'use locale' the locale should be C; inside it should be the underlying default locale. The macros handle the whole thing. In most of the places here, the code was trying to do what the macros do more elegantly, but there are some additional places where we set the locale correctly around an operation that is affected by it. M numeric.c M pp_ctl.c M sv.c commit 8a11fac97cbf1b0668b5ba645f1fb2709d10f82a Author: Karl Williamson <[email protected]> Date: Mon Dec 16 21:47:27 2013 -0700 perl.h: Add macros and comments These macros are used in making sure the current locale is the correct one for the circumstances. I'm not a fan of this type of code generation macro, but this just extends what is already there, and aren't all that complicated. M perl.h commit b34856cbbf2e4df8b63cb86c031aadea8f590219 Author: Karl Williamson <[email protected]> Date: Wed Dec 11 22:55:43 2013 -0700 perl.h: Revise another locale setting macro We generally don't want to switch to the default underlying locale unless we are in the scope of some form of 'use locale'. Prior to this commit, this code did not allow the switch for 'use locale ":not_characters"'. M perl.h M t/run/locale.t commit 7cdd8b31c7e76d6cb7e1c568f5c9de585b44b4e7 Author: Karl Williamson <[email protected]> Date: Wed Dec 11 22:50:28 2013 -0700 perl.h: Revise a locale setting macro This macro toggles to the C locale. It should not depend on being in the scope of 'use locale' to do that, so remove the check. I couldn't figure out a test case for this, but I'm pretty sure there is a some convoluted scheme that this change averts a bug from. M perl.h commit b07fffd1fe5d47918a00b18619bc086060f79273 Author: Karl Williamson <[email protected]> Date: Wed Dec 11 19:00:15 2013 -0700 locale.c: Avoid writing libc static storage I don't believe this code was causing any problem, but it can overwrite static storage returned by setlocale(). It's safer to create a copy first. M locale.c commit 94d52e62247721b810f3584a444b7d4b243b685b Author: Karl Williamson <[email protected]> Date: Wed Dec 11 18:04:47 2013 -0700 Hide some undocumented functions from perlapi These functions should not be called from any other places than they are now. They have been marked in the public API as undocumented. I presume they are there because they are called from various parts of the Perl core, so can't be static. But this suppresses them from being listed so people won't be tempted to use them. M embed.fnc commit 0d071d5214c9ee1fd0f99bcaa8472bc5174d8efe Author: Karl Williamson <[email protected]> Date: Wed Dec 11 16:53:25 2013 -0700 locale.c: Add comments This documents much of what I learned about how things work while tracking down [perl #120723]. M locale.c commit e19f01cbf528e8d938edaea7a80215352e24c44e Author: Karl Williamson <[email protected]> Date: Thu Dec 12 21:41:10 2013 -0700 locale.c: White-space only Outdent code removed from a block by the previous commit M locale.c commit 6959d69d33a0e59f62289d1c27a5c094387a6933 Author: Karl Williamson <[email protected]> Date: Wed Dec 11 14:30:45 2013 -0700 locale.c: Always set state variables for a new locale This function is called when a new underlying LC_NUMERIC locale has been set. If that locale is the same as the current underlying one, some setup is skipped. However, prior to this commit, more was skipped than should have been. The reason is that even if the underlying locale is the same, it could be that LC_NUMERIC has been toggled to the "C" locale, and so the information could be inconsistent. By always setting the information, we ensure consistency. This commit ia a portion of the fix for [perl #120723]. Tests will be added with the final commit for it. M locale.c commit 87755cd9fa6144875541cb560b7f0487e52c8626 Author: Karl Williamson <[email protected]> Date: Wed Dec 11 16:54:49 2013 -0700 perl.h: Move some macro definitions This places related definitions together in the file. M perl.h ----------------------------------------------------------------------- Summary of changes: embed.fnc | 10 ++--- ext/POSIX/POSIX.xs | 3 +- ext/POSIX/lib/POSIX.pm | 2 +- ext/XS-APItest/APItest.xs | 1 + locale.c | 93 ++++++++++++++++++++++++++++++++++++++--------- numeric.c | 53 +++++++++++++++------------ perl.c | 1 - perl.h | 72 ++++++++++++++++++++++++++++++++---- pp_ctl.c | 4 +- sv.c | 48 +++++------------------- t/run/locale.t | 79 ++++++++++++++++++++++++++++++++++------ toke.c | 10 +---- vutil.c | 15 +------- 13 files changed, 262 insertions(+), 129 deletions(-) diff --git a/embed.fnc b/embed.fnc index 18610ae..d0d980c 100644 --- a/embed.fnc +++ b/embed.fnc @@ -1068,11 +1068,11 @@ ApdO |AV* |get_av |NN const char *name|I32 flags ApdO |HV* |get_hv |NN const char *name|I32 flags ApdO |CV* |get_cv |NN const char* name|I32 flags Apd |CV* |get_cvn_flags |NN const char* name|STRLEN len|I32 flags -ApO |int |init_i18nl10n |int printwarn -ApO |int |init_i18nl14n |int printwarn -ApO |void |new_collate |NULLOK const char* newcoll -ApO |void |new_ctype |NN const char* newctype -ApO |void |new_numeric |NULLOK const char* newcoll +ApOM |int |init_i18nl10n |int printwarn +ApOM |int |init_i18nl14n |int printwarn +ApOM |void |new_collate |NULLOK const char* newcoll +ApOM |void |new_ctype |NN const char* newctype +ApOM |void |new_numeric |NULLOK const char* newcoll Ap |void |set_numeric_local Ap |void |set_numeric_radix Ap |void |set_numeric_standard diff --git a/ext/POSIX/POSIX.xs b/ext/POSIX/POSIX.xs index 3e77eb4..08e459c 100644 --- a/ext/POSIX/POSIX.xs +++ b/ext/POSIX/POSIX.xs @@ -1482,7 +1482,7 @@ strtod(str) double num; char *unparsed; PPCODE: - SET_NUMERIC_LOCAL(); + STORE_NUMERIC_STANDARD_FORCE_LOCAL(); num = strtod(str, &unparsed); PUSHs(sv_2mortal(newSVnv(num))); if (GIMME == G_ARRAY) { @@ -1492,6 +1492,7 @@ strtod(str) else PUSHs(&PL_sv_undef); } + RESTORE_NUMERIC_STANDARD(); void strtol(str, base = 0) diff --git a/ext/POSIX/lib/POSIX.pm b/ext/POSIX/lib/POSIX.pm index 0dd8475..d0bc3fd 100644 --- a/ext/POSIX/lib/POSIX.pm +++ b/ext/POSIX/lib/POSIX.pm @@ -4,7 +4,7 @@ use warnings; our ($AUTOLOAD, %SIGRT); -our $VERSION = '1.37'; +our $VERSION = '1.38'; require XSLoader; diff --git a/ext/XS-APItest/APItest.xs b/ext/XS-APItest/APItest.xs index e352195..a4b91f6 100644 --- a/ext/XS-APItest/APItest.xs +++ b/ext/XS-APItest/APItest.xs @@ -2,6 +2,7 @@ #include "EXTERN.h" #include "perl.h" #include "XSUB.h" +#include "fakesdio.h" # Causes us to use PerlIO below typedef SV *SVREF; typedef PTR_TBL_t *XS__APItest__PtrTable; diff --git a/locale.c b/locale.c index 31aa592..2bae987 100644 --- a/locale.c +++ b/locale.c @@ -22,6 +22,14 @@ /* utility functions for handling locale-specific stuff like what * character represents the decimal point. + * + * All C programs have an underlying locale. Perl generally doesn't pay any + * attention to it except within the scope of a 'use locale'. For most + * categories, it accomplishes this by just using different operations if it is + * in such scope than if not. However, various libc functions called by Perl + * are affected by the LC_NUMERIC category, so there are macros in perl.h that + * are used to toggle between the current locale and the C locale depending on + * the desired behavior of those functions at the moment. */ #include "EXTERN.h" @@ -37,7 +45,8 @@ #ifdef USE_LOCALE /* - * Standardize the locale name from a string returned by 'setlocale'. + * Standardize the locale name from a string returned by 'setlocale', possibly + * modifying that string. * * The typical return value of setlocale() is either * (1) "xx_YY" if the first argument of setlocale() is not LC_ALL @@ -112,13 +121,39 @@ Perl_set_numeric_radix(pTHX) #endif /* USE_LOCALE_NUMERIC */ } -/* - * Set up for a new numeric locale. - */ void Perl_new_numeric(pTHX_ const char *newnum) { #ifdef USE_LOCALE_NUMERIC + + /* Called after all libc setlocale() calls affecting LC_NUMERIC, to tell + * core Perl this and that 'newnum' is the name of the new locale. + * It installs this locale as the current underlying default. + * + * The default locale and the C locale can be toggled between by use of the + * set_numeric_local() and set_numeric_standard() functions, which should + * probably not be called directly, but only via macros like + * SET_NUMERIC_STANDARD() in perl.h. + * + * The toggling is necessary mainly so that a non-dot radix decimal point + * character can be output, while allowing internal calculations to use a + * dot. + * + * This sets several interpreter-level variables: + * PL_numeric_name The default locale's name: a copy of 'newnum' + * PL_numeric_local A boolean indicating if the toggled state is such + * that the current locale is the default locale + * PL_numeric_standard A boolean indicating if the toggled state is such + * that the current locale is the C locale + * Note that both of the last two variables can be true at the same time, + * if the underlying locale is C. (Toggling is a no-op under these + * circumstances.) + * + * Any code changing the locale (outside this file) should use + * POSIX::setlocale, which calls this function. Therefore this function + * should be called directly only from this file and from + * POSIX::setlocale() */ + char *save_newnum; dVAR; @@ -134,15 +169,13 @@ Perl_new_numeric(pTHX_ const char *newnum) if (! PL_numeric_name || strNE(PL_numeric_name, save_newnum)) { Safefree(PL_numeric_name); PL_numeric_name = save_newnum; - PL_numeric_standard = ((*save_newnum == 'C' && save_newnum[1] == '\0') - || strEQ(save_newnum, "POSIX")); - PL_numeric_local = TRUE; - set_numeric_radix(); - } - else { - Safefree(save_newnum); } + PL_numeric_standard = ((*save_newnum == 'C' && save_newnum[1] == '\0') + || strEQ(save_newnum, "POSIX")); + PL_numeric_local = TRUE; + set_numeric_radix(); + #endif /* USE_LOCALE_NUMERIC */ } @@ -152,6 +185,10 @@ Perl_set_numeric_standard(pTHX) #ifdef USE_LOCALE_NUMERIC dVAR; + /* Toggle the LC_NUMERIC locale to C, if not already there. Probably + * should use the macros like SET_NUMERIC_STANDARD() in perl.h instead of + * calling this directly. */ + if (! PL_numeric_standard) { setlocale(LC_NUMERIC, "C"); PL_numeric_standard = TRUE; @@ -168,6 +205,10 @@ Perl_set_numeric_local(pTHX) #ifdef USE_LOCALE_NUMERIC dVAR; + /* Toggle the LC_NUMERIC locale to the current underlying default, if not + * already there. Probably should use the macros like SET_NUMERIC_LOCAL() + * in perl.h instead of calling this directly. */ + if (! PL_numeric_local) { setlocale(LC_NUMERIC, PL_numeric_name); PL_numeric_standard = FALSE; @@ -185,6 +226,18 @@ void Perl_new_ctype(pTHX_ const char *newctype) { #ifdef USE_LOCALE_CTYPE + + /* Called after all libc setlocale() calls affecting LC_CTYPE, to tell + * core Perl this and that 'newctype' is the name of the new locale. + * + * This function sets up the folding arrays for all 256 bytes, assuming + * that tofold() is tolc() since fold case is not a concept in POSIX, + * + * Any code changing the locale (outside this file) should use + * POSIX::setlocale, which calls this function. Therefore this function + * should be called directly only from this file and from + * POSIX::setlocale() */ + dVAR; UV i; @@ -205,13 +258,19 @@ Perl_new_ctype(pTHX_ const char *newctype) PERL_UNUSED_CONTEXT; } -/* - * Set up for a new collation locale. - */ void Perl_new_collate(pTHX_ const char *newcoll) { #ifdef USE_LOCALE_COLLATE + + /* Called after all libc setlocale() calls affecting LC_COLLATE, to tell + * core Perl this and that 'newcoll' is the name of the new locale. + * + * Any code changing the locale (outside this file) should use + * POSIX::setlocale, which calls this function. Therefore this function + * should be called directly only from this file and from + * POSIX::setlocale() */ + dVAR; if (! newcoll) { @@ -628,18 +687,18 @@ S_is_cur_LC_category_utf8(pTHX_ int category) #endif /* First dispose of the trivial cases */ - save_input_locale = stdize_locale(setlocale(category, NULL)); + save_input_locale = setlocale(category, NULL); if (! save_input_locale) { return FALSE; /* XXX maybe should croak */ } + save_input_locale = stdize_locale(savepv(save_input_locale)); if ((*save_input_locale == 'C' && save_input_locale[1] == '\0') || strEQ(save_input_locale, "POSIX")) { + Safefree(save_input_locale); return FALSE; } - save_input_locale = savepv(save_input_locale); - #if defined(HAS_NL_LANGINFO) && defined(CODESET) && defined(USE_LOCALE_CTYPE) { /* Next try nl_langinfo if available */ diff --git a/numeric.c b/numeric.c index b5144f4..d431728 100644 --- a/numeric.c +++ b/numeric.c @@ -853,31 +853,36 @@ Perl_my_atof(pTHX_ const char* s) PERL_ARGS_ASSERT_MY_ATOF; - if (PL_numeric_local && PL_numeric_radix_sv && IN_SOME_LOCALE_FORM) { - const char *standard = NULL, *local = NULL; - bool use_standard_radix; - - /* Look through the string for the first thing that looks like a - * decimal point: either the value in the current locale or the - * standard fallback of '.'. The one which appears earliest in the - * input string is the one that we should have atof look for. Note that - * we have to determine this beforehand because on some systems, - * Perl_atof2 is just a wrapper around the system's atof. */ - standard = strchr(s, '.'); - local = strstr(s, SvPV_nolen(PL_numeric_radix_sv)); - - use_standard_radix = standard && (!local || standard < local); - - if (use_standard_radix) - SET_NUMERIC_STANDARD(); - - Perl_atof2(s, x); - - if (use_standard_radix) - SET_NUMERIC_LOCAL(); + { + DECLARE_STORE_LC_NUMERIC_SET_TO_NEEDED(); + if (PL_numeric_local && PL_numeric_radix_sv && IN_SOME_LOCALE_FORM) { + const char *standard = NULL, *local = NULL; + bool use_standard_radix; + + /* Look through the string for the first thing that looks like a + * decimal point: either the value in the current locale or the + * standard fallback of '.'. The one which appears earliest in the + * input string is the one that we should have atof look for. Note + * that we have to determine this beforehand because on some + * systems, Perl_atof2 is just a wrapper around the system's atof. + * */ + standard = strchr(s, '.'); + local = strstr(s, SvPV_nolen(PL_numeric_radix_sv)); + + use_standard_radix = standard && (!local || standard < local); + + if (use_standard_radix) + SET_NUMERIC_STANDARD(); + + Perl_atof2(s, x); + + if (use_standard_radix) + SET_NUMERIC_LOCAL(); + } + else + Perl_atof2(s, x); + RESTORE_LC_NUMERIC(); } - else - Perl_atof2(s, x); #else Perl_atof2(s, x); #endif diff --git a/perl.c b/perl.c index f0bfac1..5590761 100644 --- a/perl.c +++ b/perl.c @@ -254,7 +254,6 @@ perl_construct(pTHXx) STATUS_ALL_SUCCESS; init_i18nl10n(1); - SET_NUMERIC_STANDARD(); #if defined(LOCAL_PATCH_COUNT) PL_localpatches = local_patches; /* For possible -v */ diff --git a/perl.h b/perl.h index bb2ed3d9..b6e0c3e 100644 --- a/perl.h +++ b/perl.h @@ -5212,12 +5212,6 @@ typedef struct am_table_short AMTS; #ifdef USE_LOCALE_NUMERIC -#define SET_NUMERIC_STANDARD() \ - set_numeric_standard(); - -#define SET_NUMERIC_LOCAL() \ - set_numeric_local(); - /* Returns non-zero If the plain locale pragma without a parameter is in effect */ #define IN_LOCALE_RUNTIME (CopHINTS_get(PL_curcop) & HINT_LOCALE) @@ -5236,12 +5230,68 @@ typedef struct am_table_short AMTS; (IN_PERL_COMPILETIME ? IN_SOME_LOCALE_FORM_COMPILETIME \ : IN_SOME_LOCALE_FORM_RUNTIME) +/* These macros are for toggling between the underlying locale (LOCAL) and the + * C locale. */ + +/* The first set makes sure that the locale is set to C unless within a 'use + * locale's scope; otherwise to the default locale. A function pointer is + * used, which can be declared separately by + * DECLARATION_FOR_STORE_LC_NUMERIC_SET_TO_NEEDED, followed by the actual + * setting (using STORE_LC_NUMERIC_SET_TO_NEEDED()), or the two can be combined + * into one call DECLARE_STORE_LC_NUMERIC_SET_TO_NEEDED(). + * RESTORE_LC_NUMERIC() in all cases restores the locale to what it was before + * these were called */ + +#define DECLARATION_FOR_STORE_LC_NUMERIC_SET_TO_NEEDED \ + void (*_restore_LC_NUMERIC_function)(pTHX) = NULL; + +#define STORE_LC_NUMERIC_SET_TO_NEEDED() \ + if (IN_SOME_LOCALE_FORM) { \ + if (! PL_numeric_local) { \ + SET_NUMERIC_LOCAL(); \ + _restore_LC_NUMERIC_function = &Perl_set_numeric_standard; \ + } \ + } \ + else { \ + if (! PL_numeric_standard) { \ + SET_NUMERIC_STANDARD(); \ + _restore_LC_NUMERIC_function = &Perl_set_numeric_local; \ + } \ + } + +#define DECLARE_STORE_LC_NUMERIC_SET_TO_NEEDED() \ + DECLARATION_FOR_STORE_LC_NUMERIC_SET_TO_NEEDED; \ + STORE_LC_NUMERIC_SET_TO_NEEDED(); + +#define RESTORE_LC_NUMERIC() \ + if (_restore_LC_NUMERIC_function) { \ + _restore_LC_NUMERIC_function(aTHX); \ + } + +/* The next two macros set unconditionally. These should be rarely used, and + * only after being sure that this is what is needed */ +#define SET_NUMERIC_STANDARD() \ + set_numeric_standard(); + +#define SET_NUMERIC_LOCAL() \ + set_numeric_local(); + +/* The rest of these LC_NUMERIC macros toggle to one or the other state, with + * the RESTORE_foo ones called to switch back, but only if need be */ #define STORE_NUMERIC_LOCAL_SET_STANDARD() \ - bool was_local = PL_numeric_local && IN_LOCALE; \ + bool was_local = PL_numeric_local; \ if (was_local) SET_NUMERIC_STANDARD(); +/* Doesn't change to underlying locale unless within the scope of some form of + * 'use locale'. This is the usual desired behavior. */ #define STORE_NUMERIC_STANDARD_SET_LOCAL() \ - bool was_standard = PL_numeric_standard && IN_LOCALE; \ + bool was_standard = PL_numeric_standard && IN_SOME_LOCALE_FORM; \ + if (was_standard) SET_NUMERIC_LOCAL(); + +/* Rarely, we want to change to the underlying locale even outside of 'use + * locale'. This is principally in the POSIX:: functions */ +#define STORE_NUMERIC_STANDARD_FORCE_LOCAL() \ + bool was_standard = PL_numeric_standard; \ if (was_standard) SET_NUMERIC_LOCAL(); #define RESTORE_NUMERIC_LOCAL() \ @@ -5259,8 +5309,14 @@ typedef struct am_table_short AMTS; #define IS_NUMERIC_RADIX(a, b) (0) #define STORE_NUMERIC_LOCAL_SET_STANDARD() /**/ #define STORE_NUMERIC_STANDARD_SET_LOCAL() /**/ +#define STORE_NUMERIC_STANDARD_FORCE_LOCAL() #define RESTORE_NUMERIC_LOCAL() /**/ #define RESTORE_NUMERIC_STANDARD() /**/ +#define DECLARATION_FOR_STORE_LC_NUMERIC_SET_TO_NEEDED +#define STORE_LC_NUMERIC_SET_TO_NEEDED() +#define DECLARE_STORE_LC_NUMERIC_SET_TO_NEEDED() +#define RESTORE_LC_NUMERIC() + #define Atof my_atof #define IN_LOCALE_RUNTIME 0 #define IN_LOCALE_COMPILETIME 0 diff --git a/pp_ctl.c b/pp_ctl.c index 7236921..d47e983 100644 --- a/pp_ctl.c +++ b/pp_ctl.c @@ -836,13 +836,13 @@ PP(pp_formline) } /* Formats aren't yet marked for locales, so assume "yes". */ { - STORE_NUMERIC_STANDARD_SET_LOCAL(); + DECLARE_STORE_LC_NUMERIC_SET_TO_NEEDED(); arg &= ~(FORM_NUM_POINT|FORM_NUM_BLANK); /* we generate fmt ourselves so it is safe */ GCC_DIAG_IGNORE(-Wformat-nonliteral); my_snprintf(t, SvLEN(PL_formtarget) - (t - SvPVX(PL_formtarget)), fmt, (int) fieldsize, (int) arg, value); GCC_DIAG_RESTORE; - RESTORE_NUMERIC_STANDARD(); + RESTORE_LC_NUMERIC(); } t += fieldsize; break; diff --git a/sv.c b/sv.c index 93d6a1a..676340b 100644 --- a/sv.c +++ b/sv.c @@ -2957,31 +2957,19 @@ Perl_sv_2pv_flags(pTHX_ SV *const sv, STRLEN *const lp, const I32 flags) V_Gconvert(SvNVX(sv), NV_DIG, 0, s); SvPOK_on(sv); #else - /* Gconvert always uses the current locale. That's the right thing - * to do if we're supposed to be using locales. But otherwise, we - * want the result to be based on the C locale, so we need to - * change to the C locale during the Gconvert and then change back. - * But if we're already in the C locale (PL_numeric_standard is - * TRUE in that case), no need to do any changing */ - if (PL_numeric_standard || IN_SOME_LOCALE_FORM_RUNTIME) { + { + DECLARE_STORE_LC_NUMERIC_SET_TO_NEEDED(); V_Gconvert(SvNVX(sv), NV_DIG, 0, s); /* If the radix character is UTF-8, and actually is in the * output, turn on the UTF-8 flag for the scalar */ - if (! PL_numeric_standard + if (PL_numeric_local && PL_numeric_radix_sv && SvUTF8(PL_numeric_radix_sv) && instr(s, SvPVX_const(PL_numeric_radix_sv))) { SvUTF8_on(sv); } - } - else { - char *loc = savepv(setlocale(LC_NUMERIC, NULL)); - setlocale(LC_NUMERIC, "C"); - V_Gconvert(SvNVX(sv), NV_DIG, 0, s); - setlocale(LC_NUMERIC, loc); - Safefree(loc); - + RESTORE_LC_NUMERIC(); } /* We don't call SvPOK_on(), because it may come to pass that the @@ -10423,9 +10411,8 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p char ebuf[IV_DIG * 4 + NV_DIG + 32]; /* large enough for "%#.#f" --chip */ /* what about long double NVs? --jhi */ -#ifdef USE_LOCALE_NUMERIC - SV* oldlocale = NULL; -#endif + + DECLARATION_FOR_STORE_LC_NUMERIC_SET_TO_NEEDED; PERL_ARGS_ASSERT_SV_VCATPVFN_FLAGS; PERL_UNUSED_ARG(maybe_tainted); @@ -10478,6 +10465,7 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p a Configure test for this. */ if (digits && digits < sizeof(ebuf) - NV_DIG - 10) { /* 0, point, slack */ + STORE_LC_NUMERIC_SET_TO_NEEDED(); V_Gconvert(nv, (int)digits, 0, ebuf); sv_catpv_nomg(sv, ebuf); if (*ebuf) /* May return an empty string for digits==0 */ @@ -11338,6 +11326,7 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p /* See earlier comment about buggy Gconvert when digits, aka precis is 0 */ if ( c == 'g' && precis) { + STORE_LC_NUMERIC_SET_TO_NEEDED(); V_Gconvert((NV)nv, (int)precis, 0, PL_efloatbuf); /* May return an empty string for digits==0 */ if (*PL_efloatbuf) { @@ -11387,19 +11376,7 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p * where printf() taints but print($float) doesn't. * --jhi */ -#ifdef USE_LOCALE_NUMERIC - if (! PL_numeric_standard && ! IN_SOME_LOCALE_FORM) { - - /* We use a mortal SV, so that any failures (such as if - * warnings are made fatal) won't leak */ - char *oldlocale_string = setlocale(LC_NUMERIC, NULL); - oldlocale = newSVpvn_flags(oldlocale_string, - strlen(oldlocale_string), - SVs_TEMP); - PL_numeric_standard = TRUE; - setlocale(LC_NUMERIC, "C"); - } -#endif + STORE_LC_NUMERIC_SET_TO_NEEDED(); /* hopefully the above makes ptr a very constrained format * that is safe to use, even though it's not literal */ @@ -11581,13 +11558,8 @@ Perl_sv_vcatpvfn_flags(pTHX_ SV *const sv, const char *const pat, const STRLEN p } SvTAINT(sv); -#ifdef USE_LOCALE_NUMERIC /* Done outside loop, so don't have to save/restore + RESTORE_LC_NUMERIC(); /* Done outside loop, so don't have to save/restore each iteration. */ - if (oldlocale) { - setlocale(LC_NUMERIC, SvPVX(oldlocale)); - PL_numeric_standard = FALSE; - } -#endif } /* ========================================================================= diff --git a/t/run/locale.t b/t/run/locale.t index 0ecf9a9..5a6d875 100644 --- a/t/run/locale.t +++ b/t/run/locale.t @@ -20,6 +20,7 @@ BEGIN { } use Config; my $have_setlocale = $Config{d_setlocale} eq 'define'; +my $have_strtod = $Config{d_strtod} eq 'define'; $have_setlocale = 0 if $@; # Visual C's CRT goes silly on strings of the form "en_US.ISO8859-1" # and mingw32 uses said silly CRT @@ -47,17 +48,20 @@ fresh_perl_is("for (qw(@locales)) {\n" . <<'EOF', EOF "", {}, "no locales where LC_NUMERIC breaks"); -fresh_perl_is("for (qw(@locales)) {\n" . <<'EOF', - use POSIX qw(locale_h); - use locale; - my $in = 4.2; - my $s = sprintf "%g", $in; # avoid any constant folding bugs - next if $s eq "4.2"; - print "$_ $s\n"; -} +{ + local $ENV{LC_NUMERIC}; + local $ENV{LC_ALL}; # so it never overrides LC_NUMERIC + fresh_perl_is("for (qw(@locales)) {\n" . <<'EOF', + use POSIX qw(locale_h); + use locale; + my $in = 4.2; + my $s = sprintf "%g", $in; # avoid any constant folding bugs + next if $s eq "4.2"; + print "$_ $s\n"; + } EOF - "", {}, "LC_NUMERIC without setlocale() has no effect in any locale"); - + "", {}, "LC_NUMERIC without environment nor setlocale() has no effect in any locale"); +} # try to find out a locale where LC_NUMERIC makes a difference my $original_locale = setlocale(LC_NUMERIC); @@ -113,6 +117,18 @@ EOF } { + fresh_perl_is(<<'EOF', $difference, {}, +use locale ":not_characters"; +format STDOUT = +@.# +4.179 +. +write; +EOF + "format() looks at LC_NUMERIC with 'use locale \":not_characters\"'"); + } + + { fresh_perl_is(<<'EOF', "4.2", {}, format STDOUT = @.# @@ -181,6 +197,33 @@ EOF "sprintf() and printf() look at LC_NUMERIC regardless of constant folding"); } + for ($different) { + local $ENV{LC_NUMERIC} = $_; + local $ENV{LC_ALL}; # so it never overrides LC_NUMERIC + fresh_perl_is(<<"EOF", + use POSIX qw(locale_h); + + BEGIN { setlocale(LC_NUMERIC, \"$_\"); }; + setlocale(LC_ALL, "C"); + use 5.008; + print setlocale(LC_NUMERIC); +EOF + "C", { }, + "No compile error on v-strings when setting the locale to non-dot radix at compile time when default environment has non-dot radix"); + } + + for ($different) { + local $ENV{LC_NUMERIC} = $_; + local $ENV{LC_ALL}; # so it never overrides LC_NUMERIC + fresh_perl_is(<<"EOF", + use POSIX qw(locale_h); + + BEGIN { print setlocale(LC_NUMERIC), "\n"; }; +EOF + $_, { }, + "Passed in LC_NUMERIC is valid at compilation time"); + } + unless ($comma) { skip("no locale available where LC_NUMERIC is a comma", 2); } @@ -209,8 +252,22 @@ EOF print \$i, "\n"; EOF "1,5\n2,5", {}, "Can do math when radix is a comma"); # [perl 115800] + + unless ($have_strtod) { + skip("no strtod()", 1); + } + else { + fresh_perl_is(<<"EOF", + use POSIX; + POSIX::setlocale(POSIX::LC_NUMERIC(),"$comma"); + my \$one_point_5 = POSIX::strtod("1,5"); + \$one_point_5 =~ s/0+\$//; # Remove any trailing zeros + print \$one_point_5, "\n"; +EOF + "1.5", {}, "POSIX::strtod() uses underlying locale"); + } } } # SKIP -sub last { 11 } +sub last { 15 } diff --git a/toke.c b/toke.c index 8ac0f31..f8ebf53 100644 --- a/toke.c +++ b/toke.c @@ -2407,15 +2407,7 @@ S_force_version(pTHX_ char *s, int guessing) #endif if (*d == ';' || isSPACE(*d) || *d == '{' || *d == '}' || !*d) { SV *ver; -#ifdef USE_LOCALE_NUMERIC - char *loc = savepv(setlocale(LC_NUMERIC, NULL)); - setlocale(LC_NUMERIC, "C"); -#endif s = scan_num(s, &pl_yylval); -#ifdef USE_LOCALE_NUMERIC - setlocale(LC_NUMERIC, loc); - Safefree(loc); -#endif version = pl_yylval.opval; ver = cSVOPx(version)->op_sv; if (SvPOK(ver) && !SvNIOK(ver)) { @@ -11330,9 +11322,11 @@ Perl_scan_num(pTHX_ const char *start, YYSTYPE* lvalp) floatit = TRUE; } if (floatit) { + STORE_NUMERIC_LOCAL_SET_STANDARD(); /* terminate the string */ *d = '\0'; nv = Atof(PL_tokenbuf); + RESTORE_NUMERIC_LOCAL(); sv = newSVnv(nv); } diff --git a/vutil.c b/vutil.c index 06680dd..6cbfc72 100644 --- a/vutil.c +++ b/vutil.c @@ -564,13 +564,7 @@ Perl_upg_version(pTHX_ SV *ver, bool qv) char tbuf[64]; SV *sv = SvNVX(ver) > 10e50 ? newSV(64) : 0; char *buf; -#ifdef USE_LOCALE_NUMERIC - char *loc = NULL; - if (! PL_numeric_standard) { - loc = savepv(setlocale(LC_NUMERIC, NULL)); - setlocale(LC_NUMERIC, "C"); - } -#endif + STORE_NUMERIC_LOCAL_SET_STANDARD(); if (sv) { Perl_sv_catpvf(aTHX_ sv, "%.9"NVff, SvNVX(ver)); len = SvCUR(sv); @@ -580,12 +574,7 @@ Perl_upg_version(pTHX_ SV *ver, bool qv) len = my_snprintf(tbuf, sizeof(tbuf), "%.9"NVff, SvNVX(ver)); buf = tbuf; } -#ifdef USE_LOCALE_NUMERIC - if (loc) { - setlocale(LC_NUMERIC, loc); - Safefree(loc); - } -#endif + RESTORE_NUMERIC_LOCAL(); while (buf[len-1] == '0' && len > 0) len--; if ( buf[len-1] == '.' ) len--; /* eat the trailing decimal */ version = savepvn(buf, len); -- Perl5 Master Repository
