On 3/28/19 8:03 AM, Ingo Schwarze wrote:
Hi Ted and Karl,
Ted Unangst wrote on Wed, Mar 27, 2019 at 06:03:29PM -0400:
Karl Williamson wrote:
Ted Unangst wrote:
This should fix things.
There is no bug and nothing to fix.
Our existing implementation fully conforms to the POSIX specification.
POSIX explicitly says:
It is unspecified whether the locale object pointed to by base
shall be modified, or freed and a new locale object created.
I can see how you might be able to argue for your interpretation. I
believe the wording in the spec is poor in this case (and it isn't the
only such place). I, and every other implementer takes the above text
to mean, that it's up to the implementation as to how to combine 'base'
with the new locale specified by the other parameters. 'base' can be
modified and returned, or 'base' can be freed with some new locale which
is the combination of 'base' and the other parameters. I don't take
that wording to mean that 'base' can be irrelevant. That can't be the
intent, as that would mean wildly unportable code, and no way for the
program to specify an update to a category in isolation from the others.
The errata page yields a 404.
There is further support for your position later on when it says "If a
completely new locale object is created, the data for all sections not
requested by category_mask shall be taken from the default locale." I
think they mean a "completely new locale" where there was no component
from 'base'.
But, it doesn't really matter here, since you're planning to change in
the interests of portability. Thank you from me, us, and everyone who
wants to use this function on other distros.
I chose to take the second option: free the object pointed to by
base and create a new locale object, because that is the simpler
one of the two conforming possibilities, and we generally prefer
simplicity. Note that nothing in POSIX requires copying anything
from base when creating a new locale object.
That said, if you think it helps compatibility with other
implementations, i'm OK with changing the behaviour and implementing
the other conforming possibility, i.e. modify the object pointed
to by base instead. While the complication of the code is minimal,
the manual page requires substantial rewriting and becomes noticeably
harder to understand, but oh well, if it improves compatibility...
As you already observed, the first patch sent by Ted was wrong.
The second patch is correct, by i dislike the behaviour change for
the case base == LC_GLOBAL_LOCALE and for the case of an invalid
base. Yes, POSIX explicity stipulates those two cases result in
undefined behaviour, but returning invalid results from newlocale(3)
is not nice. I'd much prefer to always return a valid locale, even
for input causing undefined behaviour.
In any case, in the commit message, do not call it a "bug fix",
describe it as a change for compatibility with other systems.
Ted, feel free to either commit this version or tell me to commit.
Yours,
Ingo
Index: locale/newlocale.3
===================================================================
RCS file: /cvs/src/lib/libc/locale/newlocale.3,v
retrieving revision 1.1
diff -u -p -r1.1 newlocale.3
--- locale/newlocale.3 5 Sep 2017 03:16:13 -0000 1.1
+++ locale/newlocale.3 28 Mar 2019 13:56:37 -0000
@@ -46,11 +46,23 @@ creates a new locale object for use with
and with many functions that accept
.Vt locale_t
arguments.
+Locale categories not contained in the
+.Fa mask
+are copied from
+.Fa oldloc
+to the new locale object, or from the
+.Qq C
+locale if
+.Fa oldloc
+is
+.Po Vt locale_t Pc Ns 0 .
.Pp
On
.Ox ,
+.Fa locname
+only affects the return value if
.Fa mask
-is only meaningful if it includes
+includes
.Dv LC_CTYPE_MASK ,
and
.Fa locname
@@ -60,18 +72,11 @@ or
.Qq POSIX ,
if it ends with
.Qq .UTF-8 ,
-or if it is an empty string; otherwise,
-.Fn newlocale
-always returns the C locale.
-.Pp
-On
-.Ox ,
-.Fn newlocale
-ignores
-.Fa oldloc ,
-and passing
-.Po Vt locale_t Pc Ns 0
-is recommended.
+or if it is an empty string.
+Other
+.Fa locname
+arguments have the same effect as
+.Qq C .
.Pp
The function
.Fn duplocale
@@ -159,3 +164,11 @@ These functions conform to
.Sh HISTORY
These functions have been available since
.Ox 6.2 .
+.Sh CAVEATS
+Calling
+.Fn newlocale
+with an
+.Fa oldloc
+argument of
+.Dv LC_GLOBAL_LOCALE
+results in undefined behaviour.
Index: locale/newlocale.c
===================================================================
RCS file: /cvs/src/lib/libc/locale/newlocale.c,v
retrieving revision 1.1
diff -u -p -r1.1 newlocale.c
--- locale/newlocale.c 5 Sep 2017 03:16:13 -0000 1.1
+++ locale/newlocale.c 28 Mar 2019 13:56:37 -0000
@@ -22,8 +22,7 @@
#include "rune.h"
locale_t
-newlocale(int mask, const char *locname,
- locale_t oldloc __attribute__((__unused__)))
+newlocale(int mask, const char *locname, locale_t oldloc)
{
int ic, flag;
@@ -45,7 +44,7 @@ newlocale(int mask, const char *locname,
/* Only character encoding has thread-specific effects. */
if ((mask & LC_CTYPE_MASK) == 0)
- return _LOCALE_C;
+ return oldloc == _LOCALE_UTF8 ? _LOCALE_UTF8 : _LOCALE_C;
/* The following may initialize UTF-8 for later use. */
if ((locname = _get_locname(LC_CTYPE, locname)) == NULL) {