Re: POSIX gettext() and uselocale()

2022-01-17 Thread Bruno Haible via austin-group-l at The Open Group
Geoff Clare wrote:
> The current draft says:
> 
> The returned string may be invalidated by a subsequent call to
> bind_textdomain_codeset(), bindtextdomain(), setlocale(), or
> textdomain() in the same process, or a subsequent call to
> uselocale() in the same thread, except for calls that only query
> values.
> 
> [...]
> 
> > I think that specifying gettext() to be so restricted is not useful.
> > It would make more sense to allow concurrent uselocale() calls.
> 
> The current draft text allows concurrent uselocale() calls.

This is better; thanks. Still, I don't think it is sufficient or consistent.

OBJECTION 1:
  It requires applications to delegate some calls to separate threads.
  For example, take an application that regularly updates some UI and
  also occasionally writes a JSON file.

  For the UI updates, it will need to call gettext(). Let's assume that
  the UI caches the strings that the application passes it,
  e.g. for fast rerendering. This is the typical way a UI is built. E.g.
    Gtk+:  label1 = gtk_label_new (gettext ("Hello, world!"));
    Qt:    label1 = new QLabel (gettext ("Hello, world!"), panel);

  For writing data in JSON format [1], it needs to convert
    - strings to UTF-8 encoding,
    - numbers to decimal representation, with '.' as decimal separator.
  For converting numbers to decimal, since the standard has strtod()
  but no strtod_l() [2], the most immediate implementation is to use
  uselocale() with a "C" locale argument, then call strtod(), then
  switch back to the previous locale using uselocale().
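
A minimal sketch of the save/switch/restore bracket described above. The message names strtod(); this sketch applies the same bracket to snprintf(), on the assumption that snprintf() is the call that produces the decimal text when writing JSON. The helper name format_double_c and the "%.17g" format are illustrative choices, not taken from the draft or from the message.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: write VALUE into BUF as decimal text with '.' as
   the decimal separator, whatever locale the calling thread is using.
   Returns the snprintf() result, or -1 if the "C" locale object could
   not be created. */
int format_double_c(char *buf, size_t size, double value)
{
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    locale_t previous = uselocale(c_loc);   /* affects this thread only */
    int n = snprintf(buf, size, "%.17g", value);
    uselocale(previous);                    /* switch back */

    freelocale(c_loc);                      /* no longer in use by any thread */
    return n;
}
```

Under the current draft wording, both uselocale() calls in such a helper are calls that may invalidate strings previously returned by gettext() in the same thread, which is the situation this objection is about.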

  With the current wording, converting a number to decimal like this
  will invalidate many of the strings that the UI is holding.

  Thus, the application will need to move its JSON file writing to a
  separate thread. This is a big architectural requirement.

OBJECTION 2:
  It is inconsistent with other parts of POSIX. For localeconv() [3]
  the wording is
    "... might be overwritten by subsequent calls to setlocale() with the
     categories LC_ALL, LC_MONETARY, or LC_NUMERIC, or by calls to
     uselocale() which change the categories LC_MONETARY or LC_NUMERIC."

  To make things consistent, you would need to change the text for gettext
  from
    "call to uselocale() in the same thread"
  to
    "call to uselocale() in the same thread which changes the category
     LC_MESSAGES (for gettext(), gettext_l(), dgettext(), dgettext_l())
     or the category passed to dcgettext(), dcgettext_l()"
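
To illustrate the distinction this wording draws, here is a sketch of a locale switch that changes LC_NUMERIC only and leaves LC_MESSAGES as it was. numeric_c_variant is a hypothetical helper name; the sketch assumes a POSIX.1-2008 system that provides duplocale() and newlocale().

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>

/* Hypothetical helper: return a new locale object equal to the calling
   thread's current locale in every category except LC_NUMERIC, which is
   taken from the "C" locale.  The caller releases it with freelocale().
   Returns (locale_t)0 on failure. */
locale_t numeric_c_variant(void)
{
    /* uselocale((locale_t)0) only queries; duplocale() also accepts
       LC_GLOBAL_LOCALE and then copies the global locale. */
    locale_t copy = duplocale(uselocale((locale_t)0));
    if (copy == (locale_t)0)
        return (locale_t)0;

    /* Replace only the LC_NUMERIC category of the copy. */
    locale_t result = newlocale(LC_NUMERIC_MASK, "C", copy);
    if (result == (locale_t)0)
        freelocale(copy);   /* on failure the base object is still ours */
    return result;
}
```

A thread that brackets its number conversion with uselocale(numeric_c_variant()) keeps the same LC_MESSAGES throughout, so with the narrower wording such a switch would not be among the calls that may invalidate gettext() results.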

Bruno

[1] https://datatracker.ietf.org/doc/html/rfc8259
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/strtod.html
[3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/localeconv.html





Re: POSIX gettext() and uselocale()

2022-01-17 Thread Geoff Clare via austin-group-l at The Open Group
Bruno Haible wrote, on 16 Jan 2022:
>
> [First sent on 2021-05-03. Resending because it has not been handled.]

It has been handled.  This is how I reported the change to
austin-group-l on 25th May 2021 (in a reply to Jilles Tjoelker):

| In yesterday's teleconference we updated the proposed text to say
| that the returned string may be invalidated by a subsequent call to
| uselocale() in the same thread (and clarified that for the other
| functions it's a subsequent call in the same process).

> https://posix.rhansen.org/p/gettext_draft
> says (line 358):
>
>   "The returned string may be invalidated by a subsequent call to
>    bind_textdomain_codeset(), bindtextdomain(), setlocale(),
>    textdomain(), or uselocale()."

The current draft says:

The returned string may be invalidated by a subsequent call to
bind_textdomain_codeset(), bindtextdomain(), setlocale(), or
textdomain() in the same process, or a subsequent call to
uselocale() in the same thread, except for calls that only query
values.

[...]

> I think that specifying gettext() to be so restricted is not useful.
> It would make more sense to allow concurrent uselocale() calls.

The current draft text allows concurrent uselocale() calls.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: POSIX gettext() and uselocale()

2022-01-16 Thread shwaresyst via austin-group-l at The Open Group
Historically, gettext domains are process-wide, which makes their use in
multi-threaded applications problematic to begin with. The *_l versions only
partially address this. uselocale() is in the list to cover the case where
the same locale is used both through uselocale() and through one or more of
the *_l versions: a second uselocale() call after the retrievals, with a
different locale, may cause the memory mapping that many implementations use
for .mo files to be released on the next *_l call. Yes, it is not the
uselocale() call itself that causes these releases, or at least it shouldn't
be, but since it is the root cause, IMHO it should stay in the list.
 
On Sun, Jan 16, 2022 at 4:11 PM, Bruno Haible via austin-group-l at The Open
Group wrote:

[First sent on 2021-05-03. Resending because it has not been handled.]

https://posix.rhansen.org/p/gettext_draft
says (line 358):

  "The returned string may be invalidated by a subsequent call to
  bind_textdomain_codeset(), bindtextdomain(), setlocale(),
  textdomain(), or uselocale()."

While in most programs setlocale(), textdomain(), bindtextdomain(),
bind_textdomain_codeset() are being called at the beginning of the
program execution, before any call to gettext(), the situation is
very different for uselocale().

1) uselocale() is meant to have effects ONLY on the thread in which it
  is called.

2) uselocale() is a helper function to implement *_l functions where
  the POSIX standard does not specify them or the system does not have
  them.
  For example, when a program wants to have a function to parse
  a number, recognizing only the ASCII digits and only '.' as decimal
  separator, a reliable way to implement such a function is by calling
  uselocale of the "C" locale, strtod(), and then uselocale() again
  to switch the thread back to the previous locale.

  If POSIX did not have uselocale(), it would need to provide many
  more *_l functions.

If the gettext() result may be invalidated by a uselocale() call (in
any other thread!), this would mean that

  ** Programs can use gettext() or uselocale() but not both. **

and - more or less -

  ** Multithreaded programs that use libraries (that may use uselocale())
    cannot use gettext(). **

I think that specifying gettext() to be so restricted is not useful.
It would make more sense to allow concurrent uselocale() calls.

Proposed wording:

  "The returned string may be invalidated by a subsequent call to
  bind_textdomain_codeset(), bindtextdomain(), setlocale(),
  or textdomain()."



  


POSIX gettext() and uselocale()

2022-01-16 Thread Bruno Haible via austin-group-l at The Open Group
[First sent on 2021-05-03. Resending because it has not been handled.]

https://posix.rhansen.org/p/gettext_draft
says (line 358):

  "The returned string may be invalidated by a subsequent call to
   bind_textdomain_codeset(), bindtextdomain(), setlocale(),
   textdomain(), or uselocale()."

While in most programs setlocale(), textdomain(), bindtextdomain(),
bind_textdomain_codeset() are being called at the beginning of the
program execution, before any call to gettext(), the situation is
very different for uselocale().

1) uselocale() is meant to have effects ONLY on the thread in which it
   is called.

2) uselocale() is a helper function to implement *_l functions where
   the POSIX standard does not specify them or the system does not have
   them.
   For example, when a program wants to have a function to parse
   a number, recognizing only the ASCII digits and only '.' as decimal
   separator, a reliable way to implement such a function is by calling
   uselocale of the "C" locale, strtod(), and then uselocale() again
   to switch the thread back to the previous locale.

   If POSIX did not have uselocale(), it would need to provide many
   more *_l functions.

If the gettext() result may be invalidated by a uselocale() call (in
any other thread!), this would mean that

  ** Programs can use gettext() or uselocale() but not both. **

and - more or less -

  ** Multithreaded programs that use libraries (that may use uselocale())
     cannot use gettext(). **

I think that specifying gettext() to be so restricted is not useful.
It would make more sense to allow concurrent uselocale() calls.

Proposed wording:

  "The returned string may be invalidated by a subsequent call to
   bind_textdomain_codeset(), bindtextdomain(), setlocale(),
   or textdomain()."





Re: POSIX gettext() and uselocale()

2021-05-25 Thread Geoff Clare via austin-group-l at The Open Group
Jilles Tjoelker wrote, on 24 May 2021:
>
> On Tue, May 04, 2021 at 01:07:39AM +0200, Bruno Haible via
> austin-group-l at The Open Group wrote:
> > https://posix.rhansen.org/p/gettext_split
> > says (line 92):
>
> >   "The returned string may be invalidated by a subsequent call to
> >    bind_textdomain_codeset(), bindtextdomain(), setlocale(),
> >    textdomain(), or uselocale()."
> 
[...]
> 
> > I think that specifying gettext() to be so restricted is not useful.
> > It would make more sense to allow concurrent uselocale() calls.
> 
> > Proposed wording:
>
> >   "The returned string may be invalidated by a subsequent call to
> >    bind_textdomain_codeset(), bindtextdomain(), setlocale(),
> >    or textdomain()."
> 
> This may be a bit too weak. Now the implementation can never free a
> string that was returned by a gettext call on a thread with uselocale()
> active, [...]

In yesterday's teleconference we updated the proposed text to say
that the returned string may be invalidated by a subsequent call to
uselocale() in the same thread (and clarified that for the other
functions it's a subsequent call in the same process).

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: POSIX gettext() and uselocale()

2021-05-24 Thread Jilles Tjoelker via austin-group-l at The Open Group
On Tue, May 04, 2021 at 01:07:39AM +0200, Bruno Haible via
austin-group-l at The Open Group wrote:
> https://posix.rhansen.org/p/gettext_split
> says (line 92):

>   "The returned string may be invalidated by a subsequent call to
>    bind_textdomain_codeset(), bindtextdomain(), setlocale(),
>    textdomain(), or uselocale()."

> While in most programs setlocale(), textdomain(), bindtextdomain(),
> bind_textdomain_codeset() are being called at the beginning of the
> program execution, before any call to gettext(), the situation is
> very different for uselocale().

> 1) uselocale() is meant to have effects ONLY on the thread in which it
>    is called.

> 2) uselocale() is a helper function to implement *_l functions where
>    the POSIX standard does not specify them or the system does not have
>    them.
>    For example, when a program wants to have a function to parse
>    a number, recognizing only the ASCII digits and only '.' as decimal
>    separator, a reliable way to implement such a function is by calling
>    uselocale of the "C" locale, strtod(), and then uselocale() again
>    to switch the thread back to the previous locale.

>    If POSIX did not have uselocale(), it would need to provide many
>    more *_l functions.

> If the gettext() result may be invalidated by a uselocale() call (in
> any other thread!), this would mean that

>   ** Programs can use gettext() or uselocale() but not both. **

> and - more or less -

>   ** Multithreaded programs that use libraries (that may use uselocale())
>      cannot use gettext(). **

> I think that specifying gettext() to be so restricted is not useful.
> It would make more sense to allow concurrent uselocale() calls.

> Proposed wording:

>   "The returned string may be invalidated by a subsequent call to
>    bind_textdomain_codeset(), bindtextdomain(), setlocale(),
>    or textdomain()."

This may be a bit too weak. Now the implementation can never free a
string that was returned by a gettext call on a thread with uselocale()
active, while logically the string may be owned by the locale and could
be freed if that locale is no longer set on any thread and freelocale()
has been called on it as needed.
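
Seen from the application side, this ownership model suggests taking a private copy of a translation while the locale it came from is still installed. A minimal sketch of that idea, under the assumption that the implementation may release the string once the locale is switched away or freed; translate_copy is a hypothetical helper name, and <libintl.h> is assumed to be available.

```c
#define _POSIX_C_SOURCE 200809L
#include <libintl.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: look up MSGID while LOC is the calling thread's
   locale and return a heap copy that the caller owns and must free().
   Returns NULL if the locale switch or the allocation fails. */
char *translate_copy(locale_t loc, const char *msgid)
{
    locale_t previous = uselocale(loc);
    if (previous == (locale_t)0)
        return NULL;

    /* Copy before switching back: the restoring uselocale() call is itself
       a "subsequent call" that may invalidate the returned string. */
    char *copy = strdup(gettext(msgid));

    uselocale(previous);
    return copy;
}
```

The copy stays valid after freelocale() has been called on LOC, at the cost of one allocation per lookup.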

-- 
Jilles Tjoelker



POSIX gettext() and uselocale()

2021-05-03 Thread Bruno Haible via austin-group-l at The Open Group
https://posix.rhansen.org/p/gettext_split
says (line 92):

  "The returned string may be invalidated by a subsequent call to
   bind_textdomain_codeset(), bindtextdomain(), setlocale(),
   textdomain(), or uselocale()."

While in most programs setlocale(), textdomain(), bindtextdomain(),
bind_textdomain_codeset() are being called at the beginning of the
program execution, before any call to gettext(), the situation is
very different for uselocale().

1) uselocale() is meant to have effects ONLY on the thread in which it
   is called.

2) uselocale() is a helper function to implement *_l functions where
   the POSIX standard does not specify them or the system does not have
   them.
   For example, when a program wants to have a function to parse
   a number, recognizing only the ASCII digits and only '.' as decimal
   separator, a reliable way to implement such a function is by calling
   uselocale of the "C" locale, strtod(), and then uselocale() again
   to switch the thread back to the previous locale.

   If POSIX did not have uselocale(), it would need to provide many
   more *_l functions.
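
The bracket described in item 2 can be sketched as follows. parse_double_c is a hypothetical name standing in for a missing strtod_l(). Note that strtod() in the "C" locale still accepts leading white space, a sign, exponents, hexadecimal forms, "inf" and "nan", so a parser that must accept strictly ASCII digits and '.' would need an additional check that is not shown here.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse TEXT with the rules of the "C" locale,
   regardless of the calling thread's current locale.  Stores the value
   in *OUT and returns 0 when the whole string was consumed; returns -1
   otherwise, or when the "C" locale object could not be created. */
int parse_double_c(const char *text, double *out)
{
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    locale_t previous = uselocale(c_loc);   /* affects this thread only */
    char *end;
    double value = strtod(text, &end);
    uselocale(previous);                    /* back to the previous locale */
    freelocale(c_loc);

    if (end == text || *end != '\0')
        return -1;                          /* nothing parsed, or trailing text */
    *out = value;
    return 0;
}
```

With this helper, "3.25" parses to 3.25 and "3,25" is rejected whatever decimal separator the thread's own locale uses.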

If the gettext() result may be invalidated by a uselocale() call (in
any other thread!), this would mean that

  ** Programs can use gettext() or uselocale() but not both. **

and - more or less -

  ** Multithreaded programs that use libraries (that may use uselocale())
     cannot use gettext(). **

I think that specifying gettext() to be so restricted is not useful.
It would make more sense to allow concurrent uselocale() calls.

Proposed wording:

  "The returned string may be invalidated by a subsequent call to
   bind_textdomain_codeset(), bindtextdomain(), setlocale(),
   or textdomain()."

Bruno