Hi,

I think I did things in the wrong order. I started submitting small
commits, but without a global view of the direction taken they are of
little use.

So I would like to present a roadmap for my intended work in the libc
locale area. If some developers could help or guide me with comments or
suggestions, it would be great.


First, why? I would like to port libc++ (http://libcxx.llvm.org) to
OpenBSD, in order to have an llvm toolchain in ports that doesn't
depend on gcc 4.9 (gcc-4.2 in base is unable to build recent llvm due
to its lack of c++11 support).

We can build recent llvm (3.6.1) using gcc-4.9, but since a c++11
runtime is needed, the resulting binary is linked against libestdc++
(gcc-libs package).


What is the relationship with libc/locale? libc++ needs some POSIX
functions in the locale area that are missing in OpenBSD. These
functions are uselocale(3), newlocale(3) and freelocale(3) (see
http://pubs.opengroup.org/onlinepubs/9699919799/functions/newlocale.html
for details). Basically, they provide per-thread locale support.
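
For illustration, here is roughly how a program uses these three
functions on a system that already provides them through <locale.h>.
This is only a sketch: demo_thread_locale is a made-up name, and the
"C" locale is chosen simply because it always exists.

```c
#define _XOPEN_SOURCE 700	/* ask for the POSIX.1-2008 interfaces */
#include <locale.h>

/*
 * Switch the calling thread to a private "C" locale, then restore the
 * previous one. Returns 0 on success, -1 on failure.
 */
int
demo_thread_locale(void)
{
	locale_t loc, old;
	int ok;

	/* Create a new locale object; the global locale is untouched. */
	loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
	if (loc == (locale_t)0)
		return -1;

	/* Install it for this thread only; remember the previous one. */
	old = uselocale(loc);
	if (old == (locale_t)0) {
		freelocale(loc);
		return -1;
	}

	/* uselocale((locale_t)0) queries without changing anything. */
	ok = (uselocale((locale_t)0) == loc);

	/* Restore the previous locale before releasing ours. */
	uselocale(old);
	freelocale(loc);
	return ok ? 0 : -1;
}
```

Other threads keep whatever locale they had, and setlocale(3) state is
not modified by any of these calls.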

The intended work in libc/locale is to add these functions, and the
related stuff only (locale_t type, ctype *_l functions [like
isalnum_l]).


My starting point is to consider the global variables used in the
current code. Some are part of the locale state (like
current_categories in setlocale.c), others are temporary storage (like
new_categories in setlocale.c).

Temporary storage should be kept in local variables (for thread
safety), whereas locale state variables should be transferred to
locale_t (an opaque type, internally a pointer to struct _locale_t).

The global variables affected by setlocale(3) function are:
  - current_categories (locale/setlocale.c)
  - _CurrentRuneLocale (locale/runetable.c)
  - __mb_cur_max (locale/__mb_cur_max.c)
  - _ctype_ (gen/ctype_.c)
  - _toupper_tab_ (gen/toupper_.c)
  - _tolower_tab_ (gen/tolower_.c)

Other global variables in the locale area:
  - _CurrentMessagesLocale
  - _CurrentMonetaryLocale
  - _CurrentNumericLocale
  - _CurrentTimeLocale

All of these should be part of locale_t, so that they can be accessed
in a thread-safe manner and set by newlocale(3) (the thread-safe
counterpart of setlocale(3)).

Whether the second list (_CurrentMessagesLocale to _CurrentTimeLocale)
needs to be part of struct _locale_t is open to discussion, as these
variables do not seem to be modified.


Some other elements need to be part of struct _locale_t:
  - magic_header: uselocale(3) needs to be able to reject an invalid
    locale object

  - a marker to know whether the object is installed as a locale or not
    (created, but not passed to uselocale(3)). It would make it possible
    to manage the release of resources in newlocale(3).
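
To make this more concrete, the structure could look something like the
following. This is only a sketch: every name here (sketch_locale, the
field names, the magic value) is hypothetical, and only two of the state
variables listed above are shown.

```c
#include <stddef.h>

/* Arbitrary constant ("LOCL" in ASCII); any unlikely value would do. */
#define SKETCH_LOCALE_MAGIC	0x4c4f434cU

struct sketch_locale {
	unsigned int	 magic;		/* magic_header: marks a valid object */
	int		 installed;	/* non-zero once given to uselocale() */
	size_t		 mb_cur_max;	/* stands in for __mb_cur_max */
	const void	*rune_locale;	/* stands in for _CurrentRuneLocale */
	/* ... ctype/toupper/tolower tables, current_categories, ... */
};

typedef struct sketch_locale *sketch_locale_t;

/* Return 1 if loc looks like an object created by the library, else 0. */
int
sketch_locale_is_valid(sketch_locale_t loc)
{
	return loc != NULL && loc->magic == SKETCH_LOCALE_MAGIC;
}
```

A uselocale(3) implementation would call such a check first and fail
with EINVAL when it returns 0.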



A pivot function __get_locale() would be used internally to retrieve
either:
  - the global locale state: struct _locale_t _locale_global_locale (and
    locale_t LC_GLOBAL_LOCALE = &_locale_global_locale)
  - the per-thread locale state (if set by uselocale(3)).

This function would be a weak symbol, so that a different
implementation is used depending on whether the program is linked with
pthread or not.

The global state (_locale_global_locale) could be used directly by
internal functions that are not thread-safe (like setlocale()). Other
places should use the __get_locale()->... idiom in order to retrieve
the proper locale state (global or per-thread) in a thread-safe manner.
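
The lookup itself could be as simple as the sketch below. Note the
assumptions: it uses C11 _Thread_local storage in place of the
weak-symbol/pthread split described above, and all sketch_* names are
hypothetical stand-ins for __get_locale(), uselocale(3) and
_locale_global_locale.

```c
#include <stddef.h>

struct sketch_locale {
	size_t	mb_cur_max;	/* one piece of locale state, as an example */
};

/* Stand-in for _locale_global_locale. */
struct sketch_locale sketch_global_locale = { 1 };

/* Per-thread override; NULL means "this thread uses the global state". */
static _Thread_local struct sketch_locale *sketch_thread_locale = NULL;

/* Stand-in for __get_locale(): per-thread state if set, else global. */
struct sketch_locale *
sketch_get_locale(void)
{
	if (sketch_thread_locale != NULL)
		return sketch_thread_locale;
	return &sketch_global_locale;
}

/*
 * Stand-in for uselocale(3): install loc for the calling thread and
 * return the previous state. A NULL argument only queries.
 */
struct sketch_locale *
sketch_uselocale(struct sketch_locale *loc)
{
	struct sketch_locale *old = sketch_get_locale();

	if (loc == &sketch_global_locale)
		sketch_thread_locale = NULL;
	else if (loc != NULL)
		sketch_thread_locale = loc;
	return old;
}
```

Callers would then write sketch_get_locale()->mb_cur_max instead of
reading a global variable directly.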



Several functions will be rewritten in order to move their code into
thread-safe functions, which are then called with
_locale_global_locale (or LC_GLOBAL_LOCALE).

For example, isalnum(3) would be based on isalnum_l(3), and isalnum_l()
would contain most of the current code of isalnum().
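
A minimal sketch of that layering follows. It is not the actual libc
code: the sketch_* names, the SK_ALNUM flag and the table layout are
all hypothetical, and the table only covers the "C" locale.

```c
#include <limits.h>
#include <string.h>

#define SK_ALNUM	0x01	/* hypothetical classification bit */

struct sketch_locale {
	unsigned char	ctype[UCHAR_MAX + 1];	/* per-locale ctype table */
};

/* Stand-in for _locale_global_locale; must be initialised before use. */
struct sketch_locale sketch_global_locale;

/* Fill the table with the "C" locale classification for alnum only. */
void
sketch_locale_init_c(struct sketch_locale *loc)
{
	static const char alnum[] = "0123456789"
	    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
	const char *p;

	memset(loc->ctype, 0, sizeof(loc->ctype));
	for (p = alnum; *p != '\0'; p++)
		loc->ctype[(unsigned char)*p] |= SK_ALNUM;
}

/* The real work: consults only the locale object it is given. */
int
sketch_isalnum_l(int c, const struct sketch_locale *loc)
{
	if (c < 0 || c > UCHAR_MAX)	/* EOF and out-of-range values */
		return 0;
	return (loc->ctype[c] & SK_ALNUM) != 0;
}

/* The classic entry point becomes a thin wrapper. */
int
sketch_isalnum(int c)
{
	return sketch_isalnum_l(c, &sketch_global_locale);
}
```

In the real code the wrapper would pass the result of __get_locale()
rather than the global object directly, so that a per-thread locale set
by uselocale(3) is honoured.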


I hope my explanation is understandable enough.
Thanks.
-- 
Sébastien Marie
