Hi,

I think I did not do things in the right order. I started submitting small commits, but without a global view of the direction taken they are of little use.
So I would like to present a roadmap for my intended work in the libc locale area. If some developers could help or guide me with comments or suggestions, it would be great.

First, why? I would like to port libc++ (http://libcxx.llvm.org) to OpenBSD, in order to have an llvm toolchain in ports that doesn't depend on gcc 4.9 (gcc-4.2 in base is unable to build recent llvm due to its lack of c++11 support). We could build recent llvm (3.6.1) using gcc-4.9, but as a c++11 runtime is needed, the resulting binary is linked to libestdc++ (gcc-libs package).

What is the relationship with libc/locale? libc++ needs some POSIX functions in the locale area that are missing in OpenBSD. These functions are uselocale(3), newlocale(3) and freelocale(3) (see http://pubs.opengroup.org/onlinepubs/9699919799/functions/newlocale.html for details). Basically, they provide per-thread locale support.

The intended work in libc/locale is to add these functions, and the related pieces only (the locale_t type, and the ctype *_l functions such as isalnum_l).

My starting point is to consider the global variables used in the current code. Some are part of the locale state (like current_categories in setlocale.c), others are temporary storage (like new_categories in setlocale.c). Temporary storage should be kept in local variables (for thread safety), whereas locale state variables should be moved into locale_t (an opaque type, internally a pointer to struct _locale_t).
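For readers less familiar with these three functions, here is a minimal sketch of how a caller typically uses them on a system that already provides them. The helper name format_c_locale is made up for this example; only newlocale(3), uselocale(3), freelocale(3) and snprintf(3) are standard. This is the kind of pattern libc++ relies on, not code taken from it.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a double with the "C" locale rules
 * (decimal point is '.') without touching the process-wide locale.
 * Returns the snprintf(3) result, or -1 if no locale object could
 * be created.
 */
int
format_c_locale(char *buf, size_t len, double v)
{
	locale_t c_loc, old;
	int n;

	/* Create a fresh locale object; nothing is installed yet. */
	c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
	if (c_loc == (locale_t)0)
		return -1;

	/* Install it for the calling thread only. */
	old = uselocale(c_loc);
	n = snprintf(buf, len, "%g", v);

	/* Restore the previous per-thread state, then release the object. */
	uselocale(old);
	freelocale(c_loc);
	return n;
}
```

Other threads, and the locale set by setlocale(3), are unaffected by the call: only the calling thread sees the "C" locale, and only between the two uselocale(3) calls.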
The global variables affected by the setlocale(3) function are:

  - current_categories (locale/setlocale.c)
  - _CurrentRuneLocale (locale/runetable.c)
  - __mb_cur_max (locale/__mb_cur_max.c)
  - _ctype_ (gen/ctype_.c)
  - _toupper_tab_ (gen/toupper_.c)
  - _tolower_tab_ (gen/tolower_.c)

Other global variables in the locale area:

  - _CurrentMessagesLocale
  - _CurrentMonetaryLocale
  - _CurrentNumericLocale
  - _CurrentTimeLocale

All of these should be part of locale_t, so that they can be accessed in a thread-safe manner and can be set by newlocale(3) (the thread-safe counterpart of setlocale(3)). Whether the second list (_CurrentMessagesLocale to _CurrentTimeLocale) needs to be part of struct _locale_t is open to discussion, as these variables do not seem to be modified.

Some other elements need to be part of struct _locale_t:

  - magic_header: uselocale(3) needs to be able to reject an invalid locale object.
  - a marker to know whether the object is installed as a locale or not (created, but not yet passed to uselocale(3)). It would allow managing the release of resources in newlocale(3).

A pivot function __get_locale() would be used internally to retrieve either:

  - the global locale state: struct _locale_t _locale_global_locale (and locale_t LC_GLOBAL_LOCALE = &_locale_global_locale), or
  - the per-thread locale state (if set by uselocale(3)).

This function would be weak, in order to have different code depending on whether the program is linked with pthread or not.

The global state (_locale_global_locale) could be used directly by internal functions that are not thread-safe (like setlocale()). Other places should use the __get_locale()->... idiom in order to retrieve the proper locale state (global or per-thread) in a thread-safe manner.

Several functions will be rewritten in order to move their code into thread-safe functions, and to call these functions with _locale_global_locale (or LC_GLOBAL_LOCALE). For example, isalnum(3) would be based on isalnum_l(3), with isalnum_l() containing most of the current code of isalnum().

I hope my explanation is understandable enough.
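To make the plan a bit more concrete, here is a rough sketch of the idea, not the actual patch. Every name starting with sk_ or SK_ is a stand-in invented for this example (sk_locale for struct _locale_t, sk_get_locale() for __get_locale(), and so on), the struct holds only two of the fields listed above, and the per-thread pointer uses C11 _Thread_local as a simplification instead of a weak function overridden by pthread.

```c
#include <stddef.h>

#define SK_LOCALE_MAGIC	0x4c4f434cU	/* arbitrary value for this sketch */
#define SK_CT_ALNUM	0x01		/* hypothetical "alphanumeric" class bit */

/* Stand-in for struct _locale_t: state that setlocale(3) modifies today. */
struct sk_locale {
	unsigned int		 magic;		/* to reject invalid objects */
	int			 installed;	/* passed to uselocale or not */
	const unsigned char	*ctype_tab;	/* role of _ctype_, 256 entries */
	size_t			 mb_cur_max;	/* role of __mb_cur_max */
};

/* Class table for the "C" locale: only [0-9A-Za-z] are alphanumeric. */
#define AN(c)	[c] = SK_CT_ALNUM
static const unsigned char sk_c_ctype[256] = {
	AN('0'), AN('1'), AN('2'), AN('3'), AN('4'), AN('5'), AN('6'), AN('7'),
	AN('8'), AN('9'), AN('A'), AN('B'), AN('C'), AN('D'), AN('E'), AN('F'),
	AN('G'), AN('H'), AN('I'), AN('J'), AN('K'), AN('L'), AN('M'), AN('N'),
	AN('O'), AN('P'), AN('Q'), AN('R'), AN('S'), AN('T'), AN('U'), AN('V'),
	AN('W'), AN('X'), AN('Y'), AN('Z'), AN('a'), AN('b'), AN('c'), AN('d'),
	AN('e'), AN('f'), AN('g'), AN('h'), AN('i'), AN('j'), AN('k'), AN('l'),
	AN('m'), AN('n'), AN('o'), AN('p'), AN('q'), AN('r'), AN('s'), AN('t'),
	AN('u'), AN('v'), AN('w'), AN('x'), AN('y'), AN('z'),
};

/* Role of _locale_global_locale; always considered installed. */
static struct sk_locale sk_global_locale = {
	.magic = SK_LOCALE_MAGIC, .installed = 1,
	.ctype_tab = sk_c_ctype, .mb_cur_max = 1,
};

/* Per-thread override; NULL means "use the global locale". */
static _Thread_local struct sk_locale *sk_thread_locale;

/* Pivot function: the locale state that applies to the calling thread. */
struct sk_locale *
sk_get_locale(void)
{
	return sk_thread_locale != NULL ? sk_thread_locale : &sk_global_locale;
}

/*
 * uselocale(3)-like: returns the previous locale, or NULL if loc is
 * invalid. A single flag is enough for one thread; sharing an object
 * between threads would need a counter instead.
 */
struct sk_locale *
sk_uselocale(struct sk_locale *loc)
{
	struct sk_locale *old = sk_get_locale();

	if (loc == NULL)
		return old;			/* query only */
	if (loc->magic != SK_LOCALE_MAGIC)
		return NULL;			/* reject invalid object */
	if (sk_thread_locale != NULL)
		sk_thread_locale->installed = 0;
	sk_thread_locale = (loc == &sk_global_locale) ? NULL : loc;
	loc->installed = 1;
	return old;
}

/* The real work lives in the _l variant, which takes the locale explicitly. */
int
sk_isalnum_l(int c, const struct sk_locale *loc)
{
	if (c < 0 || c > 255)
		return 0;
	return (loc->ctype_tab[c] & SK_CT_ALNUM) != 0;
}

/* The classic entry point only picks the right locale state. */
int
sk_isalnum(int c)
{
	return sk_isalnum_l(c, sk_get_locale());
}
```

With this layout, sk_isalnum() follows whatever the calling thread installed through sk_uselocale(), and falls back to the global state otherwise, while code that must bypass the per-thread state can pass &sk_global_locale to the _l variant directly.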
Thanks.
-- 
Sébastien Marie
