Hello,

I have mentioned a few times that I've been working on a library which implements POSIX functions for native Windows. A few days ago I got it to the point where I felt confident enough to publish it in its current state:
https://github.com/maiddaisuki/posix32

Here's a summary of what the library currently implements (mostly locale-related stuff):

- locale.h functions
- langinfo.h (nl_langinfo function)
- ctype.h and wctype.h functions
- string.h functions and their wchar.h equivalents
- strings.h functions (str[n]casecmp) and their wchar.h equivalents
- uchar.h functions (including c8rtomb/mbrtoc8, with both MSVCRT and UCRT)
- stdlib.h/wchar.h C89/C95 conversion functions

Notably, it still does not implement:

- stdlib.h string-to-number conversion functions (which are locale-dependent)
- time.h functions (strftime/wcsftime, tzset)
- stdio.h functions and their wchar.h equivalents

1. locale.h

POSIX newlocale, uselocale and getlocalename_l are implemented for all CRTs.

`newlocale` can create locale_t objects that use UTF-8 even with CRTs which do not support UTF-8, and you can use locale-specific functions such as `mbrtoc32_l` (extension, not POSIX) with them.

The library uses its own parser for strings passed to `setlocale` and `newlocale`. It can parse any string that the CRT's setlocale would accept, including Windows locale names and strings in the "ll[_CC][.CHARSET][@MODIFIER]" format described by POSIX.

For example, you can pass "ca_ES@valencia" or "sr_RS@latin", or even nonsense like "ja_US@cyrillic"/"ja-Latn-US", which will be parsed and resolved to "ja-JP".

".CHARSET" may also specify character set names such as "ISO-8859-1" instead of code page numbers, so that "en_US.ISO-8859-1" or "ja_JP.EUC-JP" will do exactly what you would expect.

`setlocale (..., "")` and `newlocale (..., "", ...)` use the LC_* and LANG environment variables if they are set, falling back to the user's default locale if not.

2. string.h

The library provides a few replacements:

- strchr
- strrchr
- strstr
- strpbrk
- strspn
- strcspn
- strtok

These replacements operate correctly on multibyte strings.

The library also implements its own `strtok_r` and `strndup`.

3. stdlib.h/wchar.h/uchar.h conversion functions

The library provides replacements for all conversion functions, as well as locale-specific versions for use with locale_t. It also implements the POSIX `wcsnrtombs` and `mbsnrtowcs` functions, and the C23 `mbrtoc8` and `c8rtomb` functions.

One notable difference from the CRT's wc*tomb* functions is that the library's versions do not allow best-fit conversions, which are dangerous when it comes to things such as filenames.

The library's uchar.h functions use the active locale, unlike the CRT's, which always use UTF-8 for conversion.

- Kirill Makurin

_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
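To make the "ll[_CC][.CHARSET][@MODIFIER]" format from section 1 a bit more concrete, here is a minimal sketch of how such a string could be split into its four parts. It is not the library's parser: `struct locale_parts`, `copy_span` and `split_locale_name` are hypothetical names made up for this illustration, the buffer sizes are arbitrary, and the sketch assumes POSIX-style input only. It does not handle Windows locale names such as "ja-Latn-US", and it does no resolution of the kind that maps "ja_US@cyrillic" to "ja-JP".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration type; not part of posix32. */
struct locale_parts {
    char language[16];  /* "ll"       */
    char country[16];   /* "CC"       */
    char charset[32];   /* "CHARSET"  */
    char modifier[32];  /* "MODIFIER" */
};

/* Copy the bytes in [begin, end) into dst as a NUL-terminated string.
   Returns 0 on success, -1 if the span does not fit into cap bytes. */
static int copy_span(char *dst, size_t cap, const char *begin, const char *end)
{
    size_t len = (size_t)(end - begin);
    if (len >= cap)
        return -1;
    memcpy(dst, begin, len);
    dst[len] = '\0';
    return 0;
}

/* Split "ll[_CC][.CHARSET][@MODIFIER]" into its parts.
   Missing optional parts are left as empty strings.
   Returns 0 on success, -1 if the language is missing or a part is too long. */
static int split_locale_name(const char *name, struct locale_parts *out)
{
    assert(name != NULL && out != NULL);

    const char *end = name + strlen(name);

    /* '@' starts the modifier; everything before it is "ll[_CC][.CHARSET]". */
    const char *at = strchr(name, '@');
    const char *mod_start = at ? at : end;

    /* '.' starts the charset, but only if it appears before the modifier. */
    const char *dot = memchr(name, '.', (size_t)(mod_start - name));
    const char *cs_start = dot ? dot : mod_start;

    /* '_' starts the country, but only if it appears before the charset. */
    const char *us = memchr(name, '_', (size_t)(cs_start - name));
    const char *lang_end = us ? us : cs_start;

    memset(out, 0, sizeof *out);

    if (lang_end == name)
        return -1; /* the language part is mandatory */
    if (copy_span(out->language, sizeof out->language, name, lang_end) != 0)
        return -1;
    if (us && copy_span(out->country, sizeof out->country, us + 1, cs_start) != 0)
        return -1;
    if (dot && copy_span(out->charset, sizeof out->charset, dot + 1, mod_start) != 0)
        return -1;
    if (at && copy_span(out->modifier, sizeof out->modifier, at + 1, end) != 0)
        return -1;
    return 0;
}
```

With this sketch, "ca_ES@valencia" splits into language "ca", country "ES" and modifier "valencia", and "ja_JP.EUC-JP" yields the charset "EUC-JP". Searching for '.' and '_' only in the part before the next delimiter keeps a charset name that contains a dash or underscore from being mistaken for a country.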
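As a rough illustration of why section 2 cares about multibyte strings: in a double-byte encoding such as Shift_JIS (code page 932), the second byte of a character can have the same value as an ASCII character, so a plain byte-wise search may report a match in the middle of a character. The sketch below is not the library's code. `sjis_is_lead_byte` and `sjis_strchr` are hypothetical helpers, and the encoding is hard-coded here, whereas a real replacement would presumably consult the active locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lead-byte ranges of double-byte characters in Shift_JIS / code page 932. */
static int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical helper: find the single-byte character ch in the
   Shift_JIS string s, stepping over double-byte characters as a unit
   so that their trail bytes are never compared against ch.
   Like strchr, searching for '\0' returns a pointer to the terminator. */
static char *sjis_strchr(const char *s, char ch)
{
    assert(s != NULL);

    const unsigned char *p = (const unsigned char *)s;

    for (;;) {
        /* A lead byte followed by another byte forms one character. */
        if (sjis_is_lead_byte(*p) && p[1] != '\0') {
            p += 2;
            continue;
        }
        if (*p == (unsigned char)ch)
            return (char *)p;
        if (*p == '\0')
            return NULL;
        p += 1;
    }
}
```

For example, the katakana character "ソ" is encoded as the bytes 0x83 0x5C, and 0x5C is also the backslash. For the string made of that character followed by "\file", byte-wise `strchr (s, '\\')` returns a pointer to the second byte of the character, while the sketch returns the real backslash one byte later.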
