Thanks for the reports and fixes in https://bugs.gnu.org/79301
A related question: would it be useful to replace c32isblank() etc.
to be ISO 30112¹ compliant, or at least more standard?
I.e. adjust c32isblank() to return true for:

  U+0009, U+0020, U+1680, U+180E?, U+2000..U+2006,
  U+2008..U+200A, U+205F, and U+3000

Then on musl, macOS etc., c32isblank() etc. would behave much like glibc?
This is significant, as uniq for example defines fields in terms of blanks.

Note that glibc iswblank() matches a slightly different set than 30112:

  0009 TAB
  0020 SPACE
  1680 OGHAM SPACE MARK
  2000 EN QUAD
  2001 EM QUAD
  2002 EN SPACE
  2003 EM SPACE
  2004 THREE-PER-EM SPACE
  2005 FOUR-PER-EM SPACE
  2006 SIX-PER-EM SPACE
  2008 PUNCTUATION SPACE
  2009 THIN SPACE
  200A HAIR SPACE
  205F MEDIUM MATHEMATICAL SPACE
  3000 IDEOGRAPHIC SPACE

i.e. it does not include:

  180E MONGOLIAN VOWEL SEPARATOR

Though looking at
https://util.unicode.org/UnicodeJsps/character.jsp?a=180E
I'm not sure 30112 should be including 180E as blank.

Anyway the point remains: to have a consistent set for users, so that
the tools don't change behavior with their data depending on what
platform they're running on.

cheers,
Padraig

¹ https://www.open-std.org/JTC1/SC35/WG5/docs/30112d10.pdf
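
To make the proposed set concrete, here is a rough sketch of such a
predicate. It is only an illustration of the code points listed above,
not the gnulib implementation: the name proposed_isblank and the
include_180e switch are hypothetical, the latter reflecting the open
question about U+180E. It uses uint32_t rather than char32_t only to
avoid depending on <uchar.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return true if C is in the blank set discussed
   above.  INCLUDE_180E selects whether U+180E MONGOLIAN VOWEL SEPARATOR
   counts as blank (the draft lists it, glibc iswblank() does not).  */
bool
proposed_isblank (uint32_t c, bool include_180e)
{
  switch (c)
    {
    case 0x0009:                /* TAB */
    case 0x0020:                /* SPACE */
    case 0x1680:                /* OGHAM SPACE MARK */
    case 0x205F:                /* MEDIUM MATHEMATICAL SPACE */
    case 0x3000:                /* IDEOGRAPHIC SPACE */
      return true;
    case 0x180E:                /* MONGOLIAN VOWEL SEPARATOR */
      return include_180e;
    default:
      /* U+2000..U+2006 and U+2008..U+200A; U+2007 is not in the set.  */
      return ((0x2000 <= c && c <= 0x2006)
              || (0x2008 <= c && c <= 0x200A));
    }
}
```

With include_180e false this matches the glibc list above; with it true
it matches the set proposed for 30112 compliance.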
