On Sat, Sep 13, 2025, at 11:02 AM, Greg Wooledge wrote:
> I really think this is a bad idea.  A script needs to have predictable
> behavior regardless of what bizarre locales may exist on the target
> system.

Turns out that this doesn't even require a particularly "bizarre"
locale to observe.  ISO/IEC 8859-1 encodes NBSP as A0, so on macOS:

    $ export LC_ALL=en_US.ISO8859-1
    $ [[ $'\xA0' = [[:blank:]] ]]; echo "$?"
    0
    $ eval set a$'\xA0'z; echo "$# args"
    2 args

Behaviors vary somewhat among shells.  Some don't recognize A0 as
a <blank>, zsh recognizes it as a <blank> but doesn't delimit tokens
with it, and yash agrees with bash.  (Various compatibility modes
don't make a difference here.)

    $ cat /tmp/nbsp_test.sh
    nbsp=$(printf '\240')
    case $nbsp in
        [[:blank:]]) printf 'blank, ' ;;
        *) printf 'not blank, ' ;;
    esac
    eval set "a${nbsp}z"
    case $# in
        1) echo not delimiting ;;
        2) echo delimiting ;;
    esac
    $ export LC_ALL=en_US.ISO8859-1
    $ /bin/bash /tmp/nbsp_test.sh    # bash 3.2.57
    blank, delimiting
    $ ~/build/bash/bash "$_"         # bash devel
    blank, delimiting
    $ dash "$_"                      # dash 0.5.12
    not blank, not delimiting
    $ /bin/ksh "$_"                  # ksh93u+ 2012-08-01
    not blank, not delimiting
    $ ksh "$_"                       # ksh93u+m/1.0.10 2024-08-01
    not blank, not delimiting
    $ mksh "$_"                      # mksh R59 2020/10/31
    not blank, not delimiting
    $ oksh "$_"                      # OpenBSD 7.7 ksh
    not blank, not delimiting
    $ yash "$_"                      # yash 2.58.1
    blank, delimiting
    $ zsh "$_"                       # zsh 5.9
    blank, not delimiting

POSIX seems to require delimiting on all <blank>s [*], without
qualification.

    7. If the current character is an unquoted <blank>, any token
       containing the previous character is delimited and the current
       character shall be discarded.

yash takes this very seriously.

    $ export LC_ALL=en_US.UTF-8
    $ [[ $'\uA0' = [[:blank:]] ]]; echo "$?"
    0
    $ bash -c 'set a'$'\uA0''z; echo "$# args"'
    1 args
    $ yash -c 'set a'$'\uA0''z; echo "$# args"'
    2 args

[*] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/utilities/V3_chap02.html#tag_19_03

-- 
vq
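
P.S. For anyone who wants to try the usual workaround, here is a minimal sketch (mine, not anything POSIX or any shell's documentation prescribes). It assumes that pinning the locale to C inside the script is acceptable, and it re-runs the same two checks as /tmp/nbsp_test.sh afterward. The function name nbsp_probe is hypothetical. With bash or dash I would expect "not blank, not delimiting", since byte A0 is not a <blank> in the C locale; I have not checked how the other shells above react to LC_ALL being changed at run time.

```shell
#!/bin/sh
# Assumption: forcing the C locale is acceptable for this script.
# In the C locale, byte 0xA0 is not classified as a <blank>.
LC_ALL=C
export LC_ALL

# nbsp_probe (hypothetical name): repeats the two checks from
# /tmp/nbsp_test.sh and prints both results on one line.
nbsp_probe() {
    nbsp=$(printf '\240')

    # Check 1: does the byte match the [[:blank:]] character class?
    case $nbsp in
        [[:blank:]]) class='blank' ;;
        *)           class='not blank' ;;
    esac

    # Check 2: does the tokenizer split a<A0>z into two words?
    # The positional parameters set here are local to the function.
    eval set -- "a${nbsp}z"
    case $# in
        1) delim='not delimiting' ;;
        *) delim='delimiting' ;;
    esac

    printf '%s, %s\n' "$class" "$delim"
}

nbsp_probe
```

This only sidesteps the question for scripts that can afford to give up locale-aware behavior; it says nothing about what the shell ought to do under en_US.ISO8859-1.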