Pádraig Brady <p...@draigbrady.com> writes:

>> Thanks for the suggestion, but that doesn't work. Any issue with
>> skipping based on $host_os for this test and for fold-spaces.sh?
>> I was thinking of testing "printf '\u00A0' | ./src/tr -d
>> '[:blank:]'"
>> but that won't work since 'tr' operates on bytes and U+00A0 is
>> represented as 0xc2 0xa0 in UTF-8.
>
> Oh right, sorry. wc has its own iswnbspace,
> whereas fold essentially relies on the system iswblank.
>
> That means you could correlate with uniq though. Something like:
>
>   isblank() { test $(printf "a$1a\nb$1b\n" | uniq -f1 | wc -l) = 2; }
>   if ! isblank '\u2007'; then
>     : # can test '\u2007' is treated as non breaking space
>   fi
>
> That would be a preferable way to gate the test.
>
> Though I'm thinking now we should adjust fold(1) a little
> so that it never breaks at nbsp, consistently across systems.
> I.e. move/rename iswnbspace() from wc.c to src/system.h
> and use it in fold (and wc) to give consistent behavior.
> I.e. fold would use: c32isblank() && ! c32isnbspace(),
> and the test would stay as is.

Thanks, I forgot about that function. That sounds like a good idea to
me. We can be nice to people who do not use glibc.

We will have to hoist the 'posixly_correct' check out of it first,
though. Technically POSIX says that 'fold -s' should only break at
<blank> characters. But I would rather avoid adding more
getenv ("POSIXLY_CORRECT") calls to programs that do not yet have them.
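
To make that concrete, here is a rough sketch of the shape I have in
mind, with the environment check hoisted out of the helper. The names
is_nbspace and fold_break_ok are placeholders made up for illustration,
not existing coreutils functions, and the code point list is an
assumption about which no-break characters we want to cover. It uses
plain iswblank from <wctype.h> rather than the c32isblank mentioned
above, only so that the snippet stands alone.

```c
#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: true for a few no-break space code points
   (assumed list: U+00A0, U+2007, U+202F, U+2060).  It deliberately
   does not look at POSIXLY_CORRECT; a caller that cares about that
   would have to check it separately.  */
static bool
is_nbspace (wint_t wc)
{
  return (wc == 0x00A0 || wc == 0x2007
          || wc == 0x202F || wc == 0x2060);
}

/* Hypothetical predicate for where 'fold -s' may break a line:
   at a blank in the current locale that is not a no-break space.  */
static bool
fold_break_ok (wint_t wc)
{
  return iswblank (wc) && ! is_nbspace (wc);
}
```

With this shape the result for the listed code points no longer depends
on whether the system's iswblank classifies them as blank.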

Collin


